From guido at python.org  Sat Oct  1 01:10:33 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 30 Sep 2011 16:10:33 -0700
Subject: [Python-ideas] PEP 335 Revision 2 (Overloadable Boolean Operators)
In-Reply-To: <4E8590A3.4040004@canterbury.ac.nz>
References: <4E8590A3.4040004@canterbury.ac.nz>
Message-ID: 

On Fri, Sep 30, 2011 at 2:49 AM, Greg Ewing wrote:
> Here's a draft of an update to PEP 335. It includes a couple of
> fully worked and tested examples, plus discussion of some
> potential simplifications and ways to optimise the generated
> bytecode.

Thanks, I've checked this into the repo. I had to clean up some layout
glitches, added today's date to the Post-History header, and set the
Python-Version header to 3.3 (since it's being proposed for 3.3 at the
earliest). Please use the attached, corrected copy if you edit it
further so I won't have to clean it up again. :-)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
PEP: 335
Title: Overloadable Boolean Operators
Version: $Revision$
Last-Modified: $Date$
Author: Gregory Ewing
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Aug-2004
Python-Version: 3.3
Post-History: 05-Sep-2004, 30-Sep-2011


Abstract
========

This PEP proposes an extension to permit objects to define their own
meanings for the boolean operators 'and', 'or' and 'not', and suggests
an efficient strategy for implementation.  A prototype of this
implementation is available for download.


Background
==========

Python does not currently provide any '__xxx__' special methods
corresponding to the 'and', 'or' and 'not' boolean operators.  In the
case of 'and' and 'or', the most likely reason is that these operators
have short-circuiting semantics, i.e. the second operand is not
evaluated if the result can be determined from the first operand.  The
usual technique of providing special methods for these operators
therefore would not work.

There is no such difficulty in the case of 'not', however, and it
would be straightforward to provide a special method for this
operator.  The rest of this proposal will therefore concentrate mainly
on providing a way to overload 'and' and 'or'.


Motivation
==========

There are many applications in which it is natural to provide custom
meanings for Python operators, and in some of these, having boolean
operators excluded from those able to be customised can be
inconvenient.  Examples include:

1. NumPy, in which almost all the operators are defined on arrays so
   as to perform the appropriate operation between corresponding
   elements, and return an array of the results.  For consistency, one
   would expect a boolean operation between two arrays to return an
   array of booleans, but this is not currently possible.

   There is a precedent for an extension of this kind: comparison
   operators were originally restricted to returning boolean results,
   and rich comparisons were added so that comparisons of NumPy arrays
   could return arrays of booleans.

2. A symbolic algebra system, in which a Python expression is
   evaluated in an environment which results in it constructing a tree
   of objects corresponding to the structure of the expression.

3. A relational database interface, in which a Python expression is
   used to construct an SQL query.

A workaround often suggested is to use the bitwise operators '&', '|'
and '~' in place of 'and', 'or' and 'not', but this has some
drawbacks.
The precedence of these is different in relation to the other
operators, and they may already be in use for other purposes (as in
example 1).

There is also the aesthetic consideration of forcing users to use
something other than the most obvious syntax for what they are trying
to express.  This would be particularly acute in the case of example
3, considering that boolean operations are a staple of SQL queries.


Rationale
=========

The requirements for a successful solution to the problem of allowing
boolean operators to be customised are:

1. In the default case (where there is no customisation), the existing
   short-circuiting semantics must be preserved.

2. There must not be any appreciable loss of speed in the default
   case.

3. Ideally, the customisation mechanism should allow the object to
   provide either short-circuiting or non-short-circuiting semantics,
   at its discretion.

One obvious strategy, that has been previously suggested, is to pass
into the special method the first argument and a function for
evaluating the second argument.  This would satisfy requirements 1 and
3, but not requirement 2, since it would incur the overhead of
constructing a function object and possibly a Python function call on
every boolean operation.  Therefore, it will not be considered further
here.

The following section proposes a strategy that addresses all three
requirements.  A `prototype implementation`_ of this strategy is
available for download.

.. _prototype implementation:
   http://www.cosc.canterbury.ac.nz/~greg/python/obo//Python_OBO.tar.gz


Specification
=============

Special Methods
---------------

At the Python level, objects may define the following special methods.

===============  =================  ========================
Unary            Binary, phase 1    Binary, phase 2
===============  =================  ========================
* __not__(self)  * __and1__(self)   * __and2__(self, other)
                 * __or1__(self)    * __or2__(self, other)
                                    * __rand2__(self, other)
                                    * __ror2__(self, other)
===============  =================  ========================

The __not__ method, if defined, implements the 'not' operator.  If it
is not defined, or it returns NotImplemented, existing semantics are
used.

To permit short-circuiting, processing of the 'and' and 'or' operators
is split into two phases.  Phase 1 occurs after evaluation of the
first operand but before the second.  If the first operand defines the
relevant phase 1 method, it is called with the first operand as
argument.  If that method can determine the result without needing the
second operand, it returns the result, and further processing is
skipped.

If the phase 1 method determines that the second operand is needed, it
returns the special value NeedOtherOperand.  This triggers the
evaluation of the second operand, and the calling of a relevant
phase 2 method.  During phase 2, the __and2__/__rand2__ and
__or2__/__ror2__ method pairs work as for other binary operators.

Processing falls back to existing semantics if at any stage a relevant
special method is not found or returns NotImplemented.

As a special case, if the first operand defines a phase 2 method but
no corresponding phase 1 method, the second operand is always
evaluated and the phase 2 method called.  This allows an object which
does not want short-circuiting semantics to simply implement the
phase 2 methods and ignore phase 1.
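To make the dispatch concrete, the following is a rough pure-Python
emulation of the protocol just described (an illustrative sketch only:
it ignores the reflected __rand2__/__ror2__ methods and the
NotImplemented fallback, and under this proposal the interpreter would
perform the equivalent dispatch for 'and' itself, without any helper
function)::

    NeedOtherOperand = object()   # stand-in for the proposed special value

    def logical_and(first, second_thunk):
        # Emulates "first and <expr>"; the second operand is passed
        # lazily as a zero-argument callable.
        and1 = getattr(type(first), '__and1__', None)
        and2 = getattr(type(first), '__and2__', None)
        if and1 is not None:
            result = and1(first)                 # phase 1
            if result is not NeedOtherOperand:
                return result                    # short-circuited
        elif and2 is None and not first:
            return first                         # existing semantics
        second = second_thunk()                  # evaluate second operand
        if and2 is not None:
            return and2(first, second)           # phase 2
        return second                            # existing semantics

    class Pair(object):
        # Non-short-circuiting example: phase 2 method only, so the
        # second operand is always evaluated.
        def __init__(self, value):
            self.value = value
        def __and2__(self, other):
            return Pair((self.value, other.value))

    print(logical_and(Pair(1), lambda: Pair(2)).value)   # (1, 2)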
Bytecodes
---------

The patch adds four new bytecodes, LOGICAL_AND_1, LOGICAL_AND_2,
LOGICAL_OR_1 and LOGICAL_OR_2.  As an example of their use, the
bytecode generated for an 'and' expression looks like this::

        .
        .
        .
        <evaluate first operand>
        LOGICAL_AND_1  L
        <evaluate second operand>
        LOGICAL_AND_2
    L:  .
        .
        .

The LOGICAL_AND_1 bytecode performs phase 1 processing.  If it
determines that the second operand is needed, it leaves the first
operand on the stack and continues with the following code.  Otherwise
it pops the first operand, pushes the result and branches to L.

The LOGICAL_AND_2 bytecode performs phase 2 processing, popping both
operands and pushing the result.


Type Slots
----------

At the C level, the new special methods are manifested as five new
slots in the type object.  In the patch, they are added to the
tp_as_number substructure, since this allows making use of some
existing code for dealing with unary and binary operators.  Their
existence is signalled by a new type flag,
Py_TPFLAGS_HAVE_BOOLEAN_OVERLOAD.

The new type slots are::

    unaryfunc nb_logical_not;
    unaryfunc nb_logical_and_1;
    unaryfunc nb_logical_or_1;
    binaryfunc nb_logical_and_2;
    binaryfunc nb_logical_or_2;


Python/C API Functions
----------------------

There are also five new Python/C API functions corresponding to the
new operations::

    PyObject *PyObject_LogicalNot(PyObject *);
    PyObject *PyObject_LogicalAnd1(PyObject *);
    PyObject *PyObject_LogicalOr1(PyObject *);
    PyObject *PyObject_LogicalAnd2(PyObject *, PyObject *);
    PyObject *PyObject_LogicalOr2(PyObject *, PyObject *);
Alternatives and Optimisations
==============================

This section discusses some possible variations on the proposal, and
ways in which the bytecode sequences generated for boolean expressions
could be optimised.

Reduced special method set
--------------------------

For completeness, the full version of this proposal includes a
mechanism for types to define their own customised short-circuiting
behaviour.  However, the full mechanism is not needed to address the
main use cases put forward here, and it would be possible to define a
simplified version that only includes the phase 2 methods.  There
would then only be 5 new special methods (__and2__, __rand2__,
__or2__, __ror2__, __not__) with 3 associated type slots and 3 API
functions.

This simplified version could be expanded to the full version later if
desired.

Additional bytecodes
--------------------

As defined here, the bytecode sequence for code that branches on the
result of a boolean expression would be slightly longer than it
currently is.  For example, in Python 2.7, ::

    if a and b:
        statement1
    else:
        statement2

generates ::

        LOAD_GLOBAL         a
        POP_JUMP_IF_FALSE   false_branch
        LOAD_GLOBAL         b
        POP_JUMP_IF_FALSE   false_branch
        <code for statement1>
        JUMP_FORWARD        end_branch
    false_branch:
        <code for statement2>
    end_branch:

Under this proposal as described so far, it would become something
like ::

        LOAD_GLOBAL         a
        LOGICAL_AND_1       test
        LOAD_GLOBAL         b
        LOGICAL_AND_2
    test:
        POP_JUMP_IF_FALSE   false_branch
        <code for statement1>
        JUMP_FORWARD        end_branch
    false_branch:
        <code for statement2>
    end_branch:

This involves executing one extra bytecode in the short-circuiting
case and two extra bytecodes in the non-short-circuiting case.
However, by introducing extra bytecodes that combine the logical
operations with testing and branching on the result, it can be reduced
to the same number of bytecodes as the original::

        LOAD_GLOBAL         a
        AND1_JUMP           true_branch, false_branch
        LOAD_GLOBAL         b
        AND2_JUMP_IF_FALSE  false_branch
    true_branch:
        <code for statement1>
        JUMP_FORWARD        end_branch
    false_branch:
        <code for statement2>
    end_branch:

Here, AND1_JUMP performs phase 1 processing as above, and then
examines the result.  If there is a result, it is popped from the
stack, its truth value is tested and a branch taken to one of two
locations.  Otherwise, the first operand is left on the stack and
execution continues to the next bytecode.  The AND2_JUMP_IF_FALSE
bytecode performs phase 2 processing, pops the result and branches if
it tests false.

For the 'or' operator, there would be corresponding OR1_JUMP and
OR2_JUMP_IF_TRUE bytecodes.

If the simplified version without phase 1 methods is used, then early
exiting can only occur if the first operand is false for 'and' and
true for 'or'.  Consequently, the two-target AND1_JUMP and OR1_JUMP
bytecodes can be replaced with AND1_JUMP_IF_FALSE and
OR1_JUMP_IF_TRUE, these being ordinary branch instructions with only
one target.

Optimisation of 'not'
---------------------

Recent versions of Python implement a simple optimisation in which
branching on a negated boolean expression is implemented by reversing
the sense of the branch, saving a UNARY_NOT opcode.

Taking a strict view, this optimisation should no longer be performed,
because the 'not' operator may be overridden to produce quite
different results from usual.  However, in typical use cases, it is
not envisaged that expressions involving customised boolean operations
will be used for branching -- it is much more likely that the result
will be used in some other way.  Therefore, it would probably do
little harm to specify that the compiler is allowed to use the laws of
boolean algebra to simplify any expression that appears directly in a
boolean context.  If this is inconvenient, the result can always be
assigned to a temporary name first.

This would allow the existing 'not' optimisation to remain, and would
permit future extensions of it such as using De Morgan's laws to
extend it deeper into the expression.


Usage Examples
==============

Example 1: NumPy Arrays
-----------------------

::

    #-----------------------------------------------------------------
    #
    #   This example creates a subclass of numpy array to which
    #   'and', 'or' and 'not' can be applied, producing an array
    #   of booleans.
    #
    #-----------------------------------------------------------------

    from numpy import array, ndarray

    class BArray(ndarray):

        def __str__(self):
            return "barray(%s)" % ndarray.__str__(self)

        def __and2__(self, other):
            return (self & other)

        def __or2__(self, other):
            return (self | other)

        def __not__(self):
            return (self == 0)

    def barray(*args, **kwds):
        return array(*args, **kwds).view(type = BArray)

    a0 = barray([0, 1, 2, 4])
    a1 = barray([1, 2, 3, 4])
    a2 = barray([5, 6, 3, 4])
    a3 = barray([5, 1, 2, 4])

    print "a0:", a0
    print "a1:", a1
    print "a2:", a2
    print "a3:", a3
    print "not a0:", not a0
    print "a0 == a1 and a2 == a3:", a0 == a1 and a2 == a3
    print "a0 == a1 or a2 == a3:", a0 == a1 or a2 == a3

Example 1 Output
----------------

::

    a0: barray([0 1 2 4])
    a1: barray([1 2 3 4])
    a2: barray([5 6 3 4])
    a3: barray([5 1 2 4])
    not a0: barray([ True False False False])
    a0 == a1 and a2 == a3: barray([False False False True])
    a0 == a1 or a2 == a3: barray([ True False False True])
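For comparison, the closest equivalent in current Python is the
bitwise-operator workaround discussed under Motivation.  The following
sketch uses plain NumPy and runs today, assuming only NumPy's usual
elementwise semantics; note the extra parentheses forced by the
different operator precedence::

    from numpy import array

    a0 = array([0, 1, 2, 4])
    a1 = array([1, 2, 3, 4])
    a2 = array([5, 6, 3, 4])
    a3 = array([5, 1, 2, 4])

    print (a0 == a1) & (a2 == a3)    # [False False False  True]
    print (a0 == a1) | (a2 == a3)    # [ True False False  True]
    print (a0 == 0)                  # plays the role of 'not a0'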
Example 2: Database Queries
---------------------------

::

    #-----------------------------------------------------------------
    #
    #   This example demonstrates the creation of a DSL for database
    #   queries allowing 'and' and 'or' operators to be used to
    #   formulate the query.
    #
    #-----------------------------------------------------------------

    class SQLNode(object):

        def __and2__(self, other):
            return SQLBinop("and", self, other)

        def __rand2__(self, other):
            return SQLBinop("and", other, self)

        def __eq__(self, other):
            return SQLBinop("=", self, other)

    class Table(SQLNode):

        def __init__(self, name):
            self.__tablename__ = name

        def __getattr__(self, name):
            return SQLAttr(self, name)

        def __sql__(self):
            return self.__tablename__

    class SQLBinop(SQLNode):

        def __init__(self, op, opnd1, opnd2):
            self.op = op.upper()
            self.opnd1 = opnd1
            self.opnd2 = opnd2

        def __sql__(self):
            return "(%s %s %s)" % (sql(self.opnd1), self.op, sql(self.opnd2))

    class SQLAttr(SQLNode):

        def __init__(self, table, name):
            self.table = table
            self.name = name

        def __sql__(self):
            return "%s.%s" % (sql(self.table), self.name)

    class SQLSelect(SQLNode):

        def __init__(self, targets):
            self.targets = targets
            self.where_clause = None

        def where(self, expr):
            self.where_clause = expr
            return self

        def __sql__(self):
            result = "SELECT %s" % ", ".join([sql(target) for target in self.targets])
            if self.where_clause:
                result = "%s WHERE %s" % (result, sql(self.where_clause))
            return result

    def sql(expr):
        if isinstance(expr, SQLNode):
            return expr.__sql__()
        elif isinstance(expr, str):
            return "'%s'" % expr.replace("'", "''")
        else:
            return str(expr)

    def select(*targets):
        return SQLSelect(targets)

    #--------------------------------------------------------------------------------

    dishes = Table("dishes")
    customers = Table("customers")
    orders = Table("orders")

    query = select(customers.name, dishes.price, orders.amount).where(
        customers.cust_id == orders.cust_id
        and orders.dish_id == dishes.dish_id
        and dishes.name == "Spam, Eggs, Sausages and Spam")

    print repr(query)
    print sql(query)

Example 2 Output
----------------

::

    <__main__.SQLSelect object at 0x1cc830>
    SELECT customers.name, dishes.price, orders.amount WHERE
    (((customers.cust_id = orders.cust_id) AND (orders.dish_id =
    dishes.dish_id)) AND (dishes.name = 'Spam, Eggs, Sausages and Spam'))
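Again for comparison, the same query can be expressed in current
Python via the bitwise workaround: give SQLNode an __and__ method in
place of the proposed __and2__ and spell the query with '&' and
explicit parentheses.  A sketch (a hypothetical adaptation, reusing
the Example 2 classes defined above; '&' is left-associative, so the
generated WHERE clause nests identically)::

    # Monkey-patch for illustration; a real DSL would define these
    # methods directly on the class.
    SQLNode.__and__ = lambda self, other: SQLBinop("and", self, other)
    SQLNode.__rand__ = lambda self, other: SQLBinop("and", other, self)

    query = select(customers.name, dishes.price, orders.amount).where(
        (customers.cust_id == orders.cust_id)
        & (orders.dish_id == dishes.dish_id)
        & (dishes.name == "Spam, Eggs, Sausages and Spam"))

    print sql(query)    # same SELECT ... WHERE output as above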
Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

From steve at pearwood.info  Sat Oct  1 04:46:13 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 01 Oct 2011 12:46:13 +1000
Subject: [Python-ideas] startsin ?
In-Reply-To: <4E85E9AC.8040401@whoosh.ca>
References: <4E85E9AC.8040401@whoosh.ca>
Message-ID: <4E867EF5.3080400@pearwood.info>

Matt Chaput wrote:
> On 30/09/2011 11:30 AM, Tarek Ziadé wrote:
>> not sure how people do this, or if I missed something obvious in the
>> stdlib, but I often have this pattern:
>
> str's interface is a bit cluttered with some questionable methods
> ("capitalize"? "center"? "swapcase"? "title"?) that probably should have
> been functions in a text module instead of methods.

The str methods started off as functions in the string module before
becoming methods in Python 2.0.

[steve at sylar python]$ python1.5 -c "'a'.upper()"
Traceback (innermost last):
  File "<string>", line 1, in ?
AttributeError: 'string' object has no attribute 'upper'

What you describe as "questionable methods" go back to the string
module, and were made methods deliberately. And so they should be.

> One thing is that the equivalent of .startsin() for .endswith() would be
> .endsin(). In English, "ends in" is a variation of "ends with", e.g.
> "What words end in 'a'?"

[pedant]

That may or may not be common in some dialects (although not any I'm
familiar with, which isn't very many), but it isn't semantically
correct. The Melbourne to Sydney marathon ends *in* Sydney because the
place where it ends is *inside* Sydney; a pencil ends *with* a point
because the end of the pencil *is* a point, it is NOT inside the point.
Similarly, the word "race" ends *with* an 'e'.

-- 
Steven

From greg.ewing at canterbury.ac.nz  Sat Oct  1 07:17:07 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 01 Oct 2011 18:17:07 +1300
Subject: [Python-ideas] startsin ?
In-Reply-To: <4E85E9AC.8040401@whoosh.ca>
References: <4E85E9AC.8040401@whoosh.ca>
Message-ID: <4E86A253.9050603@canterbury.ac.nz>

Matt Chaput wrote:

> One thing is that the equivalent of .startsin() for .endswith() would be
> .endsin(). In English, "ends in" is a variation of "ends with", e.g.
> "What words end in 'a'?"

Also "startsin" sounds a bit like beginning to do
something naughty...

-- 
Greg

From stephen at xemacs.org  Sat Oct  1 07:19:50 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 01 Oct 2011 14:19:50 +0900
Subject: [Python-ideas] startsin ?
In-Reply-To: 
References: 
Message-ID: <878vp5gufd.fsf@uwakimon.sk.tsukuba.ac.jp>

Mike Graham writes:

> I wonder if it might be worthwhile to give any and all two-parameter API for
> predicate functions, so that
>
>     any(f, xs)
>
> is the same as
>
>     any(f(x) for x in xs)
>
> This eliminates some really common boilerplate, but it adds complication and
> has an ugly API.

-1.  The comprehension is not just boilerplate, it's documentation.

From ericsnowcurrently at gmail.com  Sat Oct  1 08:16:06 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 1 Oct 2011 00:16:06 -0600
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: 
References: <4E82345A.3010105@canterbury.ac.nz>
	<1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
Message-ID: 

On Fri, Sep 30, 2011 at 12:14 PM, Nick Coghlan wrote:
> However, I'm not seeing a lot to recommend that kind of syntax
> over the post-arguments '[]' approach.

Agreed, though Jan made some good points about this. But my concern
with the decorator approach is that it isn't visually distinct enough
from normal decorators. The @(...) form does help though.

How about another approach:

    given: lock=threading.RLock()
    def my_foo():
        with lock:
            "do foo"

or

    def my_foo():
        with lock:
            "do foo"
    given: lock=threading.RLock()

It's an idea that just popped into my head (*cough* PEP 3150 *cough*).
But seriously, I don't think statement local namespaces have come up
at all in this super-mega thread, which surprises me (or I simply
missed it). Doesn't it need a "killer app"? Maybe this isn't "killer"
enough, but it's something! :)

-eric

From xtian at babbageclunk.com  Sat Oct  1 08:54:16 2011
From: xtian at babbageclunk.com (xtian)
Date: Sat, 1 Oct 2011 07:54:16 +0100
Subject: [Python-ideas] startsin ?
In-Reply-To: 
References: 
Message-ID: 

On Fri, Sep 30, 2011 at 6:57 PM, Mike Graham wrote:
> On Fri, Sep 30, 2011 at 11:46 AM, David Stanek wrote:
>>
>> I tend to do something like this a lot;
>>     any(somestring.startswith(x) for x in starts)
>> Probably enough that having a method would be nice.
>
> I wonder if it might be worthwhile to give any and all two-parameter API for
> predicate functions, so that
>
>     any(f, xs)
>
> is the same as
>
>     
any(f(x) for x in xs) > > This eliminates some really common boilerplate, but it adds complication and > has an ugly API. > If you don't like the comprehension you could always use any(map(f, xs)). From zuo at chopin.edu.pl Sat Oct 1 12:42:12 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sat, 1 Oct 2011 12:42:12 +0200 Subject: [Python-ideas] startsin ? In-Reply-To: <4E86A253.9050603@canterbury.ac.nz> References: <4E85E9AC.8040401@whoosh.ca> <4E86A253.9050603@canterbury.ac.nz> Message-ID: <20111001104212.GA2969@chopin.edu.pl> Greg Ewing dixit (2011-10-01, 18:17): > Matt Chaput wrote: > > >One thing is that the equivalent of .startsin() for .endswith() > >would be .endsin(). In English, "ends in" is a variation of "ends > >with", e.g. "What words end in 'a'?" > > Also "startsin" sounds a bit like beginning to do > something naughty... Both naughty and trigonometry-related. I even fear to guess what it could be... *j From zuo at chopin.edu.pl Sat Oct 1 13:28:48 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sat, 1 Oct 2011 13:28:48 +0200 Subject: [Python-ideas] strtr? (was startsin ? In-Reply-To: References: Message-ID: <20111001112848.GB2969@chopin.edu.pl> INADA Naoki dixit (2011-10-01, 01:18): > I think `strtr`_ in php is also very useful when escaping something. > > _ strtr: http://jp.php.net/manual/en/function.strtr.php > > For example: > > .. code-block:: php > > php> = strtr("foo\\\"bar\\'baz\\\\", array("\\\\"=>"\\", > '\\"'=>'"', "\\'"=>"'")); > "foo\"bar'baz\\" > > .. code-block:: python > > In [1]: "foo\\\"bar\\'baz\\\\".replace('\\"', '"').replace("\\'", > "'").replace('\\\\', '\\') > Out[1]: 'foo"bar\'baz\\' > > In Python, lookup of 'replace' method occurs many times and temporary > strings is created many times too. > It makes Python slower than php. For this particular case I'd use .decode('string_escape') for Py2.x str and .decode('unicode_escape') for Py2.x unicode strings. And in Py3.x I'd use... Er... In Py3.x the planned mechanism of str.transform()/untransform() (here: untransform('unicode_escape')) would be ideal -- but it has not been re-introduced yet: http://bugs.python.org/issue7475 -- and IMHO it should be as a clear way to do such transformations. Cheers. *j From zuo at chopin.edu.pl Sat Oct 1 13:44:28 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Sat, 1 Oct 2011 13:44:28 +0200 Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined In-Reply-To: <20110930213231.GB3996@chopin.edu.pl> References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> Message-ID: <20111001114428.GA3428@chopin.edu.pl> Jan Kaliszewski dixit (2011-09-30, 23:32): > Nick Coghlan dixit (2011-09-30, 14:14): > > > If a "function state decorator" approach is used, then yeah, I agree > > it should come immediately before the main part of the function > > header. > > Yes, it seems to be a necessary requirement. > > > However, I'm not seeing a lot to recommend that kind of syntax > > over the post-arguments '[]' approach. > > IMHO @(...) has two advantages over '[]' approach: > > 1. It keeps all that 'additional scope'-stuff in a separate line, making > the code probably more clear visualy, and not making the crowd of > elements in the '...):' line even more dense (especially if we did use > annotations, e.g.: '...) -> "my annotation":'). > > 2. 
It is more consistent with the existing syntax (not introducing > '='-based syntax within []; [] are already used for two different > things: list literals and item lookup -- both somehow leading your > thoughts to sequence/container-related stuff). And also -- what maybe is even more important -- 3. Placing it before (and beyond) the whole def statement makes it easier to *explain* the syntax: @(x=1, lock=Lock()) def do_foo(y): nonlocal x with lock: x += y return x as being equivalent to: def _closure_provider(): x = 1 lock = Lock() def do_foo(y): nonlocal x with lock: x += y return x return do_foo do_foo = _closure_provider() del _closure_provider Cheers. *j From masklinn at masklinn.net Sat Oct 1 15:24:21 2011 From: masklinn at masklinn.net (Masklinn) Date: Sat, 1 Oct 2011 15:24:21 +0200 Subject: [Python-ideas] strtr? (was startsin ? In-Reply-To: <20111001112848.GB2969@chopin.edu.pl> References: <20111001112848.GB2969@chopin.edu.pl> Message-ID: <8195C93E-9F1A-48FE-BF33-8B01F768E2DC@masklinn.net> On 2011-10-01, at 13:28 , Jan Kaliszewski wrote: > For this particular case I'd use .decode('string_escape') for Py2.x str > and .decode('unicode_escape') for Py2.x unicode strings. > > And in Py3.x I'd use... Er... In Py3.x the planned mechanism of > str.transform()/untransform() (here: untransform('unicode_escape')) > would be ideal -- but it has not been re-introduced yet: > http://bugs.python.org/issue7475 -- and IMHO it should be as a clear way > to do such transformations. If push comes to shove, re.sub with a replacement function. From ron3200 at gmail.com Sat Oct 1 18:58:21 2011 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 01 Oct 2011 11:58:21 -0500 Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined In-Reply-To: <20111001114428.GA3428@chopin.edu.pl> References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> Message-ID: <1317488301.10154.47.camel@Gutsy> On Sat, 2011-10-01 at 13:44 +0200, Jan Kaliszewski wrote: > Jan Kaliszewski dixit (2011-09-30, 23:32): > > > Nick Coghlan dixit (2011-09-30, 14:14): > > > > > If a "function state decorator" approach is used, then yeah, I agree > > > it should come immediately before the main part of the function > > > header. > > > > Yes, it seems to be a necessary requirement. > > > > > However, I'm not seeing a lot to recommend that kind of syntax > > > over the post-arguments '[]' approach. > > > > IMHO @(...) has two advantages over '[]' approach: > > > > 1. It keeps all that 'additional scope'-stuff in a separate line, making > > the code probably more clear visualy, and not making the crowd of > > elements in the '...):' line even more dense (especially if we did use > > annotations, e.g.: '...) -> "my annotation":'). > > > > 2. It is more consistent with the existing syntax (not introducing > > '='-based syntax within []; [] are already used for two different > > things: list literals and item lookup -- both somehow leading your > > thoughts to sequence/container-related stuff). > > And also -- what maybe is even more important -- > > 3. Placing it before (and beyond) the whole def statement makes it > easier to *explain* the syntax: > > @(x=1, lock=Lock()) > def do_foo(y): > nonlocal x > with lock: > x += y > return x Supposedly the @ decorator syntax is supposed to be like a pre-compile substitution where.. 
@decorator
def func(x): x

Is equivalent to...

def func(x): x
func = decorator(func)

IF that is true, it doesn't make the Python core more complex. It would
just be a source rewrite of the affected block in memory just before
it's compiled.

But it seems it's not implemented exactly that way.

def deco(func):
    def _(f):
        return func(f)
    return _

@deco(foo)
def foo(f):
    return f

print(foo('python'))

Results in ... NameError: name 'foo' is not defined

def foo(f):
    return f
foo = deco(foo)

print(foo('python'))

Prints "python" and doesn't cause a name error.

Is this a bug, or by design?

I thought, if the @@ wrapper idea could be put in terms of a simple
substitution template like the @ design, then it may be more doable.

Cheers,
   Ron

> as being equivalent to:
>
>     def _closure_provider():
>         x = 1
>         lock = Lock()
>         def do_foo(y):
>             nonlocal x
>             with lock:
>                 x += y
>             return x
>         return do_foo
>     do_foo = _closure_provider()
>     del _closure_provider

From solipsis at pitrou.net  Sat Oct  1 20:13:20 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 1 Oct 2011 20:13:20 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
Message-ID: <20111001201320.567074bb@pitrou.net>


Hello,

I honestly didn't know we exposed such semantics, and I'm wondering if
the functionality is worth the astonishment:

>>> "abc" in io.StringIO("abc\ndef\n")
False
>>> "abc\n" in io.StringIO("abc\ndef\n")
True

Basically, io.StringIO provides iteration (it yields lines of text) and
containment is apparently inferred from that.

Regards

Antoine.

From masklinn at masklinn.net  Sat Oct  1 20:27:03 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 1 Oct 2011 20:27:03 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111001201320.567074bb@pitrou.net>
References: <20111001201320.567074bb@pitrou.net>
Message-ID: <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>

On 2011-10-01, at 20:13 , Antoine Pitrou wrote:
> Hello,
>
> I honestly didn't know we exposed such semantics, and I'm wondering if
> the functionality is worth the astonishment:
>
>>>> "abc" in io.StringIO("abc\ndef\n")
> False
>>>> "abc\n" in io.StringIO("abc\ndef\n")
> True
>
> Basically, io.StringIO provides iteration (it yields lines of text) and
> containment is apparently inferred from that.

There's a second fallback: Python will also try to iterate using
__getitem__ and integer indexes if __iter__ is not defined.

The resolution is defined in the expressions doc[0]:

> For user-defined classes which define the __contains__() method, x in
> y is true if and only if y.__contains__(x) is true.
>
> For user-defined classes which do not define __contains__() but do
> define __iter__(), x in y is true if some value z with x == z is
> produced while iterating over y. If an exception is raised during the
> iteration, it is as if in raised that exception.
>
> Lastly, the old-style iteration protocol is tried: if a class defines
> __getitem__(), x in y is true if and only if there is a non-negative
> integer index i such that x == y[i], and all lower integer indices do
> not raise IndexError exception. (If any other exception is raised, it
> is as if in raised that exception).

The latter kicks in any time an object with no __iter__ and a
__getitem__ is tentatively iterated; I've made that error a few times
with insufficiently defined dict-like objects finding themselves
(rightly or wrongly) being iterated.
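To see both fallbacks in action, a minimal illustration (a hypothetical
class, runnable as-is): defining only __getitem__ makes an object both
iterable and containment-testable for free.

    class Squares:
        # No __iter__ and no __contains__ -- only the old-style
        # sequence protocol.
        def __getitem__(self, i):
            if i >= 5:
                raise IndexError(i)
            return i * i

    s = Squares()
    print(list(s))    # [0, 1, 4, 9, 16] -- iterated via __getitem__
    print(9 in s)     # True  -- containment inferred by iterating
    print(7 in s)     # False -- found out only by exhausting the "sequence"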
I don't know if there's any way to make a class with an __iter__ or a
__getitem__ respectively non-containing and non-iterable (apart from
adding the method and raising an exception)

[0] http://docs.python.org/reference/expressions.html#notin

From solipsis at pitrou.net  Sat Oct  1 20:33:35 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 1 Oct 2011 20:33:35 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
Message-ID: <20111001203335.40b66fc9@pitrou.net>

On Sat, 1 Oct 2011 20:27:03 +0200
Masklinn wrote:
> On 2011-10-01, at 20:13 , Antoine Pitrou wrote:
> > Hello,
> >
> > I honestly didn't know we exposed such semantics, and I'm wondering if
> > the functionality is worth the astonishment:
> >
> >>>> "abc" in io.StringIO("abc\ndef\n")
> > False
> >>>> "abc\n" in io.StringIO("abc\ndef\n")
> > True
> >
> > Basically, io.StringIO provides iteration (it yields lines of text) and
> > containment is apparently inferred from that.
> There's a second fallback: Python will also try to iterate using
> __getitem__ and integer indexes if __iter__ is not defined.
>
> The resolution is defined in the expressions doc[0]:
>
> > For user-defined classes which define the __contains__() method, x in
> > y is true if and only if y.__contains__(x) is true.
> >
> > For user-defined classes which do not define __contains__() but do
> > define __iter__(), x in y is true if some value z with x == z is
> > produced while iterating over y. If an exception is raised during the
> > iteration, it is as if in raised that exception.

Ah, thanks for the pointer.
I think we should add a custom IOBase.__contains__ raising a TypeError,
then. The current semantics don't match what most people would expect
from a "file containment" predicate.

Regards

Antoine.

From stephen at xemacs.org  Sat Oct  1 20:57:19 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 02 Oct 2011 03:57:19 +0900
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: <1317488301.10154.47.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
Message-ID: <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>

Ron Adam writes:

 > Supposedly the @ decorator syntax is supposed to be like a pre-compile
 > substitution where..
 >
 >     @decorator
 >     def func(x): x
 >
 > Is equivalent to...
 >
 >     def func(x): x
 >     func = decorator(func)
 >
 > IF that is true,

It is.

 > it doesn't make the Python core more complex. It would
 > just be a source rewrite of the affected block in memory just before
 > it's compiled.
 >
 > But it seems it's not implemented exactly that way.
 >
 > def deco(func):
 >     def _(f):
 >         return func(f)
 >     return _
 >
 > @deco(foo)
 > def foo(f):
 >     return f
 >
 > print(foo('python'))
 >
 > Results in ... NameError: name 'foo' is not defined

This is different syntax, whose interpretation is not obvious from the
simpler case.  For example, by analogy to method syntax it could mean

    def foo(f):
        return f
    foo = deco(foo, foo)

(where the first argument is implicit in the decorator syntax = the
function being decorated, and the second is the explicit argument), in
which case you would have gotten a "too few arguments" error instead,
but it doesn't.
In fact, it is defined to be equivalent to

    bar = deco(foo)
    @bar
    def foo(f):
        return f

I.e.,

    bar = deco(foo)
    def foo(f):
        return f
    foo = bar(foo)

(which makes the reason for the error obvious), not

 > def foo(f):
 >     return f
 > foo = deco(foo)

I guess in theory you could think of moving the evaluation of
deco(foo) after the def foo, but then the correct equivalent
expression would be

    def foo(f):
        return f
    foo = deco(foo)(foo)

which I suspect is likely to produce surprising behavior in many
cases.

From arnodel at gmail.com  Sat Oct  1 21:27:12 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Sat, 1 Oct 2011 20:27:12 +0100
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On 1 October 2011 19:57, Stephen J. Turnbull wrote:
> Ron Adam writes:
[...]
>  > But it seems it's not implemented exactly that way.
>  >
>  > def deco(func):
>  >     def _(f):
>  >         return func(f)
>  >     return _
>  >
>  > @deco(foo)
>  > def foo(f):
>  >     return f
>  >
>  > print(foo('python'))
>  >
>  > Results in ... NameError: name 'foo' is not defined
>
[skip explanation that decorator is evaluated before function definition]
>
> I guess in theory you could think of moving the evaluation of
> deco(foo) after the def foo, but then the correct equivalent
> expression would be
>
> def foo(f):
>     return f
> foo = deco(foo)(foo)
>
> which I suspect is likely to produce surprising behavior in many
> cases.

Even this wouldn't work, because in reality (in CPython at least) the
name 'foo' is only bound *once*, that is after the application of the
decorator - as the example below illustrates:

>>> def deco(f): return foo
...
>>> @deco
... def foo(): pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in deco
NameError: global name 'foo' is not defined

Anyway, I've now lost track of how this relates to the subject of this
thread :)

-- 
Arnaud

From taleinat at gmail.com  Sat Oct  1 21:32:43 2011
From: taleinat at gmail.com (Tal Einat)
Date: Sat, 1 Oct 2011 22:32:43 +0300
Subject: [Python-ideas] Proposal: add a calculator statistics module
In-Reply-To: 
References: <4E6EAB33.50006@pearwood.info>
Message-ID: 

On Tue, Sep 13, 2011 at 12:23 PM, Paul Moore wrote:
> On 13 September 2011 05:06, Nick Coghlan wrote:
> > On Tue, Sep 13, 2011 at 11:00 AM, Steven D'Aprano wrote:
> >> I propose adding a basic calculator statistics module to the standard
> >> library, similar to the sorts of functions you would get on a scientific
> >> calculator:
> >>
> >> mean (average)
> >> variance (population and sample)
> >> standard deviation (population and sample)
> >> correlation coefficient
> >>
> >> and similar. I am volunteering to provide, and support, this module, written
> >> in pure Python so other implementations will be able to use it.
> >>
> >> Simple calculator-style statistics seem to me to be a fairly obvious
> >> "battery" to be included, more useful in practice than some functions
> >> already available such as factorial and the hyperbolic functions.
> >
> > And since some folks may not have seen it, Steven's proposal here is
> > following up on a suggestion Raymond Hettinger posted to this list
> > last year:
> >
> > http://mail.python.org/pipermail/python-ideas/2010-October/008267.html
> >
> > From my point of view, I'd make the following suggestions:
> >
> > 1. We should start very small (similar to the way itertools grew over time)
> >
> > To me that means:
> >   mean, median, mode
> >   variance
> >   standard deviation
> >
> > Anything beyond that (including coroutine-style running calculations)
> > is probably better left until 3.4. In the specific case of running
> > calculations, this is to give us a chance to see how coroutine APIs
> > are best written in a world where generators can return values as well
> > as yielding them. Any APIs that would benefit from having access to
> > running variants (such as being able to collect multiple statistics in
> > a single pass) should also be postponed.
> >
> > Some more advanced algorithms could be included as recipes in the
> > initial docs. The docs should also include pointers to more
> > full-featured stats modules for reference when users' needs outgrow
> > the included batteries.
> >
> > 2. The 'math' module is not the place for this, a new, dedicated
> > module is more appropriate. This is mainly due to the fact that the
> > math module is focused primarily on binary floating point, while these
> > algorithms should be neutral with regard to the specific numeric type
> > involved. However, the practical issues with math being a builtin
> > module are also a factor.
> >
> > There are many colours the naming bikeshed could be painted, but I'd
> > be inclined to just call it 'statistics' ('statstools' is unwieldy,
> > and other variants like 'stats', 'simplestats', 'statlib' and
> > 'stats-tools' all exist on PyPI). Since the opportunity to just use
> > the full word is there, we may as well take it.
>
> +1 (both on Steven's original suggestion, and Nick's follow-up
> comment).
>
> I like the suggestion of having a running calculation version, but
> agree that it's probably a bit soon to decide on the best API for such
> things. Recipes in the documentation would be a good start, though.

In the past few months I've done some work on "running calculations"
in Python, and came up with a module I call RunningCalcs:
http://pypi.python.org/pypi/RunningCalcs/
http://bitbucket.org/taleinat/runningcalcs/

It includes comprehensive tests and some benchmarks (in the wiki at
BitBucket). If "running calculations" are to be considered for
inclusion in the stdlib, I propose RunningCalcs as an example
implementation.

Note that implementing calculations in this manner makes performing
several calculations on a single iterable very easy and potentially
efficient. RunningCalcs includes implementations of a few
calculations, including mean, variance and standard deviation, min &
max, several summation algorithms and n-largest & n-smallest.
Implementing a RunningCalc is simple and straight-forward.
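For instance, a running mean might look roughly like this (a minimal
sketch following the feed()/value interface shown below; the actual
RunningCalcs classes may differ in detail):

    class RunningMean(object):
        # Minimal sketch of a running-calculation object: feed()
        # absorbs one input at a time, .value is the result so far.
        def __init__(self):
            self._n = 0
            self._total = 0.0

        def feed(self, x):
            self._n += 1
            self._total += x

        @property
        def value(self):
            return self._total / self._n if self._n else None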
Usage is as follows:

    # feeding inputs directly to the RunningCalc instances,
    # one input at a time
    mean_rc, stddev_rc = RunningMean(), RunningStdDev()
    for x in inputs:
        mean_rc.feed(x)
        stddev_rc.feed(x)
    mean, stddev = mean_rc.value, stddev_rc.value

    # easy & fast calculation using apply_in_parallel()
    a_i_p = apply_in_parallel
    mean, stddev = a_i_p(inputs, [RunningMean(), RunningStdDev()])
    small5, large5 = a_i_p(inputs, [RunningNSmallest(5), RunningNLargest(5)])

Regarding co-routines: During development I considered using
co-routine generators; my implementation of Kahan summation still uses
such a generator. I've found this isn't a good generic method for
implementing "running calculations", mainly because such a generator
must return the current value at each iteration, even though this
value is usually not needed nearly so often. For example, implementing
a running version of n-largest using a co-routine/generator would
introduce a large overhead, whereas my version is as fast as
_heapq.nlargest (which is implemented in C -- see benchmarks for
details).

- Tal Einat

From masklinn at masklinn.net  Sat Oct  1 21:40:37 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 1 Oct 2011 21:40:37 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111001203335.40b66fc9@pitrou.net>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<20111001203335.40b66fc9@pitrou.net>
Message-ID: <94F06486-B4E6-4AA9-B8C5-FC0119141B3C@masklinn.net>

On 2011-10-01, at 20:33 , Antoine Pitrou wrote:
> Ah, thanks for the pointer.
> I think we should add a custom IOBase.__contains__ raising a TypeError,
> then. The current semantics don't match what most people would expect
> from a "file containment" predicate.

I think it'd also be nice to add them to some "chopping block" list for
Python 4: I've yet to see these fallbacks result in anything but pain
and suffering, and they can be genuinely destructive and cause
hard-to-track bugs, especially with modern Python code's tendency to
use (non-restartable) iterators and generators (send a generator to a
function which seems to take them, it performs some sort of containment
check before processing, the containment consumes the generator and
proceeds to not do any processing?)

From pyideas at rebertia.com  Sat Oct  1 22:24:18 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 1 Oct 2011 13:24:18 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
Message-ID: 

On Sat, Oct 1, 2011 at 11:27 AM, Masklinn wrote:
> On 2011-10-01, at 20:13 , Antoine Pitrou wrote:
>> Hello,
>>
>> I honestly didn't know we exposed such semantics, and I'm wondering if
>> the functionality is worth the astonishment:
>>
>>>>> "abc" in io.StringIO("abc\ndef\n")
>> False
>>>>> "abc\n" in io.StringIO("abc\ndef\n")
>> True
>>
>> Basically, io.StringIO provides iteration (it yields lines of text) and
>> containment is apparently inferred from that.

For comparison, it's interesting to note that
collections.abc.{Iterable, Iterator} don't implement or require a
__contains__() method.

> There's a second fallback: Python will also try to iterate using
> __getitem__ and integer indexes if __iter__ is not defined.
Again, for comparison, collections.abc.Sequence /does/ define default
__contains__() and __iter__() methods, in terms of __len__() and
__getitem__().

> the latter kicks in any time an object with no __iter__ and a __getitem__
> is tentatively iterated, I've made that error a few times with
> insufficiently defined dict-like objects finding themselves (rightly
> or wrongly) being iterated.

Requiring the explicit marking of a class as a sequence by inheriting
from the Sequence ABC in order to get such default behavior "for free"
seems quite reasonable. And having containment defined by default on
potentially-infinite iterators seems unwise. +1 on the suggested
removals.

Cheers,
Chris

From ron3200 at gmail.com  Sat Oct  1 22:36:35 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 01 Oct 2011 15:36:35 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1317501395.10536.53.camel@Gutsy>

On Sun, 2011-10-02 at 03:57 +0900, Stephen J. Turnbull wrote:
> In fact, it is defined to be equivalent to
>
>     bar = deco(foo)
>     @bar
>     def foo(f):
>         return f
>
> I.e.,
>
>     bar = deco(foo)
>     def foo(f):
>         return f
>     foo = bar(foo)
>
> (which makes the reason for the error obvious), not
>
> > def foo(f):
> >     return f
> > foo = deco(foo)
>
> I guess in theory you could think of moving the evaluation of
> deco(foo) after the def foo, but then the correct equivalent
> expression would be
>
>     def foo(f):
>         return f
>     foo = deco(foo)(foo)
>
> which I suspect is likely to produce surprising behavior in many
> cases.

Actually, this is what I thought it was supposed to be.  My example
wasn't very good as it returned valid results in more than one path
depending on how it was called. (not intentionally)

Lets try a clearer example.

def supply_y(y):
    def _(func):
        def wrapper(x):
            return func(x, y)
        return wrapper
    return _

@supply_y(2)
def add(x, y):
    return x + y

PEP 318 says it should translate to ...

def add(x, y):
    return x + y
add = supply_y(2)(add)

add(3)  ---> 5      # call wrapper with x, which calls add with (x, y)

But the behavior is as you suggest...

_temp = supply_y(2)
def add(x, y):
    return x + y
add = _temp(add)

add(3) ---> 5

This isn't what PEP 318 says it should be.

Is this intentional, and does PEP 318 need updating?

Or an unintended implementation detail?

If it wasn't intended, then what?

Cheers,
   Ron
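The ordering Ron is asking about is easy to observe directly.  A small
runnable check (with made-up trace names) shows that the decorator
expression is evaluated before the decorated function's name is ever
bound:

    order = []

    def deco_factory():
        # Stands in for an expression like supply_y(2).
        order.append("decorator expression evaluated")
        def deco(func):
            order.append("decorator applied to " + func.__name__)
            return func
        return deco

    @deco_factory()
    def add(x, y):
        return x + y

    print(order)
    # ['decorator expression evaluated', 'decorator applied to add']
    # i.e. the expression after '@' runs first, which is why writing
    # '@deco(foo)' above 'def foo' raises NameError: 'foo' is not
    # bound yet.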
From arnodel at gmail.com  Sat Oct  1 23:03:10 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Sat, 1 Oct 2011 22:03:10 +0100
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: <1317501395.10536.53.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
	<1317501395.10536.53.camel@Gutsy>
Message-ID: 

On 1 October 2011 21:36, Ron Adam wrote:
[...]
> def supply_y(y):
>     def _(func):
>         def wrapper(x):
>             return func(x, y)
>         return wrapper
>     return _
>
> @supply_y(2)
> def add(x, y):
>     return x + y
>
> PEP 318 says it should translate to ...
>
> def add(x, y):
>     return x + y
> add = supply_y(2)(add)
>
> add(3)  ---> 5      # call wrapper with x, which calls add with (x, y)
>
> But the behavior is as you suggest...
>
> _temp = supply_y(2)
> def add(x, y):
>     return x + y
> add = _temp(add)
>
> add(3) ---> 5
>
> This isn't what PEP 318 says it should be.
>
> Is this intentional, and does PEP 318 need updating?
>
> Or an unintended implementation detail?
>
> If it wasn't intended, then what?

I don't know if it is intentional, but it is probably a consequence of
the fact that the CALL_FUNCTION opcode expects the function below its
arguments in the stack.  So the most concise way of implementing

    @<expression>
    def f(x):
        ....

is to:

1. evaluate <expression> and push the result on the stack
2. create function f and push it on the stack
3. execute CALL_FUNCTION
4. store the result in 'f'

If you swap steps 1 and 2, then you need to add a new step:

2.5 execute ROT_TWO (swap TOS and TOS1)

From ron3200 at gmail.com  Sat Oct  1 23:15:31 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 01 Oct 2011 16:15:31 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: 
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1317503731.11160.33.camel@Gutsy>

On Sat, 2011-10-01 at 20:27 +0100, Arnaud Delobelle wrote:
> Anyway, I've now lost track of how this relates to the subject of this
> thread :)

If a decorator can take the function name it is decorating, then Nick's
example of using a private name for recursion becomes easier to do.
According to PEP 318, it should be possible, but it's not implemented
the way the PEP describes.

Also, if decorators are, or can be, implemented as a
before-compile-time template translation, then these 'sugar' features
won't make the underlying core more complicated or complex.

And finally, a decorator-type 'template' solution may also work to
create closures. (Without making the core more complex.)

Cheers,
   Ron

From ncoghlan at gmail.com  Sun Oct  2 03:01:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 1 Oct 2011 21:01:08 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
Message-ID: 

On Sat, Oct 1, 2011 at 4:24 PM, Chris Rebert wrote:
> Requiring the explicit marking of a class as a sequence by inheriting
> from the Sequence ABC in order to get such default behavior "for free"
> seems quite reasonable. And having containment defined by default on
> potentially-infinite iterators seems unwise. +1 on the suggested
> removals.

-1 to any removals - fallback protocols are the heart of duck-typing;
the sequence of checks here is simply the longstanding one of
permitting containment tests on any iterable by default, and providing
default iterators for sequences that don't provide their own.

However, +1 for adding an IOBase __contains__ that raises TypeError.
This will need to go through the DeprecationWarning dance, though
(i.e. for 3.3, issue the warning before falling back on the current
iteration semantics).
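Such a transitional __contains__ might look roughly like this (a
sketch on an io.StringIO subclass so it runs today; the method body,
the warning text and the eventual placement on IOBase are assumptions,
not an actual patch):

    import io
    import warnings

    class StringIO(io.StringIO):
        # Sketch of the "warn now, raise TypeError later" transition:
        # keep the old iteration-based containment for one release,
        # but tell users it is going away.
        def __contains__(self, line):
            warnings.warn("containment tests on IO objects will raise "
                          "TypeError in a future version",
                          DeprecationWarning, stacklevel=2)
            return any(each == line for each in self)

    print("abc\n" in StringIO("abc\ndef\n"))
    # True (and a DeprecationWarning is emitted)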
The current semantics are strange, but it's well within the realm of
possibility for someone to be relying on them.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Sun Oct  2 03:14:01 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 02 Oct 2011 12:14:01 +1100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
Message-ID: <4E87BAD9.2070501@pearwood.info>

Chris Rebert wrote:
>
>> the latter kicks in any time an object with no __iter__ and a __getitem__
>> is tentatively iterated, I've made that error a few times with
>> insufficiently defined dict-like objects finding themselves (rightly
>> or wrongly) being iterated.
>
> Requiring the explicit marking of a class as a sequence by inheriting
> from the Sequence ABC in order to get such default behavior "for free"
> seems quite reasonable. And having containment defined by default on
> potentially-infinite iterators seems unwise. +1 on the suggested
> removals.

These changes don't sound even close to reasonable to me. It seems to
me that the OP is making a distinction that doesn't exist.

If you can write this:

    x = collection[0]; do_something_with(x)
    x = collection[1]; do_something_with(x)
    x = collection[2]; do_something_with(x)
    # ... etc.

then you can write it in a loop by hand:

    i = -1
    try:
        while True:
            i += 1
            x = collection[i]
            do_something_with(x)
    except IndexError:
        pass

But that's just a for-loop in disguise. The for-loop protocol goes all
the way back to Python 1.5 and surely even older. You should, and can,
write this:

    for x in collection:
        do_something_with(x)

Requiring collection to explicitly inherit from a Sequence ABC breaks
duck typing and is anti-Pythonic.

I can't comprehend a use-case where manually extracting collection[i]
for sequential values of i should succeed, but doing it in a for-loop
should fail. But if you have such a use-case, feel free to define
__iter__ to raise an exception.

Since iteration over elements is at the heart of containment tests, the
same reasoning applies to __contains__.

-- 
Steven

From pyideas at rebertia.com  Sun Oct  2 03:21:04 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 1 Oct 2011 18:21:04 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E87BAD9.2070501@pearwood.info>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
Message-ID: 

On Sat, Oct 1, 2011 at 6:14 PM, Steven D'Aprano wrote:
> Requiring collection to explicitly inherit from a Sequence ABC breaks duck
> typing and is anti-Pythonic.

Actually, my suggestion was just that Sequence is one possible (but
clean) way to obtain the behavior; one could *of course* reimplement
the functionality without recourse to Sequence if they desired.

Cheers,
Chris

From ncoghlan at gmail.com  Sun Oct  2 04:11:46 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 1 Oct 2011 22:11:46 -0400
Subject: [Python-ideas] Tweaking closures and lexical scoping to include
	the function being defined
In-Reply-To: <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Sat, Oct 1, 2011 at 2:57 PM, Stephen J. Turnbull wrote:
> Ron Adam writes:
>
>  > Supposedly the @ decorator syntax is supposed to be like a pre-compile
>  > substitution where..
>  >
>  >     @decorator
>  >     def func(x): x
>  >
>  > Is equivalent to...
>  >
>  >     def func(x): x
>  >     func = decorator(func)
>  >
>  > IF that is true,
>
> It is.

It isn't quite - the name binding doesn't happen until *after* the
decorator chain has been invoked, so the function is anonymous while
the decorators are executing. In addition, the decorator expressions
themselves are evaluated before the function is defined. That's not a
problem in practice, since each decorator gets passed the result of
the previous one (or the original function for the innermost
decorator). The real equivalent code is more like:

    <anon1> = decorator
    def <anon2>(x):
        return x
    func = <anon1>(<anon2>)

(IIRC, the anonymous references happen to be stored on the frame stack
in CPython, but that's an implementation detail)

As far as the proposed semantics for any new syntax to eliminate the
desire to use the default argument hack goes, I haven't actually heard
any complaints about any addition being syntactic sugar for the
following closure idiom:

    def <anon>():
        NAME = EXPR
        def FUNC(ARGLIST):
            """DOC"""
            nonlocal NAME
            BODY
        return FUNC
    FUNC = <anon>()

The debate focuses on whether or not there is any possible shorthand
spelling for those semantics that successfully negotiates the Zen of
Python:

Beautiful is better than ugly.
- the default argument hack is actually quite neat and tidy if you
know what it means. Whatever we do should be at least as attractive as
that approach.

Explicit is better than implicit.
- the line between the default argument hack and normal default
arguments is blurry. New syntax would fix that.

Simple is better than complex.
- lexical scoping left simple behind years ago ;)

Complex is better than complicated.
- IMO, the default argument hack is complicated, since it abuses a
tool meant for something else, whereas function state variables would
be just another tier in the already complex scoping spectrum from
locals through lexical scoping to module globals and builtins (with
function state variables slotting in neatly between ordinary locals
and lexically scoped nonlocals).

Flat is better than nested.
- There's a lot of visual nesting going on if you spell out these
semantics as a closure or as a class. The appeal of the default
argument hack largely lies in its ability to flatten that out into
state storage on the function object itself.

Sparse is better than dense.
- This would be the main argument for having something before the
header line (decorator style) rather than cramming yet more
information into the header line itself. However, it's also an
argument against decorator-style syntax, since that is quite heavy on
the page (due to the busyness of the '@' symbol in most fonts).

Readability counts.
- The class and closure solutions are not readable - that's the big
reason people opt for the default argument hack when it applies. It
remains to be seen if we can come up with dedicated syntax that is at
least as readable as the default argument hack itself.

Special cases aren't special enough to break the rules.
- I think this is the heart of what killed the "inside the function
scope" variants for me. They're *too* magical and different from the
way other code at function scope works to be a good fit.

Although practicality beats purity.
- Using the default argument hack in the first place is the epitome of
this :)

Errors should never pass silently. Unless explicitly silenced.
- This is why 'nonlocal x', where x is not defined in a lexical scope,
is, and will remain, a SyntaxError, and why nonlocal and global
declarations that conflict with the parameter list are also errors.
Similar constraints would be placed on any new syntax dedicated to
function state variables.

In the face of ambiguity, refuse the temptation to guess.
- In the case of name bindings, the compiler doesn't actually *guess*
anything - name bindings create local variables, unless overridden by
some other piece of syntax (i.e. a nonlocal or global declaration).
This may, of course, look like guessing to developers that don't
understand the scoping rules yet. The challenge for function state
variables is coming up with a similarly unambiguous syntax that still
allows them to be given an initial state at function definition time.

There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
- For me, these two are about coming up with a syntax that is easy to
*remember* once you know it, even if you have to look up what it means
the first time you encounter it. Others set the bar higher and want
developers to have a reasonable chance of *guessing* what it means
without actually reading the documentation for the new feature. I
think the latter goal is unattainable and hence not a useful standard.
However, I'll also note that the default argument hack itself does
meet *my* interpretation of this guideline (if you know it,
recognising it and remembering it aren't particularly difficult).

Now is better than never. Although never is often better than *right* now.
- The status quo has served us well for a long time. If someone can
come up with an elegant syntax, great, let's pursue it. Otherwise,
this whole issue really isn't that important in the grand scheme of
things (although a PEP to capture the current 'state of the art'
thinking on the topic would still be nice - I believe Jan and Eric
still plan to get to that once the discussion dies down again).

If the implementation is hard to explain, it's a bad idea. If the
implementation is easy to explain, it may be a good idea.
- this is where I think the specific proposal to just add syntactic
sugar for a particular usage of an ordinary closure is a significant
improvement on past attempts. Anyone that understands how closures
work will understand the meaning of the new syntax, just as anyone
that fully understands 'yield' can understand PEP 380's 'yield from'.

Namespaces are one honking great idea -- let's do more of those!
- closures are just another form of namespace, even though people
typically think of classes, modules and packages when contemplating
this precept. "Function state variables" would be formalising the
namespace where default argument values live (albeit anonymously) and
making it available for programmatic use.

Despite its flaws, the simple brackets enclosed list after the
function parameter list is still my current favourite:

    def global_counter(x) [n=0, lock=Lock()]:
        with lock:
            n += 1
            yield n

It just composes more nicely with decorators than the main alternative
still standing and is less prone to overwhelming the function name
with extraneous implementation details:

    @contextmanager
    def counted() [active=collections.Counter(), lock=threading.RLock()]:
        with lock:
            active[threading.current_thread().ident] += 1
        yield active
        with lock:
            active[threading.current_thread().ident] -= 1

far more clearly conveys "this defines a context manager named
'counted'" than the following does:

    @contextmanager
    @(active=collections.Counter(), lock=threading.RLock())
    def counted():
        with lock:
            active[threading.current_thread().ident] += 1
        yield active
        with lock:
            active[threading.current_thread().ident] -= 1

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Oct  2 04:30:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 1 Oct 2011 22:30:21 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
Message-ID: 

On Sat, Oct 1, 2011 at 9:21 PM, Chris Rebert wrote:
> Actually, my suggestion was just that Sequence is one possible (but
> clean) way to obtain the behavior; one could *of course* reimplement
> the functionality without recourse to Sequence if they desired.

But why would forcing everyone implementing standard sequences to
reinvent the wheel be an improvement over the status quo? It would be
like removing the check for "x.__len__() == 0" from boolean
conversions.

Duck-typed fallback protocols mean that you get some behaviour for
free without inheriting from anything in particular. If there's a
protocol that's a problem in a given case, then disable it or *just
don't use it* (e.g. people don't *do* containment tests on infinite
iterators, or, if they do, they quickly learn that triggering infinite
loops is a bad idea).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From raymond.hettinger at gmail.com  Sun Oct  2 07:13:29 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 2 Oct 2011 01:13:29 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111001201320.567074bb@pitrou.net>
References: <20111001201320.567074bb@pitrou.net>
Message-ID: <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>

On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote:
> I honestly didn't know we exposed such semantics, and I'm wondering if
> the functionality is worth the astonishment:

Since both __iter__ and __contains__ are deeply tied to "in-ness", it
isn't really astonishing that they are related. For many classes, if
"any(elem==obj for obj in s)" is True, then "elem in s" will also be
True. Conversely, it isn't unreasonable to expect this code to succeed:

    for elem in s:
        assert elem in s

The decision to make __contains__ work whenever __iter__ is defined
probably goes back to Py2.2. That seems to have worked out well for
most users, so I don't see a reason to change that now.

Raymond
Cheers, Chris From ncoghlan at gmail.com Sun Oct 2 04:11:46 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 22:11:46 -0400 Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined In-Reply-To: <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sat, Oct 1, 2011 at 2:57 PM, Stephen J. Turnbull wrote: > Ron Adam writes: > > ?> Supposedly the @ decorator syntax is supposed to be like a pre-compile > ?> substitution where.. > ?> > ?> ? ? @decorator > ?> ? ? def func(x):x > ?> > ?> Is equivalent to... > ?> > ?> ? ? def func(x):x > ?> ? ? func = decorator(func) > ?> > ?> IF that is true, > > It is. It isn't quite - the name binding doesn't happen until *after* the decorator chain has been invoked, so the function is anonymous while the decorators are executing. In addition, the decorator expressions themselves are evaluated before the function is defined. That's not a problem in practice, since each decorator gets passed the result of the previous one (or the original function for the innermost decorator). The real equivalent code is more like: = decorator def (x): return x func = () (IIRC, the anonymous references happen to be stored on the frame stack in CPython, but that's an implementation detail) As far as the proposed semantics for any new syntax to eliminate the desire to use the default argument hack goes, I haven't actually heard any complaints about any addition being syntactic sugar for the following closure idiom: def (): NAME = EXPR def FUNC(ARGLIST): """DOC""" nonlocal NAME BODY return FUNC FUNC = () The debate focuses on whether or not there is any possible shorthand spelling for those semantics that successfully negotiates the Zen of Python: Beautiful is better than ugly. - the default argument hack is actually quite neat and tidy if you know what it means. Whatever we do should be at least as attractive as that approach. Explicit is better than implicit. - the line between the default argument hack and normal default arguments is blurry. New syntax would fix that. Simple is better than complex. - lexical scoping left simple behind years ago ;) Complex is better than complicated. - IMO, the default argument hack is complicated, since it abuses a tool meant for something else, whereas function state variables would be just another tier in the already complex scoping spectrum from locals through lexical scoping to module globals and builtins (with function state variables slotting in neatly between ordinary locals and lexically scoped nonlocals). Flat is better than nested. - There's a lot of visual nesting going on if you spell out these semantics as a closure or as a class. The appeal of the default argument hack largely lies in its ability to flatten that out into state storage on the function object itself Sparse is better than dense. - This would be the main argument for having something before the header line (decorator style) rather than cramming yet more information into the header line itself. However, it's also an argument against decorator-style syntax, since that is quite heavy on the page (due to the business of the '@' symbol in most fonts) Readability counts. 
- The class and closure solutions are not readable - that's the big reason people opt for the default argument hack when it applies. It remains to be seen if we can come up with dedicated syntax that is at least as readable as the default argument hack itself.

Special cases aren't special enough to break the rules.
- I think this is the heart of what killed the "inside the function scope" variants for me. They're *too* magical and different from the way other code at function scope works to be a good fit.

Although practicality beats purity.
- Using the default argument hack in the first place is the epitome of this :)

Errors should never pass silently. Unless explicitly silenced.
- This is why 'nonlocal x', where x is not defined in a lexical scope, is, and will remain, a Syntax Error and why nonlocal and global declarations that conflict with the parameter list are also errors. Similar constraints would be placed on any new syntax dedicated to function state variables.

In the face of ambiguity, refuse the temptation to guess.
- In the case of name bindings, the compiler doesn't actually *guess* anything - name bindings create local variables, unless overridden by some other piece of syntax (i.e. a nonlocal or global declaration). This may, of course, look like guessing to developers that don't understand the scoping rules yet. The challenge for function state variables is coming up with a similarly unambiguous syntax that still allows them to be given an initial state at function definition time.

There should be one-- and preferably only one --obvious way to do it. Although that way may not be obvious at first unless you're Dutch.
- For me, these two are about coming up with a syntax that is easy to *remember* once you know it, even if you have to look up what it means the first time you encounter it. Others set the bar higher and want developers to have a reasonable chance of *guessing* what it means without actually reading the documentation for the new feature. I think the latter goal is unattainable and hence not a useful standard. However, I'll also note that the default argument hack itself does meet *my* interpretation of this guideline (if you know it, recognising it and remembering it aren't particularly difficult)

Now is better than never. Although never is often better than *right* now.
- The status quo has served us well for a long time. If someone can come up with an elegant syntax, great, let's pursue it. Otherwise, this whole issue really isn't that important in the grand scheme of things (although a PEP to capture the current 'state of the art' thinking on the topic would still be nice - I believe Jan and Eric still plan to get to that once the discussion dies down again)

If the implementation is hard to explain, it's a bad idea. If the implementation is easy to explain, it may be a good idea.
- this is where I think the specific proposal to just add syntactic sugar for a particular usage of an ordinary closure is a significant improvement on past attempts. Anyone that understands how closures work will understand the meaning of the new syntax, just as anyone that fully understands 'yield' can understand PEP 380's 'yield from'.

Namespaces are one honking great idea -- let's do more of those!
- closures are just another form of namespace, even though people typically think of classes, modules and packages when contemplating this precept.
"Function state variables" would be formalising the namespace where default argument values live (albeit anonymously) and making it available for programmatic use. Despite its flaws, the simple brackets enclosed list after the function parameter list is still my current favourite: def global_counter(x) [n=0, lock=Lock()]: with lock: n += 1 yield n It just composes more nicely with decorators than the main alternative still standing and is less prone to overwhelming the function name with extraneous implementation details: @contextmanager def counted() [active=collections.Counter(), lock=threading.RLock()]: with lock: active[threading.current_thread().ident] += 1 yield active with lock: active[threading.current_thread().ident] -= 1 far more clearly conveys "this defines a context manager named 'counted'" than the following does: @contextmanager @(active=collections.Counter(), lock=threading.RLock()) def counted(): with lock: active[threading.current_thread().ident] += 1 yield active with lock: active[threading.current_thread().ident] -= 1 Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sun Oct 2 04:30:21 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 1 Oct 2011 22:30:21 -0400 Subject: [Python-ideas] __iter__ implies __contains__? In-Reply-To: References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> Message-ID: On Sat, Oct 1, 2011 at 9:21 PM, Chris Rebert wrote: > Actually, my suggestion was just that Sequence is one possible (but > clean) way to obtain the behavior; one could *of course* reimplement > the functionality without recourse to Sequence if they desired. But why would that would forcing everyone implementation standard sequences to reimplement the wheel be an improvement over the status quo? It would be like removing the check for "x.__len__() == 0" from boolean conversions. Duck-typed fallback protocols mean that you get some behaviour for free without inheriting from anything in particular. If there's a protocol that's a problem in a given case, then disable it or *just don't use it* (e.g. people don't *do* containment tests on infinite iterators, or, if they do, they quickly learn that triggering infinite loops is a bad idea). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From raymond.hettinger at gmail.com Sun Oct 2 07:13:29 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 2 Oct 2011 01:13:29 -0400 Subject: [Python-ideas] __iter__ implies __contains__? In-Reply-To: <20111001201320.567074bb@pitrou.net> References: <20111001201320.567074bb@pitrou.net> Message-ID: <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com> On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote: > I honestly didn't know we exposed such semantics, and I'm wondering if > the functionality is worth the astonishement: Since both __iter__ and __contains__ are deeply tied to "in-ness", it isn't really astonishing that they are related. For many classes, if "any(elem==obj for obj in s)" is True, then "elem in s" will also be True. Conversely, it isn't unreasonable to expect this code to succeed: for elem in s: assert elem in s The decision to make __contains__ work whenever __iter__ is defined probably goes back to Py2.2. That seems to have worked out well for most users, so I don't see a reason to change that now. Raymond -------------- next part -------------- An HTML attachment was scrubbed... 
From ron3200 at gmail.com Sun Oct 2 08:29:35 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 02 Oct 2011 01:29:35 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: 
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1317536975.12256.168.camel@Gutsy>

On Sat, 2011-10-01 at 22:11 -0400, Nick Coghlan wrote:

+1 on all of the zen statements of course. I think you made a fine case for being careful and mindful about this stuff. :-)

> Namespaces are one honking great idea -- let's do more of those!
> - closures are just another form of namespace, even though people
> typically think of classes, modules and packages when contemplating
> this precept. "Function state variables" would be formalising the
> namespace where default argument values live (albeit anonymously) and
> making it available for programmatic use.

One way to think of this is, Private, Shared, and Public, name spaces. Private and Public are locals and globals, and are pretty well supported, but Shared name spaces, (closures or otherwise) are not well supported.

I think the whole concept of explicit shared name spaces, separate from globals and locals is quite important and should be done carefully. I don't think it is just about one or two use-cases that a small tweak will cover.

> Despite its flaws, the simple brackets enclosed list after the
> function parameter list is still my current favourite:
>
>     def global_counter(x) [n=0, lock=Lock()]:
>         with lock:
>             n += 1
>             yield n
>
> It just composes more nicely with decorators than the main alternative
> still standing and is less prone to overwhelming the function name
> with extraneous implementation details:

This syntax to me looks a bit too close to a list literal. How about a name space literal? i.e. a dictionary.

    def global_counter(x) {n:0, lock=lock}:
        with lock:
            n += 1
            yield n

I think that looks better than dict(n=0, lock=lock). And when used as a repr for name spaces, it is more readable.

A literal would cover the default values use case quite nicely. A reference to a pre-defined dictionary would cover values shared between different functions independent of scope.

>     @contextmanager
>     def counted() [active=collections.Counter(), lock=threading.RLock()]:
>
> far more clearly conveys "this defines a context manager named
> 'counted'" than the following does:
>
>     @contextmanager
>     @(active=collections.Counter(), lock=threading.RLock())
>     def counted():

Putting them after the function signature will result in more wrapped function signatures.

While it's very interesting to try to find a solution, I am also concerned about what this might mean in the long term. Particularly we will see more meta programming. Being able to initiate an object from one or more other objects can be very nice. Python does that sort of thing all over the place.
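The pre-defined dictionary idea can be approximated in today's Python by binding a shared mapping through a default argument; a rough sketch with invented names:

    import threading

    shared = {"n": 0, "lock": threading.Lock()}

    def bump(state=shared):     # bound once, at definition time
        with state["lock"]:
            state["n"] += 1
            return state["n"]

    def peek(state=shared):     # a second function sharing the same names
        return state["n"]

    bump(); bump()
    assert peek() == 2

The names are only shared through the mapping, though, not as real variables - which is the gap the proposed syntax is trying to close.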
Cheers, Ron


From cmjohnson.mailinglist at gmail.com Sun Oct 2 08:43:38 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Sat, 1 Oct 2011 20:43:38 -1000
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <1317536975.12256.168.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317536975.12256.168.camel@Gutsy>
Message-ID: <0C59FB88-E30F-47CC-B7BA-665F960991C2@gmail.com>

On Oct 1, 2011, at 8:29 PM, Ron Adam wrote:
>
> How about a name space literal? i.e. a dictionary.
>
>     def global_counter(x) {n:0, lock=lock}:
>         with lock:
>             n += 1
>             yield n

Yeah, but it would break this existing code:

    from __future__ import braces


From stephen at xemacs.org Sun Oct 2 09:26:13 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 02 Oct 2011 16:26:13 +0900
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <1317501395.10536.53.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317501395.10536.53.camel@Gutsy>
Message-ID: <8762k7hn1m.fsf@uwakimon.sk.tsukuba.ac.jp>

Ron Adam writes:

> PEP 318 says it should translate to ... [...]
> add(3) ---> 5  # call wrapper with x, which calls add with (x, y)
[...]
> But the behavior is as you suggest... [...]
> add(3) ---> 5

That's what I get, too: the same result using decorator syntax and using an explicit call to supply_y. What's the problem?

As far as the subject of the thread goes, though, I don't see how this can help. The problem isn't that there's no way to refer to the decorated function, because there is a reference to it (the argument to the actual decorator, e.g. the third foo in "foo = deco(foo)(foo)" which is an "anonymous reference" in the decorator implementation in CPython). The problem isn't referring to the name of the function, either; we don't care what the name is here.


From stephen at xemacs.org Sun Oct 2 10:16:59 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 02 Oct 2011 17:16:59 +0900
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: 
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:

> It isn't quite - the name binding doesn't happen until *after* the
> decorator chain has been invoked, so the function is anonymous while
> the decorators are executing.

As I understand the issue here, as far as the decorators are concerned, the reference passed by the decorator syntax should be enough to do any namespace manipulations that are possible in a (non-magic) decorator. Am I missing something?
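One thing a non-magic decorator can already do with that reference is hang state off the function object itself; a hedged sketch with invented names (these are function attributes, not true cell variables):

    def with_state(**state):
        def deco(func):
            for name, value in state.items():
                setattr(func, name, value)  # uses the reference we receive
            return func
        return deco

    @with_state(n=0)
    def counter():
        counter.n += 1      # looked up through the module namespace
        return counter.n

    assert counter() == 1 and counter() == 2

Note that this is attribute access, not scoping, so it does not answer the compile-time question below.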
That is, in

> the following closure idiom:
>
>     def <outer>():
>         NAME = EXPR
>         def FUNC(ARGLIST):

surely the FUNC above ...

>             """DOC"""
>             nonlocal NAME
>             BODY
>         return FUNC

... doesn't need to be the *same identifier* as the FUNC below?

>     FUNC = <outer>()

Isn't magic needed solely to inject the nonlocal statement(s) into the definition of FUNC inside <outer> at compile-time?


From masklinn at masklinn.net Sun Oct 2 13:21:35 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 2 Oct 2011 13:21:35 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
Message-ID: <1E944270-5E3E-4F3A-873B-48C4A3C5205F@masklinn.net>

On 2011-10-02, at 03:01 , Nick Coghlan wrote:
> On Sat, Oct 1, 2011 at 4:24 PM, Chris Rebert wrote:
>> Requiring the explicit marking of a class as a sequence by inheriting
>> from the Sequence ABC in order to get such default behavior "for free"
>> seems quite reasonable. And having containment defined by default on
>> potentially-infinite iterators seems unwise. +1 on the suggested
>> removals.
>
> -1 to any removals - fallback protocols are the heart of duck-typing

I very much disagree with this assertion. In fact I'd make the opposite one: fallback protocols are a perversion of duck-typing and only serve to make it less reliable and less predictable.

In keeping with ducks, you were looking for one, didn't find anything which quacked or looked like a duck. The fallback protocol kicks in, you get something with feathers which you found near water and went "good enough". Now you drop it from 10000m because you're looking into the efficiency of a duck's flight starting airborne, and observe your quite dismayed penguin barreling towards the ground.

A few seconds later, you find yourself not with additional experimental data but with a small indentation in the earth and a big mess all over it.


From masklinn at masklinn.net Sun Oct 2 13:28:53 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 2 Oct 2011 13:28:53 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E87BAD9.2070501@pearwood.info>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info>
Message-ID: <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>

On 2011-10-02, at 03:14 , Steven D'Aprano wrote:
> These changes don't sound even close to reasonable to me. It seems to me
> that the OP is making a distinction that doesn't exist.
>
> If you can write this:
>
>     x = collection[0]; do_something_with(x)
>     x = collection[1]; do_something_with(x)
>     x = collection[2]; do_something_with(x)
>     # ... etc.
>
> then you can write it in a loop by hand:
>
>     i = -1
>     try:
>         while True:
>             i += 1
>             x = collection[i]
>             do_something_with(x)
>     except IndexError:
>         pass
>
> But that's just a for-loop in disguise. The for-loop protocol goes all the
> way back to Python 1.5 and surely even older. You should, and can, write
> this:
>
>     for x in collection:
>         do_something_with(x)
>
> Requiring collection to explicitly inherit from a Sequence ABC breaks duck
> typing and is anti-Pythonic.
>
> I can't comprehend a use-case where manually extracting collection[i] for
> sequential values of i should succeed

You can write this:

    x = collection['foo']; do_something_with(x)
    x = collection['bar']; do_something_with(x)
    x = collection['baz']; do_something_with(x)

you can't write either of the other two options, but since Python calls the exact same method, if you somehow do a containment check (or an iteration) of a simple k:v collection, instead of getting a clear exception about a missing `__iter__` or `__contains__` you get a not-very-informative `KeyError: 0` 3 or 4 levels down the stack, and now have to hunt how in hell's name somebody managed to call `collection[0]`.


From victor.stinner at haypocalc.com Sun Oct 2 13:59:36 2011
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sun, 2 Oct 2011 13:59:36 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
References: <20111001201320.567074bb@pitrou.net> <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
Message-ID: <201110021359.36994.victor.stinner@haypocalc.com>

Le dimanche 2 octobre 2011 07:13:29, Raymond Hettinger a écrit :
> The decision to make __contains__ work whenever __iter__ is defined
> probably goes back to Py2.2. That seems to have worked out well
> for most users, so I don't see a reason to change that now.

It is surprising in StringIO, so it should be fixed in IOBase, but not in Python.

Victor


From solipsis at pitrou.net Sun Oct 2 14:07:48 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 2 Oct 2011 14:07:48 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info>
Message-ID: <20111002140748.0e0f818c@pitrou.net>

On Sat, 1 Oct 2011 22:30:21 -0400 Nick Coghlan wrote:
> On Sat, Oct 1, 2011 at 9:21 PM, Chris Rebert wrote:
> > Actually, my suggestion was just that Sequence is one possible (but
> > clean) way to obtain the behavior; one could *of course* reimplement
> > the functionality without recourse to Sequence if they desired.
>
> But why would forcing everyone implementing standard sequences to
> reimplement the wheel be an improvement over the status quo?

You don't reinvent the wheel if you accept to inherit from abc.Sequence. Similarly, if you want the IO stack to provide default implementations of some methods, you have to inherit from XXXIOBase. If you don't want to implement all 6 ordered comparison operators, you have to use the functools.total_ordering decorator (and this one has a bug).

Regards

Antoine.


From ncoghlan at gmail.com Sun Oct 2 14:48:33 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 08:48:33 -0400
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Sun, Oct 2, 2011 at 4:16 AM, Stephen J. Turnbull wrote:
> surely the FUNC above ...
>
> >             """DOC"""
> >             nonlocal NAME
> >             BODY
> >         return FUNC
>
> ... doesn't need to be the *same identifier* as the FUNC below?
>
> >     FUNC = <outer>()
>
> Isn't magic needed solely to inject the nonlocal statement(s) into the
> definition of FUNC inside <outer> at compile-time?

Well, having 'FUNC' the same from the compiler's point of view is also necessary to get introspection to work properly (i.e. FUNC.__name__ == 'FUNC'). But yeah, the fundamental challenge lies in telling the compiler to change the way it binds and references certain names (like global and nonlocal declarations) while also being able to initialise them at function definition time (like default arguments).

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com Sun Oct 2 15:38:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 09:38:07 -0400
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <1317536975.12256.168.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317536975.12256.168.camel@Gutsy>
Message-ID: 

On Sun, Oct 2, 2011 at 2:29 AM, Ron Adam wrote:
> On Sat, 2011-10-01 at 22:11 -0400, Nick Coghlan wrote:
>
> +1 on all of the zen statements of course.
>
> I think you made a fine case for being careful and mindful about this
> stuff. :-)

Heh, even if nothing else comes out of these threads, I can be happy with helping others to learn how to look at this kind of question from multiple angles without getting too locked in to one point of view (and getting more practice at doing so, myself, of course!)

> One way to think of this is, Private, Shared, and Public, name spaces.
> Private and Public are locals and globals, and are pretty well
> supported, but Shared name spaces, (closures or otherwise) are not well
> supported.
>
> I think the whole concept of explicit shared name spaces, separate from
> globals and locals is quite important and should be done carefully. I
> don't think it is just about one or two use-cases that a small tweak
> will cover.

"not well supported" seems a little too harsh in the post PEP 3104 'nonlocal' declaration era. If we look at the full suite of typical namespaces in Python, we currently have the following (note that read/write and read-only refer to the name bindings themselves - mutable objects can obviously still be modified for a reference that can't be rebound):

Locals: naturally read/write

Function state variables (aka default argument values): naturally read-only, very hard to rebind since this namespace is completely anonymous in normal usage

Lexically scoped non-locals: naturally read-only, writable with nonlocal declaration

Module globals: within functions in module, naturally read-only, writable with global declaration. At module level, naturally read/write. From outside the module, naturally read/write via module object

Process builtins: naturally read-only, writable via "import builtins" and attribute assignment

Instance variables: in methods, naturally read/write via 'self' object

Class variables: in instance methods, naturally read-only, writable via 'type(self)' or 'self.__class__'. Naturally read/write in class methods via 'cls', 'klass' or 'class_' object.
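The lexically scoped entry in that list is the one the thread keeps returning to; a minimal runnable sketch of read-only capture versus nonlocal rebinding:

    def make_counter():
        n = 0                # shared state in the enclosing scope
        def step():
            nonlocal n       # without this line, 'n += 1' is an error
            n += 1
            return n
        return step

    step = make_counter()
    assert step() == 1 and step() == 2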
Of those, I would put lexical scoping, function state variables and class variables in the 'shared' category - they aren't as contained as locals and instance variables, but they aren't as easy to access as module globals and process builtins, either. The current discussion is about finding a syntax to bring function state variables on par with lexical scoping, such that default argument values are no longer such a unique case.

> How about a name space literal? i.e. a dictionary.
>
>     def global_counter(x) {n:0, lock=lock}:
>         with lock:
>             n += 1
>             yield n
>
> I think that looks better than dict(n=0, lock=lock). And when used as a
> repr for name spaces, it is more readable.

The "but it looks like a list" argument doesn't really hold any water for me. Parameter and argument lists look like tuples, too, but people figure out from context that they mean something different and permit different content.

I have some specific objections to the braces syntax, too:
- I believe the association with func.__dict__ would be too strong (since it's actually unrelated)
- braces and colons are a PITA to type compared to brackets and equals signs
- the LHS in a dictionary is an ordinary expression, here it's an unquoted name

[NAME=EXPR, NAME2=EXPR2] is clearly illegal as a list, so it must mean something else, perhaps something akin to what (NAME=EXPR, NAME2=EXPR2) would have meant in the immediately preceding parameter list (this intuition would be correct, since the two are closely related, differing only in the scope of any rebindings of the names in the function body). {NAME=EXPR, NAME2=EXPR2}, on the other hand, looks an awful lot like {NAME:EXPR, NAME2:EXPR2}, which would be an ordinary dict literal, and *not* particularly related to what the new syntax would mean.

> A literal would cover the default values use case quite nicely. A
> reference to a pre-defined dictionary would cover values shared between
> different functions independent of scope.

No, that can never work (it's akin to the old "from module import *" at function level, which used to disable fast locals but is now simply not allowed). The names for any shared state *must* be explicit in the syntax so that the compiler knows what they are. When that isn't adequate it's a sign that it's time to upgrade to a full class or closure.

>>     @contextmanager
>>     def counted() [active=collections.Counter(), lock=threading.RLock()]:
>
>> far more clearly conveys "this defines a context manager named
>> 'counted'" than the following does:
>
>>     @contextmanager
>>     @(active=collections.Counter(), lock=threading.RLock())
>>     def counted():
>
> Putting them after the function signature will result in more wrapped
> function signatures.

Agreed, but even there I think I prefer that outcome, since the more important information (name and signature) precedes the less important (the state variable initialisation). Worst case, someone can put their state in a named tuple or class instance to reduce the noise in the header line - state variables are about approaching a problem in a different way (i.e. algorithm more prominent than state) rather than about avoiding the use of structured data altogether.

> While it's very interesting to try to find a solution, I am also
> concerned about what this might mean in the long term. Particularly we
> will see more meta programming. Being able to initiate an object from
> one or more other objects can be very nice.
> Python does that sort of thing all over the place.

I'm not sure I understand what you mean in your use of the term 'meta-programming' here. The biggest danger to my mind is that we'll see more true process-level globals as state on top-level functions, and those genuinely *can* be problematic (but also very useful, which is why C has them). It's really no worse than class variables, though.

The other objection to further enhancing the power of functions to maintain state is that functions aren't naturally decomposable the way classes are - if an algorithm is written cleanly as methods on a class, then you can override just the pieces you need to modify while leaving the overall structure intact. For functions, it's much harder to do the same thing (hence generators, coroutines and things like the visitor pattern when walking data structures).

My main counters to those objections are that:

1. Any feature of this new proposal can already be done with explicit closures or the default argument hack. While usage may increase slightly with an officially blessed syntax, I don't expect that to happen to any great extent - I'm more hoping that over time, the default argument hack usages would get replaced

2. When an algorithm inevitably runs up against the practical limits of any new syntax, the full wealth of Python remains available for refactoring (e.g. by upgrading to a full class or closure)

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From stephen at xemacs.org Sun Oct 2 16:05:44 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 02 Oct 2011 23:05:44 +0900
Subject: [Python-ideas] Introspecting decorated functions? [was: Tweaking closures and lexical scoping to include the function being defined]
In-Reply-To: 
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <8739fbh4jr.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:
> On Sun, Oct 2, 2011 at 4:16 AM, Stephen J. Turnbull wrote:
> > Isn't magic needed solely to inject the nonlocal statement(s) into the
> > definition of FUNC inside <outer> at compile-time?
>
> Well, having 'FUNC' the same from the compiler's point of view is also
> necessary to get introspection to work properly (i.e. FUNC.__name__ ==
> 'FUNC').

Not only isn't that magic, but it doesn't currently work anyway, at least not for me in Python 3.2 (borrowing Ron's example):

    >>> def artdeco(y):
    ...     def _(func):
    ...         def wrapper(x):
    ...             return func(x, y)
    ...         return wrapper
    ...     return _
    ...
    >>> def baz(x, y):
    ...     return x + y
    ...
    >>> baz = artdeco(2)(baz)
    >>> baz.__name__
    'wrapper'

As expected, but Expedia Per Diem! and

    >>> @artdeco(3)
    ... def quux(x, y):
    ...     return y - x
    ...
    >>> quux.__name__
    'wrapper'

Woops! As for the "not magic" claim:

    >>> def dc(y):
    ...     def _(func):
    ...         def wrapper(x):
    ...             return func(x, y)
    ...         wrapper.__name__ = func.__name__
    ...         return wrapper
    ...     return _
    ...
    >>> @dc(1)
    ... def doom(x, y):
    ...     return x + y
    ...
    >>> doom.__name__
    'doom'
    >>>

Should a bug be filed, or is this already part of your "improved introspection for closures" proposal?
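For reference, the standard library already automates the manual __name__ copy in that last snippet; a sketch of the same decorator using functools.wraps:

    import functools

    def dc(y):
        def _(func):
            @functools.wraps(func)   # copies __name__, __doc__, etc.
            def wrapper(x):
                return func(x, y)
            return wrapper
        return _

    @dc(1)
    def doom(x, y):
        return x + y

    assert doom.__name__ == 'doom'   # no manual fix-up needed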
From steve at pearwood.info Sun Oct 2 16:10:33 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 03 Oct 2011 01:10:33 +1100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111002140748.0e0f818c@pitrou.net>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <20111002140748.0e0f818c@pitrou.net>
Message-ID: <4E8870D9.3080009@pearwood.info>

Antoine Pitrou wrote:
> On Sat, 1 Oct 2011 22:30:21 -0400 Nick Coghlan wrote:
>> On Sat, Oct 1, 2011 at 9:21 PM, Chris Rebert wrote:
>>> Actually, my suggestion was just that Sequence is one possible (but
>>> clean) way to obtain the behavior; one could *of course* reimplement
>>> the functionality without recourse to Sequence if they desired.
>>
>> But why would forcing everyone implementing standard sequences to
>> reimplement the wheel be an improvement over the status quo?
>
> You don't reinvent the wheel if you accept to inherit from abc.Sequence.

You shouldn't be forced to inherit from abc.Sequence to implement the sequence protocols. The whole point of protocols is that you don't need inheritance to make them work, let alone buy into the ABC mindset.

If you have a class that you don't want to be iterable but otherwise obeys the iteration protocol, that is easy to fix: have __iter__ raise TypeError. An easy fix for an unusual and trivial problem.

    >>> class Test:
    ...     def __getitem__(self, i):
    ...         return i
    ...     def __iter__(self):
    ...         raise TypeError
    ...
    >>> for i in Test():
    ...     print(i)
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 5, in __iter__
    TypeError

--
Steven


From steve at pearwood.info Sun Oct 2 16:30:29 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 03 Oct 2011 01:30:29 +1100
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <0C59FB88-E30F-47CC-B7BA-665F960991C2@gmail.com>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317536975.12256.168.camel@Gutsy> <0C59FB88-E30F-47CC-B7BA-665F960991C2@gmail.com>
Message-ID: <4E887585.7090505@pearwood.info>

Carl Matthew Johnson wrote:
> On Oct 1, 2011, at 8:29 PM, Ron Adam wrote:
>> How about a name space literal? i.e. a dictionary.
>>
>>     def global_counter(x) {n:0, lock=lock}:
>>         with lock:
>>             n += 1
>>             yield n
>
> Yeah, but it would break this existing code:
>
>     from __future__ import braces

Is that an attempt to be funny? Because braces are already used for dictionary and set literals. The __future__ braces refers to braces as BEGIN ... END delimiters of code blocks, not any use of braces at all.

While we're throwing around colours for the bike-shed, the colour which seems to look best to me is:

    def global_counter(x, [n=0, lock=lock]):  # inside the parameter list
        ...

rather than

    def global_counter(x) [n=0, lock=lock]:  # outside the parameter list
        ...

but Ron's suggestion that we use dictionary syntax does seem intriguing.

--
Steven


From ncoghlan at gmail.com Sun Oct 2 16:36:46 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 10:36:46 -0400
Subject: [Python-ideas] Introspecting decorated functions? [was: Tweaking closures and lexical scoping to include the function being defined]
In-Reply-To: <8739fbh4jr.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp> <8739fbh4jr.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Sun, Oct 2, 2011 at 10:05 AM, Stephen J. Turnbull wrote:
> Should a bug be filed, or is this already part of your "improved
> introspection for closures" proposal?

There's a reason functools.wraps [1] exists (despite the way it works being an egregious hack) :)

But that's also the reason I've been careful in my examples of equivalent semantics to make sure the inner function name matches the eventually bound name in the outer scope - without overwriting metadata the way @wraps(f) does, duplicating information in the original source code is the only other way to get informative introspection results.

[1] http://docs.python.org/library/functools#functools.wraps

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From solipsis at pitrou.net Sun Oct 2 16:36:46 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 2 Oct 2011 16:36:46 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <20111002140748.0e0f818c@pitrou.net> <4E8870D9.3080009@pearwood.info>
Message-ID: <20111002163646.5a4755d4@pitrou.net>

On Mon, 03 Oct 2011 01:10:33 +1100 Steven D'Aprano wrote:
> Antoine Pitrou wrote:
> > On Sat, 1 Oct 2011 22:30:21 -0400 Nick Coghlan wrote:
> >> On Sat, Oct 1, 2011 at 9:21 PM, Chris Rebert wrote:
> >>> Actually, my suggestion was just that Sequence is one possible (but
> >>> clean) way to obtain the behavior; one could *of course* reimplement
> >>> the functionality without recourse to Sequence if they desired.
> >> But why would forcing everyone implementing standard sequences to
> >> reimplement the wheel be an improvement over the status quo?
> >
> > You don't reinvent the wheel if you accept to inherit from abc.Sequence.
>
> You shouldn't be forced to inherit from abc.Sequence to implement the
> sequence protocols. The whole point of protocols is that you don't
> need inheritance to make them work, let alone buy into the ABC mindset.

Again, nobody said you had to inherit from abc.Sequence. It just provides a convenience. If you prefer to implement everything by hand, then fine. You already have __reversed__(), index() and count() to write, so I'm not sure why __contains__() would be scary or annoying.


From ncoghlan at gmail.com Sun Oct 2 16:43:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 10:43:40 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <1E944270-5E3E-4F3A-873B-48C4A3C5205F@masklinn.net>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <1E944270-5E3E-4F3A-873B-48C4A3C5205F@masklinn.net>
Message-ID: 

On Sun, Oct 2, 2011 at 7:21 AM, Masklinn wrote:
>
> On 2011-10-02, at 03:01 , Nick Coghlan wrote:
>
>> On Sat, Oct 1, 2011 at 4:24 PM, Chris Rebert wrote:
>>> Requiring the explicit marking of a class as a sequence by inheriting
>>> from the Sequence ABC in order to get such default behavior "for free"
>>> seems quite reasonable. And having containment defined by default on
>>> potentially-infinite iterators seems unwise. +1 on the suggested
>>> removals.
>>
>> -1 to any removals - fallback protocols are the heart of duck-typing
>
> I very much disagree with this assertion. In fact I'd make the opposite
> one: fallback protocols are a perversion of duck-typing and only serve
> to make it less reliable and less predictable.
>
> In keeping with ducks, you were looking for one, didn't find anything
> which quacked or looked like a duck. The fallback protocol kicks in,
> you get something with feathers which you found near water and went
> "good enough". Now you drop it from 10000m because you're looking into
> the efficiency of a duck's flight starting airborne, and observe your
> quite dismayed penguin barreling towards the ground.
>
> A few seconds later, you find yourself not with additional experimental
> data but with a small indentation in the earth and a big mess all over
> it.

I love that imagery :)

However, it's the kind of situation that's part and parcel of duck typing - you try things and see if they work and the occasional penguin gets it in the neck. If that's inadequate for a given use case, you define an ABC and register only things you've already checked and found to behave correctly (although beware if you register Bird rather than FlyingBird - the penguins, emus and friends may still be in trouble at that point)

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com Sun Oct 2 16:55:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 10:55:40 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111002163646.5a4755d4@pitrou.net>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <20111002140748.0e0f818c@pitrou.net> <4E8870D9.3080009@pearwood.info> <20111002163646.5a4755d4@pitrou.net>
Message-ID: 

On Sun, Oct 2, 2011 at 10:36 AM, Antoine Pitrou wrote:
> Again, nobody said you had to inherit from abc.Sequence. It just
> provides a convenience. If you prefer to implement everything by hand,
> then fine. You already have __reversed__(), index() and count() to
> write, so I'm not sure why __contains__() would be scary or annoying.

The case can be made that index() and count() should be based on a len() style protocol rather than methods (for the same reasons that len() is a protocol rather than a direct method call). As for __reversed__, once again, it's optional and not needed for standard sequences that provide __len__ and an index based __getitem__:

    >>> class MySeq:
    ...     def __len__(self):
    ...         return 10
    ...     def __getitem__(self, index):
    ...         if 0 <= index < 10:
    ...             return index
    ...         raise IndexError(index)
    ...
    >>> seq = MySeq()
    >>> len(seq)
    10
    >>> list(seq)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> list(reversed(seq))
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

It's a judgement call as to how complex an interface can be before we decide what should be protocol based, what should be optional ABC based and what should require a specific concrete class. Strings are really the only class still in the last category. Several parts of the interpreter used to require real dictionaries, but those have been slowly culled over the years. Files are complex enough that an ABC hierarchy makes sense, but even there, many operations are defined that will accept anything implementing "enough" of the relevant IO methods rather than *requiring* that they be explicitly registered with the ABCs.

Collections, however, are fundamental enough that they should ideally be fully protocol based. The ABCs exist to formalise the APIs, but we shouldn't be taking fallbacks out of the rest of the interpreter just because we have a shiny new hammer to play with (in fact, the fear of that happening was one of the major objections to the introduction of a formal notion of ABCs in the first place).

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From ncoghlan at gmail.com Sun Oct 2 17:02:14 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 11:02:14 -0400
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <4E887585.7090505@pearwood.info>
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317536975.12256.168.camel@Gutsy> <0C59FB88-E30F-47CC-B7BA-665F960991C2@gmail.com> <4E887585.7090505@pearwood.info>
Message-ID: 

On Sun, Oct 2, 2011 at 10:30 AM, Steven D'Aprano wrote:
> While we're throwing around colours for the bike-shed, the colour which
> seems to look best to me is:
>
>     def global_counter(x, [n=0, lock=lock]):  # inside the parameter list
>         ...
>
> rather than
>
>     def global_counter(x) [n=0, lock=lock]:  # outside the parameter list
>         ...

The main problem I have with 'inside the parameter list' is the way it looks when there's only state and no arguments:

    def f([state=State()]):
        ...

One could be forgiven for thinking that f() accepts a single optional positional argument in that case. Separating the two, on the other hand, would keep things clear:

    def f() [state=State()]:
        ...

Cheers, Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia


From masklinn at masklinn.net Sun Oct 2 17:39:00 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 2 Oct 2011 17:39:00 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8870D9.3080009@pearwood.info>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <20111002140748.0e0f818c@pitrou.net> <4E8870D9.3080009@pearwood.info>
Message-ID: 

On 2011-10-02, at 16:10 , Steven D'Aprano wrote:
>
> If you have a class that you don't want to be iterable but otherwise obeys
> the iteration protocol, that is easy to fix: have __iter__ raise TypeError.
> An easy fix for an unusual and trivial problem.
Yes but you have to know it exists in the first place, and it is not obvious that `in` without `__contains__` will use `__iter__`, let alone that `iter()` without `__iter__` will use `__getitem__` (and I submit as my case study in not being obvious that *Antoine* was surprised by this behavior).


From jeanpierreda at gmail.com Sun Oct 2 18:39:59 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 2 Oct 2011 12:39:59 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E87BAD9.2070501@pearwood.info>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info>
Message-ID: 

> Since iteration over elements is at the heart of containment tests, the same
> reasoning applies to __contains__.

I was with you for the old-style sequence iteration API, but you've lost me here. I *can* imagine use-cases where "in" shouldn't work: pretty much any iterator. Doesn't it seem strange that `x in A` should succeed, but then `x in A` should fail?

Devin

On Sat, Oct 1, 2011 at 9:14 PM, Steven D'Aprano wrote:
> Chris Rebert wrote:
>
>>> the latter kicks in any time an object with no __iter__ and a __getitem__
>>> is tentatively iterated, I've made that error a few times with
>>> insufficiently defined dict-like objects finding themselves (rightly
>>> or wrongly) being iterated.
>>
>> Requiring the explicit marking of a class as a sequence by inheriting
>> from the Sequence ABC in order to get such default behavior "for free"
>> seems quite reasonable. And having containment defined by default on
>> potentially-infinite iterators seems unwise. +1 on the suggested
>> removals.
>
> These changes don't sound even close to reasonable to me. It seems to me
> that the OP is making a distinction that doesn't exist.
>
> If you can write this:
>
>     x = collection[0]; do_something_with(x)
>     x = collection[1]; do_something_with(x)
>     x = collection[2]; do_something_with(x)
>     # ... etc.
>
> then you can write it in a loop by hand:
>
>     i = -1
>     try:
>         while True:
>             i += 1
>             x = collection[i]
>             do_something_with(x)
>     except IndexError:
>         pass
>
> But that's just a for-loop in disguise. The for-loop protocol goes all the
> way back to Python 1.5 and surely even older. You should, and can, write
> this:
>
>     for x in collection:
>         do_something_with(x)
>
> Requiring collection to explicitly inherit from a Sequence ABC breaks duck
> typing and is anti-Pythonic.
>
> I can't comprehend a use-case where manually extracting collection[i] for
> sequential values of i should succeed, but doing it in a for-loop should
> fail. But if you have such a use-case, feel free to define __iter__ to raise
> an exception.
>
> Since iteration over elements is at the heart of containment tests, the same
> reasoning applies to __contains__.
>
> --
> Steven


From ron3200 at gmail.com Sun Oct 2 19:05:17 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 02 Oct 2011 12:05:17 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: 
References: <1317344868.2369.32.camel@Gutsy> <1317353726.2369.146.camel@Gutsy> <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp> <1317536975.12256.168.camel@Gutsy>
Message-ID: <1317575117.13548.75.camel@Gutsy>

On Sun, 2011-10-02 at 09:38 -0400, Nick Coghlan wrote:
> On Sun, Oct 2, 2011 at 2:29 AM, Ron Adam wrote:
> > On Sat, 2011-10-01 at 22:11 -0400, Nick Coghlan wrote:
> >
> > +1 on all of the zen statements of course.
> >
> > I think you made a fine case for being careful and mindful about this
> > stuff. :-)
>
> Heh, even if nothing else comes out of these threads, I can be happy
> with helping others to learn how to look at this kind of question from
> multiple angles without getting too locked in to one point of view
> (and getting more practice at doing so, myself, of course!)

Yes, that's a very zen way to look at it. +1

Keeping that larger picture in mind, while sorting through various smaller options is challenging. Hopefully in the end, the best solution, (which may include doing nothing), will be sorted out.

> > One way to think of this is, Private, Shared, and Public, name spaces.
> > Private and Public are locals and globals, and are pretty well
> > supported, but Shared name spaces, (closures or otherwise) are not well
> > supported.
> >
> > I think the whole concept of explicit shared name spaces, separate from
> > globals and locals is quite important and should be done carefully. I
> > don't think it is just about one or two use-cases that a small tweak
> > will cover.
>
> "not well supported" seems a little too harsh in the post PEP 3104
> 'nonlocal' declaration era.

I think the introspection tools you want will help. hmm... what about the vars() function? (I tend to forget that one)

    vars(...)
        vars([object]) -> dictionary

        Without arguments, equivalent to locals().
        With an argument, equivalent to object.__dict__.

Could we extend that to see closures and scope visible names?

> If we look at the full suite of typical
> namespaces in Python, we currently have the following (note that
> read/write and read-only refer to the name bindings themselves -
> mutable objects can obviously still be modified for a reference that
> can't be rebound):
>
> Locals: naturally read/write
>
> Function state variables (aka default argument values): naturally
> read-only, very hard to rebind since this namespace is completely
> anonymous in normal usage
>
> Lexically scoped non-locals: naturally read-only, writable with
> nonlocal declaration
>
> Module globals: within functions in module, naturally read-only,
> writable with global declaration. At module level, naturally
> read/write.
> From outside the module, naturally read/write via module object
>
> Process builtins: naturally read-only, writable via "import builtins"
> and attribute assignment
>
> Instance variables: in methods, naturally read/write via 'self' object
>
> Class variables: in instance methods, naturally read-only, writable
> via 'type(self)' or 'self.__class__'. Naturally read/write in class
> methods via 'cls', 'klass' or 'class_' object.
>
> Of those, I would put lexical scoping, function state variables and
> class variables in the 'shared' category - they aren't as contained as
> locals and instance variables, but they aren't as easy to access as
> module globals and process builtins, either.

I think it may be easier to classify them in terms of how they are stored.

Cell based name spaces:
    function locals
    function closures

Dictionary based name spaces:
    class attributes
    module globals
    builtins

If vars() could get closures, what exactly would it do and how would the output look? Would it indicate which were free variables, as opposed to cell variables?

[clipped literal parts, for now]

> > While it's very interesting to try to find a solution, I am also
> > concerned about what this might mean in the long term. Particularly we
> > will see more meta programming. Being able to initiate an object from
> > one or more other objects can be very nice. Python does that sort of
> > thing all over the place.
>
> I'm not sure I understand what you mean in your use of the term
> 'meta-programming' here. The biggest danger to my mind is that we'll
> see more true process-level globals as state on top-level functions,
> and those genuinely *can* be problematic (but also very useful, which
> is why C has them). It's really no worse than class variables, though.

I'm thinking of automated program generation. A programming language that has a lot of hard syntax without a way to do the same things in a dynamic way makes it harder to do that. You pretty much have to resort to exec and eval in those cases. Or avoid those features.

> The other objection to further enhancing the power of functions to
> maintain state is that functions aren't naturally decomposable the way
> classes are - if an algorithm is written cleanly as methods on a
> class, then you can override just the pieces you need to modify while
> leaving the overall structure intact. For functions, it's much harder
> to do the same thing (hence generators, coroutines and things like the
> visitor pattern when walking data structures).

I would like very much for functions to be a bit more decomposable.

> My main counters to those objections are that:
>
> 1. Any feature of this new proposal can already be done with explicit
> closures or the default argument hack. While usage may increase
> slightly with an officially blessed syntax, I don't expect that to
> happen to any great extent - I'm more hoping that over time, the
> default argument hack usages would get replaced

The part I like is that it removes a part of function signatures, which I think are already over extended. Although, only a small part.

> 2. When an algorithm inevitably runs up against the practical limits
> of any new syntax, the full wealth of Python remains available for
> refactoring (e.g. by upgrading to a full class or closure)

I agree, any solution should be compared to an alternative non-syntax way of doing it.

All very interesting...
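The earlier vars() question can be partly answered with what function objects already expose; a small sketch of what an extended vars() might gather (the example functions are invented):

    def outer():
        a, b = 1, 2
        def inner():
            return a + b
        return inner

    f = outer()
    # The free variable names and their cells are already paired up:
    names = f.__code__.co_freevars                   # ('a', 'b')
    values = [cell.cell_contents for cell in f.__closure__]
    assert dict(zip(names, values)) == {'a': 1, 'b': 2}

An extended vars() would presumably just merge a mapping like this into its output.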
Cheers, Ron


From guido at python.org Sun Oct 2 19:05:50 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 10:05:50 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
References: <20111001201320.567074bb@pitrou.net> <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
Message-ID: 

On Sat, Oct 1, 2011 at 10:13 PM, Raymond Hettinger wrote:
>
> On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote:
>
> I honestly didn't know we exposed such semantics, and I'm wondering if
> the functionality is worth the astonishment:
>
> Since both __iter__ and __contains__ are deeply tied to "in-ness",
> it isn't really astonishing that they are related.
> For many classes, if "any(elem==obj for obj in s)" is True,
> then "elem in s" will also be True.
> Conversely, it isn't unreasonable to expect this code to succeed:
>
>     for elem in s:
>         assert elem in s
>
> The decision to make __contains__ work whenever __iter__ is defined
> probably goes back to Py2.2. That seems to have worked out well
> for most users, so I don't see a reason to change that now.

+1

--
--Guido van Rossum (python.org/~guido)


From guido at python.org Sun Oct 2 19:28:13 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 10:28:13 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
Message-ID: 

On Sun, Oct 2, 2011 at 10:05 AM, Guido van Rossum wrote:
> On Sat, Oct 1, 2011 at 10:13 PM, Raymond Hettinger wrote:
>>
>> On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote:
>>
>> I honestly didn't know we exposed such semantics, and I'm wondering if
>> the functionality is worth the astonishment:
>>
>> Since both __iter__ and __contains__ are deeply tied to "in-ness",
>> it isn't really astonishing that they are related.
>> For many classes, if "any(elem==obj for obj in s)" is True,
>> then "elem in s" will also be True.
>> Conversely, it isn't unreasonable to expect this code to succeed:
>>
>>     for elem in s:
>>         assert elem in s
>>
>> The decision to make __contains__ work whenever __iter__ is defined
>> probably goes back to Py2.2. That seems to have worked out well
>> for most users, so I don't see a reason to change that now.
>
> +1

Correction, I read this the way Raymond meant it, not the way he wrote it, and hit Send too quickly. :-(

The problem here seems to be that collections/abc.py defines Iterable to have __iter__ but not __contains__, but the Python language defines the 'in' operator as trying __contains__ first, and if that is not defined, using __iter__.

This is not surprising given Python's history, but it does cause some confusion when one compares the ABCs with the actual behavior. I also think that the way the ABCs have it makes more sense -- for single-use iterables (like files) the default behavior of "in" exhausts the iterator which is costly and fairly useless.

Now, should we change "in" to only look for __contains__ and not fall back on __iter__? If we were debating Python 3's feature set I would probably agree with that, as a clean break with the past and a clear future. Since we're debating Python 3.3, however, I think we should just lay it to rest and use the fallback solution proposed: define __contains__ on files to raise TypeError, and leave the rest alone. Maybe make a note for Python 4.
Maybe add a recommendation to PEP 8 to always implement __contains__ if you implement __iter__. But let's not break existing code that depends on the current behavior -- we have better things to do than to break perfectly fine working code in a fit of pedantry.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Sun Oct 2 21:31:08 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 03 Oct 2011 06:31:08 +1100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
Message-ID: <4E88BBFC.7060705@pearwood.info>

Devin Jeanpierre wrote:
>> Since iteration over elements is at the heart of containment tests, the same
>> reasoning applies to __contains__.
>
> I was with you for the old-style sequence iteration API, but you've
> lost me here. I *can* imagine use-cases where "in" shouldn't work:
> pretty much any iterator. Doesn't it seem strange that `x in A` should
> succeed, but then `x in A` should fail?

No. That is implied by the iterator protocol, just like:

>>> A = iter([1, 2, 3, 4])
>>> sum(A)
10
>>> sum(A)
0

If somebody is surprised that walking over an iterator for *any* reason changes the state of the iterator, they really haven't thought things through.

--
Steven

From solipsis at pitrou.net  Sun Oct 2 21:57:43 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 2 Oct 2011 21:57:43 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
Message-ID: <20111002215743.68a3e1ab@pitrou.net>

On Mon, 03 Oct 2011 06:31:08 +1100 Steven D'Aprano wrote:
> Devin Jeanpierre wrote:
> >> Since iteration over elements is at the heart of containment tests, the same
> >> reasoning applies to __contains__.
> >
> > I was with you for the old-style sequence iteration API, but you've
> > lost me here. I *can* imagine use-cases where "in" shouldn't work:
> > pretty much any iterator. Doesn't it seem strange that `x in A` should
> > succeed, but then `x in A` should fail?
>
> No. That is implied by the iterator protocol, just like:
>
> >>> A = iter([1, 2, 3, 4])
> >>> sum(A)
> 10
> >>> sum(A)
> 0
>
> If somebody is surprised that walking over an iterator for *any* reason
> changes the state of the iterator, they really haven't thought things
> through.

Hello? The issue is not that walking over an iterator changes the state of the iterator, it is that "x in A" iterates over A at all.

From ron3200 at gmail.com  Sun Oct 2 22:24:01 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 02 Oct 2011 15:24:01 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
	<874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <1317587041.14397.22.camel@Gutsy>

On Sun, 2011-10-02 at 17:16 +0900, Stephen J.
Turnbull wrote:
> Nick Coghlan writes:
>
>  > It isn't quite - the name binding doesn't happen until *after* the
>  > decorator chain has been invoked, so the function is anonymous while
>  > the decorators are executing.
>
> As I understand the issue here, as far as the decorators are
> concerned, the reference passed by the decorator syntax should be
> enough to do any namespace manipulations that are possible in a
> (non-magic) decorator. Am I missing something?

I've managed to do it with a function, but it isn't pretty and isn't complete. It does work for simple cases.

But it isn't easy to do...

1. Creating a dummy function with a __closure__, and taking the parts of interest from it. (Requires exec to do it.)

2. Creating a new byte code object with the needed changes. (Hard to get right)

3. Create a new code object with the altered pieces.

4. Make a new function with the new code object and __closure__ attribute. Use the original function to supply all the other parts.

What you get is a function that can replace the old one, but it's a lot of work. A compile time solution would be much better (and faster), and that is why it boils down to either special syntax or a precompile decorator type solution.

If we could make the co_code object less dependent on the cell reference objects, then a dynamic run time solution becomes realistic. But that would take rethinking the byte code to cell relationships, and I don't think that is a near term option.

Cheers,
Ron

From phd at phdru.name  Sun Oct 2 22:23:55 2011
From: phd at phdru.name (Oleg Broytman)
Date: Mon, 3 Oct 2011 00:23:55 +0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E88BBFC.7060705@pearwood.info>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
Message-ID: <20111002202355.GA20184@iskra.aviel.ru>

On Mon, Oct 03, 2011 at 06:31:08AM +1100, Steven D'Aprano wrote:
> Devin Jeanpierre wrote:
> >> Since iteration over elements is at the heart of containment tests, the same
> >> reasoning applies to __contains__.
> >
> > I was with you for the old-style sequence iteration API, but you've
> > lost me here. I *can* imagine use-cases where "in" shouldn't work:
> > pretty much any iterator. Doesn't it seem strange that `x in A` should
> > succeed, but then `x in A` should fail?
>
> No. That is implied by the iterator protocol

This is exactly the issue that is being discussed. In my very humble opinion classes that produce non-restartable iterators should not allow containment tests to use iterators.

Oleg.
--
Oleg Broytman  http://phdru.name/  phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From guido at python.org  Sun Oct 2 22:25:46 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 13:25:46 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111002215743.68a3e1ab@pitrou.net>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002215743.68a3e1ab@pitrou.net>
Message-ID:

On Sun, Oct 2, 2011 at 12:57 PM, Antoine Pitrou wrote:
> Hello?
> The issue is not that walking over an iterator changes the state of the
> iterator, it is that "x in A" iterates over A at all.

Hello? That is the original definition of "in".
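A concrete illustration of that behavior (added here for clarity; it was not in the message above): a class defining __iter__ but no __contains__ still supports 'in', and applying 'in' to a one-shot iterator silently consumes it.

class Bag:
    def __init__(self, items):
        self.items = list(items)
    def __iter__(self):
        return iter(self.items)

b = Bag([1, 2, 3])
print(2 in b)     # True  -- no __contains__, so 'in' falls back to iteration
print(9 in b)     # False -- a fresh iterator is taken from b each time

it = iter([1, 2, 3, 4])
print(2 in it)    # True  -- but this consumed the elements 1 and 2
print(list(it))   # [3, 4] -- the iterator cannot be rewound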
--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Sun Oct 2 22:37:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 2 Oct 2011 16:37:32 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111002202355.GA20184@iskra.aviel.ru>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002202355.GA20184@iskra.aviel.ru>
Message-ID:

On Sun, Oct 2, 2011 at 4:23 PM, Oleg Broytman wrote:
> This is exactly the issue that is being discussed. In my very humble
> opinion classes that produce non-restartable iterators should not allow
> containment tests to use iterators.

And that is the part that should probably be explicitly called out as advice in PEP 8 (and perhaps in the docs themselves): iterators (as opposed to iterables) should likely override __contains__ to raise TypeError, and non-container iterables (like IO objects) should likely also be set to raise TypeError if containment tests are not well-defined for the type.

Whether we adopt that advice in the standard library will need to be judged on a case by case basis, since it *would* be a breach of backwards compatibility and thus may require a DeprecationWarning period.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From arnodel at gmail.com  Sun Oct 2 23:32:05 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Sun, 2 Oct 2011 22:32:05 +0100
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: <1317587041.14397.22.camel@Gutsy>
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
	<874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
	<1317587041.14397.22.camel@Gutsy>
Message-ID:

On 2 October 2011 21:24, Ron Adam wrote:
> On Sun, 2011-10-02 at 17:16 +0900, Stephen J. Turnbull wrote:
>> Nick Coghlan writes:
>>
>>  > It isn't quite - the name binding doesn't happen until *after* the
>>  > decorator chain has been invoked, so the function is anonymous while
>>  > the decorators are executing.
>>
>> As I understand the issue here, as far as the decorators are
>> concerned, the reference passed by the decorator syntax should be
>> enough to do any namespace manipulations that are possible in a
>> (non-magic) decorator. Am I missing something?
>
> I've managed to do it with a function, but it isn't pretty and isn't
> complete. It does work for simple cases.
>
> But it isn't easy to do...
>
> 1. Creating a dummy function with a __closure__, and taking the parts of
> interest from it. (Requires exec to do it.)
>
> 2. Creating a new byte code object with the needed changes. (Hard to get
> right)
>
> 3. Create a new code object with the altered pieces.
>
> 4. Make a new function with the new code object and __closure__
> attribute. Use the original function to supply all the other parts.

And you've got to take care of nested functions. That is, look for code objects in the co_consts attribute of the function's code objects and apply the same transformation (recursively).
Moreover, you have to modify (recursively) all the code objects so that any MAKE_FUNCTION is changed to MAKE_CLOSURE, which involves inserting into the bytecode a code sequence to build the tuple of free variables of the closure on the stack. At this point it may be almost easier to write a general purpose code object to source code translator, stick a nonlocal declaration at the start of the source of the function, wrap it in an outer def and recompile the whole thing!

A simple example to illustrate:

def foo():
    def bar():
        return x + y
    return bar

This compiles to:

  0 LOAD_CONST               1 (<code object bar>)
  3 MAKE_FUNCTION            0
  6 STORE_FAST               0 (bar)
  9 LOAD_FAST                0 (bar)
 12 RETURN_VALUE

Now imagine we have the non-magic "nonlocal" decorator:

@nonlocal(x=27, y=15)
def foo():
    def bar():
        return x + y
    return bar

That should compile to something like this:

  0 LOAD_CLOSURE             0 (y)
  3 LOAD_CLOSURE             1 (x)
  6 BUILD_TUPLE              2
  9 LOAD_CONST               3 (<code object bar>)
 12 MAKE_CLOSURE             0
 15 STORE_FAST               0 (bar)
 18 LOAD_FAST                0 (bar)
 21 RETURN_VALUE

And obviously the "bar" code object needs to be adjusted (LOAD_GLOBAL -> LOAD_DEREF).

--
Arnaud

From greg.ewing at canterbury.ac.nz  Mon Oct 3 00:00:46 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 11:00:46 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
Message-ID: <4E88DF0E.9080207@canterbury.ac.nz>

Masklinn wrote:
> since Python calls the exact same method, if you somehow do a
> containment check (or an iteration) of a simple k:v collection
> instead of getting a clear exception about a missing `__iter__`
> or `__contains__` you get a not-very-informative `KeyError: 0`
> 3 or 4 levels down the stack, and now have to hunt how in hell's
> name somebody managed to call `collection[0]`.

Iterators are best thought of as temporary, short-lived objects that you create when you need them and use them while they're fresh. Passing an iterator to something that is not explicitly documented as being designed for an iterator, as opposed to an iterable, is asking for trouble.

It was probably a mistake not to make a clearer distinction between iterables and iterators back when the iterator protocol was designed, but we're stuck with it now.

Note that this kind of problem is less likely to occur in Py3, because methods such as dict.keys() and dict.items() now return iterable views rather than iterators, so you can iterate over them multiple times without any trouble.

I think this is also a good design pattern to follow when creating your own iteration-capable objects. In other words, don't write methods that return iterators directly; instead, return another object with an __iter__ method.

--
Greg

From tjreedy at udel.edu  Mon Oct 3 05:09:05 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 02 Oct 2011 23:09:05 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002215743.68a3e1ab@pitrou.net>
Message-ID:

On 10/2/2011 4:25 PM, Guido van Rossum wrote:
> On Sun, Oct 2, 2011 at 12:57 PM, Antoine Pitrou wrote:
>> Hello?
>> The issue is not that walking over an iterator changes the state of the
>> iterator, it is that "x in A" iterates over A at all.
>
> Hello?
> That is the original definition of "in".

I had the same reaction as Guido. Iteration is the *only* generic way to tell if an item is in a sequence or other collection. The direct hash access of sets and dicts is exceptional. The direct calculation for range is a different exception. For the other builtin sequences, and for typical iterators, which lack the information for an O(1) shortcut, 'in' (and .__contains__ if present) has to be iteration based.

--
Terry Jan Reedy

From tjreedy at udel.edu  Mon Oct 3 05:20:30 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 02 Oct 2011 23:20:30 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <20111002202355.GA20184@iskra.aviel.ru>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002202355.GA20184@iskra.aviel.ru>
Message-ID:

On 10/2/2011 4:23 PM, Oleg Broytman wrote:
>> No. That is implied by the iterator protocol
>
> This is exactly the issue that is being discussed. In my very humble
> opinion classes that produce non-restartable iterators should not allow
> containment tests to use iterators.

I think this is backwards. Functions that take a generic iterable as input should call iter() on the input *once* and use it to iterate just once, which means not using 'in'. Functions that need a re-iterable should check for the presence of .__next__ (and reject such inputs). That will exclude file objects and other iterators.

--
Terry Jan Reedy

From tjreedy at udel.edu  Mon Oct 3 05:45:19 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 02 Oct 2011 23:45:19 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
Message-ID:

On 10/2/2011 1:28 PM, Guido van Rossum wrote:
> On Sun, Oct 2, 2011 at 10:05 AM, Guido van Rossum wrote:
>> On Sat, Oct 1, 2011 at 10:13 PM, Raymond Hettinger
>> wrote:
>>>
>>> On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote:
>>>
>>> I honestly didn't know we exposed such semantics, and I'm wondering if
>>> the functionality is worth the astonishment:
>>>
>>> Since both __iter__ and __contains__ are deeply tied to "in-ness",
>>> it isn't really astonishing that they are related.
>>> For many classes, if "any(elem==obj for obj in s)" is True,
>>> then "elem in s" will also be True.
>>> Conversely, it isn't unreasonable to expect this code to succeed:
>>>     for elem in s:
>>>         assert elem in s
>>> The decision to make __contains__ work whenever __iter__ is defined
>>> probably goes back to Py2.2. That seems to have worked out well
>>> for most users, so I don't see a reason to change that now.
>>
>> +1
>
> Correction, I read this the way Raymond meant it, not the way he wrote
> it, and hit Send too quickly. :-(
>
> The problem here seems to be that collections/abc.py defines Iterable
> to have __iter__ but not __contains__, but the Python language defines
> the 'in' operator as trying __contains__ first, and if that is not
> defined, using __iter__.
>
> This is not surprising given Python's history, but it does cause some
> confusion when one compares the ABCs with the actual behavior. I also
> think that the way the ABCs have it makes more sense -- for single-use
> iterables (like files) the default behavior of "in" exhausts the
> iterator which is costly and fairly useless.
>
> Now, should we change "in" to only look for __contains__ and not fall
> back on __iter__?
> If we were debating Python 3's feature set I would
> probably agree with that, as a clean break with the past and a clear
> future. Since we're debating Python 3.3, however, I think we should
> just lay it to rest and use the fallback solution proposed: define
> __contains__ on files to raise TypeError

That would break legitimate code that uses 'in file'. The following works as stated:

if 'START\n' in f:
    for line in f:
        ...
else:
    ...

There would have to be a deprecation process. But see below.

> and leave the rest alone.
> Maybe make a note for Python 4. Maybe add a recommendation to PEP 8 to
> always implement __contains__ if you implement __iter__.

[Did you mean __next__?]

It seems to me better that functions that need a re-iterable non-iterator input should check for the absence of .__next__ to exclude *all* iterators, including file objects. There is no need to complicate our nice, simple, minimal iterator protocol.

if hasattr(reiterable, '__next__'):
    raise TypeError("non-iterator required")

> But let's not break existing code that depends on the current behavior
> -- we have better things to do than to break perfectly fine working
> code in a fit of pedantry.

--
Terry Jan Reedy

From tjreedy at udel.edu  Mon Oct 3 05:54:58 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 02 Oct 2011 23:54:58 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E88DF0E.9080207@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
	<4E88DF0E.9080207@canterbury.ac.nz>
Message-ID:

On 10/2/2011 6:00 PM, Greg Ewing wrote:

> It was probably a mistake not to make a clearer distinction
> between iterables and iterators back when the iterator
> protocol was designed, but we're stuck with it now.

It is extremely useful that iterators are iterables. The distinction needed between iterators and reiterable non-iterators is easy:

def reiterable(iterable):
    return hasattr(iterable, '__iter__') and not hasattr(iterable, '__next__')

In a context where one is going to iterate (and call __iter__ if present) more than once, only the second check is needed. Functions that need a reiterable can make that check at the start to avoid a possibly obscure message attendant on failure of reiteration.

--
Terry Jan Reedy

From guido at python.org  Mon Oct 3 05:58:48 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 20:58:48 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<8A182379-B81C-4F43-810B-E139DA843E88@gmail.com>
Message-ID:

On Sun, Oct 2, 2011 at 8:45 PM, Terry Reedy wrote:
> On 10/2/2011 1:28 PM, Guido van Rossum wrote:
>> The problem here seems to be that collections/abc.py defines Iterable
>> to have __iter__ but not __contains__, but the Python language defines
>> the 'in' operator as trying __contains__ first, and if that is not
>> defined, using __iter__.
>>
>> This is not surprising given Python's history, but it does cause some
>> confusion when one compares the ABCs with the actual behavior. I also
>> think that the way the ABCs have it makes more sense -- for single-use
>> iterables (like files) the default behavior of "in" exhausts the
>> iterator which is costly and fairly useless.
>>
>> Now, should we change "in" to only look for __contains__ and not fall
>> back on __iter__?
>> If we were debating Python 3's feature set I would
>> probably agree with that, as a clean break with the past and a clear
>> future. Since we're debating Python 3.3, however, I think we should
>> just lay it to rest and use the fallback solution proposed: define
>> __contains__ on files to raise TypeError
>
> That would break legitimate code that uses 'in file'.
> The following works as stated:
>
> if 'START\n' in f:
>     for line in f:
>         ...
> else:
>     ...
>
> There would have to be a deprecation process. But see below.

Hm. That code sample looks rather artificial. (Though now that I have seen it I can't help thinking that it might fit the bill for somebody... :-)

>> and leave the rest alone.
>> Maybe make a note for Python 4. Maybe add a recommendation to PEP 8 to
>> always implement __contains__ if you implement __iter__.
>
> [Did you mean __next__?]

No, I really meant __iter__. Because in Python 4 I would be okay with not using a loop as a fallback if __contains__ doesn't exist. So if in Py3 "x in a" works by using __iter__, you would have to keep it working in Py4 by defining __contains__. And no, I don't expect Py4 within this decade...

> It seems to me better that functions that need a re-iterable non-iterator
> input should check for the absence of .__next__ to exclude *all* iterators,
> including file objects. There is no need to complicate our nice, simple,
> minimal iterator protocol.
>
> if hasattr(reiterable, '__next__'):
>     raise TypeError("non-iterator required")

That's a different issue -- you're talking about preventing bad use of __iter__ in the calling class. I was talking about supporting "in" by the defining class.

>> But let's not break existing code that depends on the current behavior
>> -- we have better things to do than to break perfectly fine working
>> code in a fit of pedantry.

Still, most people in this thread seem to agree that "x in file" works by accident, not by design, and is more likely to do harm than good, and many have in fact proposed various more serious ways of making it not work in (I presume) Py3.3.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Oct 3 06:04:04 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 21:04:04 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
	<4E88DF0E.9080207@canterbury.ac.nz>
Message-ID:

On Sun, Oct 2, 2011 at 8:54 PM, Terry Reedy wrote:
> On 10/2/2011 6:00 PM, Greg Ewing wrote:
>
>> It was probably a mistake not to make a clearer distinction
>> between iterables and iterators back when the iterator
>> protocol was designed, but we're stuck with it now.
>
> It is extremely useful that iterators are iterables.

And I can assure you that this was no coincidence, mistake or accident. It was done very deliberately.

> The distinction needed
> between iterators and reiterable non-iterators is easy:
>
> def reiterable(iterable):
>     return hasattr(iterable, '__iter__') and not hasattr(iterable, '__next__')
>
> In a context where one is going to iterate (and call __iter__ if present)
> more than once, only the second check is needed. Functions that need a
> reiterable can make that check at the start to avoid a possibly obscure
> message attendant on failure of reiteration.

Unfortunately most people just aren't going to learn rules like this.
(Gee, even experienced Python programmers can't explain the relationship between __eq__ and __hash__ properly.)

This is where ABCs would shine if they were used more pervasively -- you'd just assert that you had a Sequence (or a Collection or whatever) rather than having to make obscure hasattr checks for __dunder__ names. (And those hasattr checks aren't infallible. E.g. a class that defines __iter__ but not __next__ for its instances would incorrectly be accepted.)

--
--Guido van Rossum (python.org/~guido)

From greg.ewing at canterbury.ac.nz  Mon Oct 3 07:20:15 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 18:20:15 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002215743.68a3e1ab@pitrou.net>
Message-ID: <4E89460F.5010601@canterbury.ac.nz>

Terry Reedy wrote:
> I had the same reaction as Guido. Iteration is the *only* generic way to
> tell if an item is in a sequence or other collection.

I think the root cause of this problem is our rather cavalier attitude to the distinction between iterables and iterators. They're really quite different things, but we started out with the notion that "for x in stuff" should be equally applicable to both, and hence decided to give every iterator an __iter__ method that returns itself. By doing that, we made it impossible for any generic protocol function to reliably tell them apart.

If I were designing the iterator protocol over again, I think I would start by recognising that starting a new iteration and continuing with an existing one are very different operations, and that you almost always intend the former rather than the latter. So I would declare that "for x in stuff" always implies a *new* iteration, and devise another syntax for continuing an existing one, such as "for x from stuff". I would define iterables and iterators as disjoint categories of object, and give __iter__ methods only to iterables, not iterators.

However, at least until Py4k comes around, we're stuck with the present situation, which seems to include accepting that "x in y" will occasionally gobble an iterator that you were saving for later.

--
Greg

From guido at python.org  Mon Oct 3 07:22:37 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Oct 2011 22:22:37 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E89460F.5010601@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4E88BBFC.7060705@pearwood.info>
	<20111002215743.68a3e1ab@pitrou.net>
	<4E89460F.5010601@canterbury.ac.nz>
Message-ID:

On Sun, Oct 2, 2011 at 10:20 PM, Greg Ewing wrote:
> Terry Reedy wrote:
>
>> I had the same reaction as Guido. Iteration is the *only* generic way to
>> tell if an item is in a sequence or other collection.
>
> I think the root cause of this problem is our rather cavalier
> attitude to the distinction between iterables and iterators.
> They're really quite different things, but we started out with
> the notion that "for x in stuff" should be equally applicable
> to both, and hence decided to give every iterator an __iter__
> method that returns itself. By doing that, we made it
> impossible for any generic protocol function to reliably tell
> them apart.
> If I were designing the iterator protocol over again, I think
> I would start by recognising that starting a new iteration
> and continuing with an existing one are very different
> operations, and that you almost always intend the former
> rather than the latter. So I would declare that
> "for x in stuff" always implies a *new* iteration, and
> devise another syntax for continuing an existing one,
> such as "for x from stuff". I would define iterables and
> iterators as disjoint categories of object, and give __iter__
> methods only to iterables, not iterators.
>
> However, at least until Py4k comes around, we're stuck with
> the present situation, which seems to include accepting that
> "x in y" will occasionally gobble an iterator that you were
> saving for later.

I think that's a fine analysis.

--
--Guido van Rossum (python.org/~guido)

From ron3200 at gmail.com  Mon Oct 3 08:11:06 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 03 Oct 2011 01:11:06 -0500
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To:
References: <1317344868.2369.32.camel@Gutsy>
	<1317353726.2369.146.camel@Gutsy>
	<1317360304.4082.22.camel@Gutsy>
	<20110930164535.GA2286@chopin.edu.pl>
	<20110930213231.GB3996@chopin.edu.pl>
	<20111001114428.GA3428@chopin.edu.pl>
	<1317488301.10154.47.camel@Gutsy>
	<877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
	<874nzrhkp0.fsf@uwakimon.sk.tsukuba.ac.jp>
	<1317587041.14397.22.camel@Gutsy>
Message-ID: <1317622266.16367.28.camel@Gutsy>

On Sun, 2011-10-02 at 22:32 +0100, Arnaud Delobelle wrote:
> On 2 October 2011 21:24, Ron Adam wrote:
> > But it isn't easy to do...
> >
> > 1. Creating a dummy function with a __closure__, and taking the parts of
> > interest from it. (Requires exec to do it.)
> >
> > 2. Creating a new byte code object with the needed changes. (Hard to get
> > right)
> >
> > 3. Create a new code object with the altered pieces.
> >
> > 4. Make a new function with the new code object and __closure__
> > attribute. Use the original function to supply all the other parts.
>
> And you've got to take care of nested functions. That is, look for
> code objects in the co_consts attribute of the function's code objects
> and apply the same transformation (recursively). Moreover, you have
> to modify (recursively) all the code objects so that any MAKE_FUNCTION
> is changed to MAKE_CLOSURE, which involves inserting into the bytecode
> a code sequence to build the tuple of free variables of the closure on
> the stack. At this point it may be almost easier to write a general
> purpose code object to source code translator, stick a nonlocal
> declaration at the start of the source of the function, wrap it in an
> outer def and recompile the whole thing!
>
> A simple example to illustrate:

...

> Now imagine we have the non-magic "nonlocal" decorator:
>
> @nonlocal(x=27, y=15)
> def foo():
>     def bar():
>         return x + y
>     return bar

...

I got this much to work:

def foo():
    @nonlocal(x=27, y=15)
    def bar():
        return x + y
    return bar

But nothing with nested functions. It sort of makes me want pure functions with no scope or closures by default.

Cheers,
Ron

From greg.ewing at canterbury.ac.nz  Mon Oct 3 09:22:14 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 20:22:14 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
	<4E88DF0E.9080207@canterbury.ac.nz>
Message-ID: <4E8962A6.5020708@canterbury.ac.nz>

Guido van Rossum wrote:
> And I can assure you that this was no coincidence, mistake or
> accident. It was done very deliberately.

I know, and I understand why it seemed like a good idea at the time. It's just that my own experiences since then have led me to think that a different choice might have worked out better.

Consuming an iterator is something you really don't want to do accidentally, just like you don't want to accidentally do anything else that changes the internal state of an object. The current design makes it all too easy to do just that.

Passing non-reiterable objects around is not something that I think should be encouraged. Ideally, one would hardly ever see a bare iterator -- they should be like virtual particles, coming into existence when needed, performing their function and then disappearing before anyone notices they're there.

I think Py3 is heading in the right direction with things like dict.keys() returning iterable views instead of iterators. Generally, we should strive to make reiterables easier to obtain and non-reiterables harder to obtain.

Maybe when we've been doing that for long enough, we'll be in a position to make fallback to iteration work only for reiterables without breaking too much code.

--
Greg

From greg.ewing at canterbury.ac.nz  Mon Oct 3 09:27:39 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 20:27:39 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To:
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
	<4E88DF0E.9080207@canterbury.ac.nz>
Message-ID: <4E8963EB.5020303@canterbury.ac.nz>

Terry Reedy wrote:
> The distinction
> needed between iterators and reiterable non-iterators is easy:
>
> def reiterable(iterable):
>     return hasattr(iterable, '__iter__') and not hasattr(iterable, '__next__')
>
> In a context where one is going to iterate (and call __iter__ if
> present) more than once, only the second check is needed. Functions that
> need a reiterable can make that check at the start to avoid a possibly
> obscure message attendant on failure of reiteration.

The trick will be getting people to recognise when they're requiring a reiterable, and to bother making this check when they are. Needing to perform any kind of type check on one's arguments is LBYL-ish and not very Pythonic.

--
Greg

From aquavitae69 at gmail.com  Mon Oct 3 09:52:12 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 3 Oct 2011 09:52:12 +0200
Subject: [Python-ideas] Default return values to int and float
Message-ID:

My idea is fairly simple: add a "default" argument to int and float, allowing a return value if the conversion fails. E.g:

>>> float('cannot convert this', default=0.0)
0.0

I think there are many use cases for this: every time float() or int() is called with data that cannot be guaranteed to be numeric, it has to be checked and some sort of default behaviour applied. The above example is just much cleaner than:

try:
    return float(s)
except ValueError:
    return 0.0

Any takers?
David

From pyideas at rebertia.com  Mon Oct 3 11:57:03 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Mon, 3 Oct 2011 02:57:03 -0700
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID:

On Mon, Oct 3, 2011 at 12:52 AM, David Townshend wrote:
> My idea is fairly simple: add a "default" argument to int and float,
> allowing a return value if the conversion fails. E.g:
>
>>>> float('cannot convert this', default=0.0)
> 0.0
>
> I think there are many use cases for this, every time float() or int()
> are called with data that cannot be guaranteed to be numeric, it has
> to be checked and some sort of default behaviour applied. The above
> example is just much cleaner than:
>
> try:
>     return float(s)
> except ValueError:
>     return 0.0

Important consideration: Would the default value be typechecked or not? (i.e. Does something like `float(s, {})` raise TypeError?) It's not uncommon to use None as the result value when the input is invalid, but not typechecking would then leave the door open to strangeness like my example. Or would None just be inelegantly special-cased, or...?

This could be one of those instances where Python is better off leaving people to write their own short one-off functions to get the /exact/ behavior desired in their individual circumstances.

Cheers,
Chris

From greg.ewing at canterbury.ac.nz  Mon Oct 3 12:40:04 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 23:40:04 +1300
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID: <4E899104.6080100@canterbury.ac.nz>

David Townshend wrote:
> My idea is fairly simple: add a "default" argument to int and float,
> allowing a return value if the conversion fails. E.g:
>
>>>> float('cannot convert this', default=0.0)

I think I'd be more likely to want to report an error to the user than to blindly return a default value.

If I did want to do this, I'd be happy to write my own function for it. It could even be made generic:

def convert(text, func, default):
    try:
        return func(text)
    except ValueError:
        return default

--
Greg

From masklinn at masklinn.net  Mon Oct 3 12:41:12 2011
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 3 Oct 2011 12:41:12 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID: <6C2285B8-4301-487C-B42B-D2B1922CD24C@masklinn.net>

-0 on proposal, no big judgement (although it might cause issues: if `int` and `float` can take a default, why not `dict` or `Decimal` as well?), but

On 2011-10-03, at 11:57 , Chris Rebert wrote:
>
> Or would None just be inelegantly special-cased, or...?

Why inelegantly? isinstance(default, (cls, types.NoneType)) is pretty elegant and clearly expresses the type constraint, which is an anonymous sum type[0]. Only issue is that Sphinx has no support for sum types for the moment.

[0] http://en.wikipedia.org/wiki/Sum_type

From _ at lvh.cc  Mon Oct 3 12:42:41 2011
From: _ at lvh.cc (Laurens Van Houtven)
Date: Mon, 3 Oct 2011 12:42:41 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <4E899104.6080100@canterbury.ac.nz>
References: <4E899104.6080100@canterbury.ac.nz>
Message-ID: <2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>

I don't think David is arguing for the default behavior to change -- merely that you get a dict.get style default. Kinda similar to getattr/2 raising AttributeError, and getattr/3 returning the default value.
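A quick illustration of that parallel (added here; the Config class is a made-up example): the 3-argument forms swallow the lookup failure and return the caller's default, while the 2-argument forms raise.

class Config:
    pass

c = Config()
print(getattr(c, 'depth', 10))   # 10 -- getattr/3 returns the default
print({'a': 1}.get('b', 0))      # 0  -- dict.get behaves the same way
# getattr(c, 'depth') would raise AttributeError, and
# {'a': 1}['b'] would raise KeyError -- the "no default" forms.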
cheers
lvh

On 03 Oct 2011, at 12:40, Greg Ewing wrote:

> David Townshend wrote:
>> My idea is fairly simple: add a "default" argument to int and float,
>> allowing a return value if the conversion fails. E.g:
>>>>> float('cannot convert this', default=0.0)
>
> I think I'd be more likely to want to report an error
> to the user than to blindly return a default value.
>
> If I did want to do this, I'd be happy to write my
> own function for it.
>
> It could even be made generic:
>
> def convert(text, func, default):
>     try:
>         return func(text)
>     except ValueError:
>         return default
>
> --
> Greg
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 1306 bytes
Desc: not available
URL:

From pyideas at rebertia.com  Mon Oct 3 12:49:25 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Mon, 3 Oct 2011 03:49:25 -0700
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <6C2285B8-4301-487C-B42B-D2B1922CD24C@masklinn.net>
References: <6C2285B8-4301-487C-B42B-D2B1922CD24C@masklinn.net>
Message-ID:

On Mon, Oct 3, 2011 at 3:41 AM, Masklinn wrote:
> -0 on proposal, no big judgement (although it might cause issues:
> if `int` and `float` can take a default, why not `dict` or
> `Decimal` as well?), but
> On 2011-10-03, at 11:57 , Chris Rebert wrote:
>>
>> Or would None just be inelegantly special-cased, or...?
> Why inelegantly?

There are use-cases (albeit relatively rare) for non-None null values. Special-casing NoneType would exclude those use-cases (which is an entirely reasonable trade-off option to choose).

Cheers,
Chris

From greg.ewing at canterbury.ac.nz  Mon Oct 3 12:54:23 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 03 Oct 2011 23:54:23 +1300
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>
References: <4E899104.6080100@canterbury.ac.nz>
	<2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>
Message-ID: <4E89945F.8010904@canterbury.ac.nz>

Laurens Van Houtven wrote:
> I don't think David is arguing for the default behavior to change --
> merely that you get a dict.get style default.

I know, but experience shows that the dict.get() default is very useful in practice. I'm skeptical that the proposed feature would -- or should -- be used often enough to justify complexifying the constructor signature of int et al.

The big difference as I see it is that, very often, failing to find something in a dict is not an error, but an entirely normal occurrence. On the other hand, passing something that isn't a valid int representation to int() is most likely the result of a user entering something nonsensical. In that case, the principle that "errors should not pass silently" applies.

What's worse, it could become an attractive nuisance, encouraging people to mask bad input rather than provide the user with appropriate feedback.
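To make that worry concrete, here is a small sketch (an illustration added here, not Greg's code): a blanket default masks genuinely bad data, while a helper that only defaults on empty input still reports it.

def float_or_default(s, default=0.0):
    # Lenient: any conversion failure, including garbage, yields the default.
    try:
        return float(s)
    except ValueError:
        return default

def float_or_none(s):
    # Strict: empty input maps to None, but garbage still raises ValueError.
    return float(s) if s else None

print(float_or_default('oops'))   # 0.0  -- the bad input passes silently
print(float_or_none(''))          # None -- a missing value is handled
# float_or_none('oops') raises ValueError, surfacing the bad input.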
--
Greg

From dirkjan at ochtman.nl  Mon Oct 3 13:07:29 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 3 Oct 2011 13:07:29 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <4E89945F.8010904@canterbury.ac.nz>
References: <4E899104.6080100@canterbury.ac.nz>
	<2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>
	<4E89945F.8010904@canterbury.ac.nz>
Message-ID:

On Mon, Oct 3, 2011 at 12:54, Greg Ewing wrote:
> The big difference as I see it is that, very often, failing
> to find something in a dict is not an error, but an entirely
> normal occurrence. On the other hand, passing something that
> isn't a valid int representation to int() is most likely
> the result of a user entering something nonsensical. In that
> case, the principle that "errors should not pass silently"
> applies.

Hmm, not really true in my experience. Here's some actual code from my codebase at work:

v = float(row[dat]) if row[dat] else 0.0
d.append(float(row[t]) if row[t] else 0.0)
gen = (float(i) if i != '.' else None for i in row[1:])
limits = [(float(i) if i != '.' else None) for i in ln[5:15]]
line[i] = (None if line[i] == '.' else float(line[i]))
ls.append(float(row[i]) if row[i] else None)
data[row['s']] = float(val) if '.' in val else int(val)
cur.append(float(ln[f]) if ln[f] else None)
cur.append(float(ln['DL']) if ln['DL'] else None)
pv = float(ln['PV']) if ln['PV'] else None
mgn = float(ln['MGN']) if ln['MGN'] else None
f = lambda x: float(x) if x else 1
data[sn] += float(row['PC']) if row['PC'] else 0.0, row['PCC']
ubsc = 1 if not row['CSCALE'] else float(row['CSCALE'])
scale = float(row['ESCALE']) if row['ESCALE'] else 1.0
efp = float(row['FSCALE']) if row['FSCALE'] else 1.0
convert = lambda x: float(x) if x else None

In other words, this happens a lot in code where you deal with data from a third party that you want to convert to some neater structure of Python objects (in cases where a null value occurs in that data, which I would suggest is fairly common out there in the Real World). Throwing a ValueError is usually not the right thing to do here, because you still want to use all the other data that you got even if one or two values are unavailable.

Converting two of the above examples:

pv = float(ln['PV']) if ln['PV'] else None
pv = float(ln['PV'], default=None)

d.append(float(row[t]) if row[t] else 0.0)
d.append(float(row[t], default=0.0))

It's a little shorter and seems easier to read to me (less repetition).

Cheers,
Dirkjan

From dickinsm at gmail.com  Mon Oct 3 14:06:48 2011
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 3 Oct 2011 13:06:48 +0100
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References: <4E899104.6080100@canterbury.ac.nz>
	<2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>
	<4E89945F.8010904@canterbury.ac.nz>
Message-ID:

On Mon, Oct 3, 2011 at 12:07 PM, Dirkjan Ochtman wrote:
> Converting two of the above examples:
>
> pv = float(ln['PV']) if ln['PV'] else None
> pv = float(ln['PV'], default=None)
>
> d.append(float(row[t]) if row[t] else 0.0)
> d.append(float(row[t], default=0.0))
>
> It's a little shorter and seems easier to read to me (less repetition).

But the two versions you give aren't equivalent. With:

pv = float(ln['PV']) if ln['PV'] else None

we'll get a ValueError if ln['PV'] contains some non-float, non-empty garbage value.
With:

pv = float(ln['PV'], default=None)

and (IIUC) the proposed semantics, that garbage value will be turned into None instead, which is definitely not what I'd want to happen in normal usage.

--
Mark

From dirkjan at ochtman.nl  Mon Oct 3 14:11:21 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 3 Oct 2011 14:11:21 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References: <4E899104.6080100@canterbury.ac.nz>
	<2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc>
	<4E89945F.8010904@canterbury.ac.nz>
Message-ID:

On Mon, Oct 3, 2011 at 14:06, Mark Dickinson wrote:
> But the two versions you give aren't equivalent. With:
>
> pv = float(ln['PV']) if ln['PV'] else None
>
> we'll get a ValueError if ln['PV'] contains some non-float, non-empty
> garbage value. With:
>
> pv = float(ln['PV'], default=None)
>
> and (IIUC) the proposed semantics, that garbage value will be turned
> into None instead, which is definitely not what I'd want to happen in normal usage.

Yeah, I guess you're right, and I'd definitely not want unexpected garbage values to go unnoticed.

Cheers,
Dirkjan

From aquavitae69 at gmail.com  Mon Oct 3 14:39:52 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 3 Oct 2011 14:39:52 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID:

> Important consideration: Would the default value be typechecked or
> not? (i.e. Does something like `float(s, {})` raise TypeError?)
> It's not uncommon to use None as the result value when the input is
> invalid, but not typechecking would then leave the door open to
> strangeness like my example.
> Or would None just be inelegantly special-cased, or...?
> This could be one of those instances where Python is better off
> leaving people to write their own short one-off functions to get the
> /exact/ behavior desired in their individual circumstances.

I would suggest not checking type for exactly that reason. One-off functions are fine if they are one-off, but in most cases where this behaviour is needed the one-off function is exactly the same in every case.

> I don't think David is arguing for the default behavior to change -- merely
> that you get a dict.get style default. Kinda similar to getattr/2 raising
> AttributeError, and getattr/3 returning the default value.

Yes, dict.get is exactly the sort of thing I was going for. I think that there are also a few other methods dotted throughout the stdlib that have an optional "default" argument like this, so this isn't really a new idea, it's only new as it applies to int and float.

> pv = float(ln['PV']) if ln['PV'] else None
> pv = float(ln['PV'], default=None)

I wouldn't implement it this way because of the problems already pointed out. I would use a try statement (as in my first example), which would be more robust, but which cannot be written as a one-liner. If you were to write your example in a series of try statements, it would end up four times longer and much less readable!

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From julian at grayvines.com  Mon Oct 3 15:35:49 2011
From: julian at grayvines.com (Julian Berman)
Date: Mon, 3 Oct 2011 09:35:49 -0400
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID:

> Hmm, not really true in my experience.
> Here's some actual code from my
> codebase at work:
>
> v = float(row[dat]) if row[dat] else 0.0
> d.append(float(row[t]) if row[t] else 0.0)
> gen = (float(i) if i != '.' else None for i in row[1:])
> limits = [(float(i) if i != '.' else None) for i in ln[5:15]]
> line[i] = (None if line[i] == '.' else float(line[i]))
> ls.append(float(row[i]) if row[i] else None)
> data[row['s']] = float(val) if '.' in val else int(val)
> cur.append(float(ln[f]) if ln[f] else None)
> cur.append(float(ln['DL']) if ln['DL'] else None)
> pv = float(ln['PV']) if ln['PV'] else None
> mgn = float(ln['MGN']) if ln['MGN'] else None
> f = lambda x: float(x) if x else 1
> data[sn] += float(row['PC']) if row['PC'] else 0.0, row['PCC']
> ubsc = 1 if not row['CSCALE'] else float(row['CSCALE'])
> scale = float(row['ESCALE']) if row['ESCALE'] else 1.0
> efp = float(row['FSCALE']) if row['FSCALE'] else 1.0
> convert = lambda x: float(x) if x else None
>
> In other words, this happens a lot in code where you deal with data
> from a third party that you want to convert to some neater structure
> of Python objects (in cases where a null value occurs in that data,
> which I would suggest is fairly common out there in the Real World).
> Throwing a ValueError is usually not the right thing to do here,
> because you still want to use all the other data that you got even if
> one or two values are unavailable.
>
> Converting two of the above examples:
>
> pv = float(ln['PV']) if ln['PV'] else None
> pv = float(ln['PV'], default=None)
>
> d.append(float(row[t]) if row[t] else 0.0)
> d.append(float(row[t], default=0.0))

This is one of the cases that I typically just use logical or for if I'm expecting some nonzero but false thing, which is reasonably readable.

v = float(row[t] or 0)

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fuzzyman at gmail.com  Mon Oct 3 15:55:20 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 3 Oct 2011 14:55:20 +0100
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID:

On 3 October 2011 08:52, David Townshend wrote:

> My idea is fairly simple: add a "default" argument to int and float,
> allowing a return value if the conversion fails. E.g:
>
> >>> float('cannot convert this', default=0.0)
> 0.0
>

Something similar to this is pretty common in other languages. For example .NET has System.Double.TryParse

http://msdn.microsoft.com/en-us/library/994c0zb1.aspx

The pattern there is equivalent to returning an extra result as well as the converted value - a boolean indicating whether or not the conversion succeeded (with the "converted value" being 0.0 where conversion fails). A Python version might look like:

success, value = float.parse('thing')
if success:
    ...

Part of the rationale for this approach in .NET is that exception handling is very expensive, so calling TryParse is much more efficient than catching the exception if parsing fails.

All the best,

Michael Foord

> I think there are many use cases for this, every time float() or int()
> are called with data that cannot be guaranteed to be numeric, it has
> to be checked and some sort of default behaviour applied. The above
> example is just much cleaner than:
>
> try:
>     return float(s)
> except ValueError:
>     return 0.0
>
> Any takers?
>
> David
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others

May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From massimo.dipierro at gmail.com  Mon Oct 3 16:10:48 2011
From: massimo.dipierro at gmail.com (Massimo Di Pierro)
Date: Mon, 3 Oct 2011 09:10:48 -0500
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID: <6A5F53DB-1FD1-4AAF-A966-5D5442ECA1F5@gmail.com>

+1

On Oct 3, 2011, at 8:55 AM, Michael Foord wrote:

> On 3 October 2011 08:52, David Townshend wrote:
> My idea is fairly simple: add a "default" argument to int and float,
> allowing a return value if the conversion fails. E.g:
>
> >>> float('cannot convert this', default=0.0)
> 0.0
>
> Something similar to this is pretty common in other languages. For example .NET has System.Double.TryParse
>
> http://msdn.microsoft.com/en-us/library/994c0zb1.aspx
>
> The pattern there is equivalent to returning an extra result as well as the converted value - a boolean indicating whether or not the conversion succeeded (with the "converted value" being 0.0 where conversion fails). A Python version might look like:
>
> success, value = float.parse('thing')
> if success:
>     ...
>
> Part of the rationale for this approach in .NET is that exception handling is very expensive, so calling TryParse is much more efficient than catching the exception if parsing fails.
>
> All the best,
>
> Michael Foord
>
> I think there are many use cases for this, every time float() or int()
> are called with data that cannot be guaranteed to be numeric, it has
> to be checked and some sort of default behaviour applied. The above
> example is just much cleaner than:
>
> try:
>     return float(s)
> except ValueError:
>     return 0.0
>
> Any takers?
>
> David
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
> --
> http://www.voidspace.org.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
>
> May you share freely, never taking more than you give.
> -- the sqlite blessing http://www.sqlite.org/different.html
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From massimo.dipierro at gmail.com  Mon Oct 3 16:14:26 2011
From: massimo.dipierro at gmail.com (Massimo Di Pierro)
Date: Mon, 3 Oct 2011 09:14:26 -0500
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References:
Message-ID: <6382D4CD-7E19-4894-8012-3EB891AC6561@gmail.com>

or

float("cannot convert this") or 0.0 if ValueError

i.e. map

x = [expression] or [value] if [exception]

into

try:
    x = [expression]
except [exception]:
    x = [value]

On Oct 3, 2011, at 8:55 AM, Michael Foord wrote:

> On 3 October 2011 08:52, David Townshend wrote:
> My idea is fairly simple: add a "default" argument to int and float,
> allowing a return value if the conversion fails.
> E.g:
>
> >>> float('cannot convert this', default=0.0)
> 0.0
>
> Something similar to this is pretty common in other languages. For example .NET has System.Double.TryParse
>
> http://msdn.microsoft.com/en-us/library/994c0zb1.aspx
>
> The pattern there is equivalent to returning an extra result as well as the converted value - a boolean indicating whether or not the conversion succeeded (with the "converted value" being 0.0 where conversion fails). A Python version might look like:
>
> success, value = float.parse('thing')
> if success:
>     ...
>
> Part of the rationale for this approach in .NET is that exception handling is very expensive, so calling TryParse is much more efficient than catching the exception if parsing fails.
>
> All the best,
>
> Michael Foord
>
> I think there are many use cases for this, every time float() or int()
> are called with data that cannot be guaranteed to be numeric, it has
> to be checked and some sort of default behaviour applied. The above
> example is just much cleaner than:
>
> try:
>     return float(s)
> except ValueError:
>     return 0.0
>
> Any takers?
>
> David
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Mon Oct 3 16:44:51 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Oct 2011 07:44:51 -0700
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8962A6.5020708@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net>
	<01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net>
	<4E87BAD9.2070501@pearwood.info>
	<4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net>
	<4E88DF0E.9080207@canterbury.ac.nz>
	<4E8962A6.5020708@canterbury.ac.nz>
Message-ID:

On Mon, Oct 3, 2011 at 12:22 AM, Greg Ewing wrote:
> Guido van Rossum wrote:
>
>> And I can assure you that this was no coincidence, mistake or
>> accident. It was done very deliberately.
>
> I know, and I understand why it seemed like a good idea at
> the time. It's just that my own experiences since then have
> led me to think that a different choice might have worked
> out better.
>
> Consuming an iterator is something you really don't want to
> do accidentally, just like you don't want to accidentally
> do anything else that changes the internal state of an
> object. The current design makes it all too easy to do
> just that.
>
> Passing non-reiterable objects around is not something that
> I think should be encouraged. Ideally, one would hardly ever
> see a bare iterator -- they should be like virtual particles,
> coming into existence when needed, performing their function
> and then disappearing before anyone notices they're there.
>
> I think Py3 is heading in the right direction with things
> like dict.keys() returning iterable views instead of iterators.
> Generally, we should strive to make reiterables easier to
> obtain and non-reiterables harder to obtain.
Actually I think there is at least *some* trend in the opposite
direction -- we now have a much more refined library (and even
vocabulary) for "iterator algebra" than before iter() was introduced,
and a subgroup of the community who can easily whip out clever ways to
do things by combining iterators in new ways.

> Maybe when we've been doing that for long enough, we'll be
> in a position to make fallback to iteration work only for
> reiterables without breaking too much code.

Maybe if we had introduced new syntax to iterate over a single-use
iterable from the start (*requiring* one form or the other depending
on whether you are iterating over a reiterable or not), we would live
in a slightly better world now, but I don't think there will ever be
a time when it's easy to introduce that distinction.

--
--Guido van Rossum (python.org/~guido)

From matt at whoosh.ca  Mon Oct 3 17:28:33 2011
From: matt at whoosh.ca (Matt Chaput)
Date: Mon, 03 Oct 2011 11:28:33 -0400
Subject: [Python-ideas] startsin ?
In-Reply-To: <4E867EF5.3080400@pearwood.info>
References: <4E85E9AC.8040401@whoosh.ca> <4E867EF5.3080400@pearwood.info>
Message-ID: <4E89D4A1.507@whoosh.ca>

On 30/09/2011 10:46 PM, Steven D'Aprano wrote:
> What you describe as "questionable methods" go back to the string
> module, and were made methods deliberately.

There's a difference between something being done deliberately and
something being a good idea.

> And so they should be.

Riiiight, I'm sure that swapcase() has come in handy for literally
zeros of people over the years.

> [pedant]

I'll say. If you've never heard a phrase something like "words that
end in 'ing' are often gerunds", or did and complained that it wasn't
"semantically correct", well...

Matt

From ron3200 at gmail.com  Mon Oct 3 18:40:02 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 03 Oct 2011 11:40:02 -0500
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E89460F.5010601@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4E88BBFC.7060705@pearwood.info> <20111002215743.68a3e1ab@pitrou.net> <4E89460F.5010601@canterbury.ac.nz>
Message-ID: <1317660002.17299.18.camel@Gutsy>

On Mon, 2011-10-03 at 18:20 +1300, Greg Ewing wrote:
> Terry Reedy wrote:
>
> So I would declare that "for x in stuff" always implies a *new*
> iteration, and devise another syntax for continuing an existing one,
> such as "for x from stuff".

I like that.  +1 for whatever future Python it can be put in.

Also, we currently don't have an InconclusiveException, which would
mean: it may be True or False, but I can't tell, so handle this
carefully.

But it seems to me that nondeterministic results make programmers
uncomfortable.  So I have a feeling that things like this would not be
popular.

Cheers, Ron

From ncoghlan at gmail.com  Mon Oct 3 19:03:43 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Oct 2011 13:03:43 -0400
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

On Mon, Oct 3, 2011 at 5:57 AM, Chris Rebert wrote:
> This could be one of those instances where Python is better off
> leaving people to write their own short one-off functions to get the
> /exact/ behavior desired in their individual circumstances.

+1

We get into similar discussions when it comes to higher order
itertools. Eventually, there are enough subtle variations that it
becomes a better option to let users write their own utility
functions.
In this case, there are at least 2 useful variants:

    def convert(target_type, obj, default):
        if obj:
            return target_type(obj)
        return default

    def try_convert(target_type, obj, default, ignored=(TypeError,)):
        try:
            return target_type(obj)
        except ignored:
            return default

Exceptions potentially ignored include TypeError, AttributeError and
ValueError. However, you may also want to include logging so that you
can go through the logs later to find data that needs cleaning.

Except even in Dirkjan's example code, we see cases that don't fit
either model (they're comparing against a *particular* value and
otherwise just calling float()).

The question is whether there is a useful alternative that is as
general purpose as getattr/3 (which ignores AttributeError) and
dict.get/2 (which ignores KeyError). However, I think there are too
many variations in conversion function APIs and the way they are used
for that to be of sufficiently general use - we'd be adding something
new that everyone has to learn, but far too much of the time they'd
have to do their own conversion anyway.

I'd sooner see a getitem/3 builtin that could be used to ignore any
LookupError the way dict.get/3 allows KeyError to be ignored.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From greg.ewing at canterbury.ac.nz  Mon Oct 3 22:52:28 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Oct 2011 09:52:28 +1300
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: <4E899104.6080100@canterbury.ac.nz> <2D1E945E-2700-4E1C-AD88-2EEA20BBB2E3@lvh.cc> <4E89945F.8010904@canterbury.ac.nz>
Message-ID: <4E8A208C.6080706@canterbury.ac.nz>

Dirkjan Ochtman wrote:

> Hmm, not really true in my experience. Here's some actual code from my
> codebase at work:
>
>     v = float(row[dat]) if row[dat] else 0.0
>     d.append(float(row[t]) if row[t] else 0.0)
>     gen = (float(i) if i != '.' else None for i in row[1:])

This is different. You're looking for a particular value (such as an
empty string or None) and treating it as equivalent to zero. That's
not what the OP suggested -- he wants *any* invalid string to return
the default value. That's analogous to using a bare except instead of
catching a particular exception.

--
Greg

From greg.ewing at canterbury.ac.nz  Mon Oct 3 23:07:51 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Oct 2011 10:07:51 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz>
Message-ID: <4E8A2427.7070603@canterbury.ac.nz>

Guido van Rossum wrote:

> Actually I think there is at least *some* trend in the opposite
> direction -- we now have a much more refined library (and even
> vocabulary) for "iterator algebra" than before iter() was introduced,
> and a subgroup of the community who can easily whip out clever ways to
> do things by combining iterators in new ways.

Hmmm, not sure what to do about that.

Maybe we should be thinking about a "reiterator algebra"
to sit on top of the iterator algebra.

For example, given two reiterables x and y, zip(x, y)
would return a reiterable that, when iterated over, would
extract iterators from x and y and return a corresponding
iterator.
--
Greg

From greg.ewing at canterbury.ac.nz  Mon Oct 3 23:28:03 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Oct 2011 10:28:03 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <1317660002.17299.18.camel@Gutsy>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4E88BBFC.7060705@pearwood.info> <20111002215743.68a3e1ab@pitrou.net> <4E89460F.5010601@canterbury.ac.nz> <1317660002.17299.18.camel@Gutsy>
Message-ID: <4E8A28E3.2090309@canterbury.ac.nz>

Ron Adam wrote:

> Also, we currently don't have an InconclusiveException, which would
> mean: it may be True or False, but I can't tell, so handle this
> carefully.

Um, what does that have to do with iterators?

--
Greg

From ncoghlan at gmail.com  Tue Oct 4 00:00:16 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Oct 2011 18:00:16 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8A2427.7070603@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz>
Message-ID: 

On Mon, Oct 3, 2011 at 5:07 PM, Greg Ewing wrote:
> Guido van Rossum wrote:
>
>> Actually I think there is at least *some* trend in the opposite
>> direction -- we now have a much more refined library (and even
>> vocabulary) for "iterator algebra" than before iter() was introduced,
>> and a subgroup of the community who can easily whip out clever ways to
>> do things by combining iterators in new ways.
>
> Hmmm, not sure what to do about that.
>
> Maybe we should be thinking about a "reiterator algebra"
> to sit on top of the iterator algebra.
>
> For example, given two reiterables x and y, zip(x, y)
> would return a reiterable that, when iterated over, would
> extract iterators from x and y and return a corresponding
> iterator.

Can't be done in general, since one of the main points of iterator
algebra is that it works with *infinite* iterators.

Basically, my understanding is that iterators started life as a way of
generalising various operations on containers. This is reflected in
the "for x in y: assert x in y" symmetry currently ensured by the
respective definitions of the two variants of 'in'.

Once the iterator protocol existed, though, people realised it made
possible certain things that containers can't do (such as operating on
theoretically infinite data sets or data sets that won't fit in RAM
all at once). The two domains are essentially disjoint (the former
assumes reiterability and the ability to load the whole data set into
memory, while the latter treats both of those assumptions as invalid),
but they share a protocol and syntax. Often, list(itr) is used to
ensure the first set of assumptions holds true, while the latter is
handled by carefully ensuring that supplied iterators are iterated
over only once, and by using tools like itertools.tee() to preserve
any state needed.

I'm not sure it's actually feasible to separate the two domains
cleanly at this late stage of the game, although the
collections.Container ABC may get us started down that path if we
explicitly register the various builtin containers with it (a
duck-typed check for __contains__ would break for classes that
explicitly raise TypeError, as is proposed for _BaseIO).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From arnodel at gmail.com  Tue Oct 4 00:02:54 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Mon, 3 Oct 2011 23:02:54 +0100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8A2427.7070603@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz>
Message-ID: 

On 3 October 2011 22:07, Greg Ewing wrote:
> Maybe we should be thinking about a "reiterator algebra"
> to sit on top of the iterator algebra.
>
> For example, given two reiterables x and y, zip(x, y)
> would return a reiterable that, when iterated over, would
> extract iterators from x and y and return a corresponding
> iterator.

I'm not sure I understand. Do you mean something like this snippet
below, but only in the case where the contents of "args" are all
"reiterable"?

>>> class Zip:
...     def __init__(self, *args):
...         self.args = args
...     def __iter__(self):
...         return zip(*self.args)
...
>>> z = Zip('ab', 'cd')
>>> list(z)
[('a', 'c'), ('b', 'd')]
>>> list(z)
[('a', 'c'), ('b', 'd')]

--
Arnaud

From masklinn at masklinn.net  Tue Oct 4 00:13:39 2011
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 4 Oct 2011 00:13:39 +0200
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz>
Message-ID: 

On 2011-10-04, at 00:00 , Nick Coghlan wrote:
>
> Basically, my understanding is that iterators started life as a way of
> generalising various operations on containers. This is reflected in
> the "for x in y: assert x in y" symmetry

Which interestingly enough will *not* work for iterables without
repeating elements.

From arnodel at gmail.com  Tue Oct 4 00:20:51 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Mon, 3 Oct 2011 23:20:51 +0100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz>
Message-ID: 

On 3 October 2011 23:00, Nick Coghlan wrote:
>> Maybe we should be thinking about a "reiterator algebra"
>> to sit on top of the iterator algebra.
>>
>> For example, given two reiterables x and y, zip(x, y)
>> would return a reiterable that, when iterated over, would
>> extract iterators from x and y and return a corresponding
>> iterator.
>
> Can't be done in general, since one of the main points of iterator
> algebra is that it works with *infinite* iterators.

But an iterable doesn't have to be finite to be reiterable:

>>> class A:
...     def __iter__(self):
...         return itertools.count()
...
>>> a = A()
>>> list(zip(a, range(10)))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]
>>> list(zip(a, range(10)))
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]

This makes me think that you could have a process of "lifting" an
iterator-making function like zip back to iterable-making:

>>> class deiter:
...     def __init__(self, f, *args, **kwargs):
...         self.f = f
...         self.args = args
...         self.kwargs = kwargs
...     def __iter__(self):
...         return iter(self.f(*self.args, **self.kwargs))
...
>>> z = deiter(zip, a, range(10))
>>> list(z)
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]
>>> list(z)
[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]

--
Arnaud

From zuo at chopin.edu.pl  Tue Oct 4 01:24:03 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Tue, 4 Oct 2011 01:24:03 +0200
Subject: [Python-ideas] getitem(obj, key, default) [was: Default return values to int and float]
In-Reply-To: 
References: 
Message-ID: <20111003232403.GA2262@chopin.edu.pl>

Nick Coghlan dixit (2011-10-03, 13:03):

> I'd sooner see a getitem/3 builtin
> that could be used to ignore any LookupError the way dict.get/3 allows
> KeyError to be ignored.

+1.

It's probably quite a common case.

Regards.
*j

From zuo at chopin.edu.pl  Tue Oct 4 01:47:04 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Tue, 4 Oct 2011 01:47:04 +0200
Subject: [Python-ideas] Tweaking closures and lexical scoping to include the function being defined
In-Reply-To: 
References: <1317360304.4082.22.camel@Gutsy> <20110930164535.GA2286@chopin.edu.pl> <20110930213231.GB3996@chopin.edu.pl> <20111001114428.GA3428@chopin.edu.pl> <1317488301.10154.47.camel@Gutsy> <877h4oh75c.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20111003234704.GB2262@chopin.edu.pl>

Nick Coghlan dixit (2011-10-01, 22:11):

> Now is better than never.
> Although never is often better than *right* now.
> - The status quo has served us well for a long time. If someone can
> come up with an elegant syntax, great, let's pursue it.

I believe that both syntax propositions:

    def spam(x, some, arguments, foo_bar=997) [variable=1, lock=Lock()]:
        nonlocal variable
        with lock:
            variable += x
        return variable + foo_bar

and

    @(variable=1, lock=Lock())
    def spam(x, some, arguments, foo_bar=997):
        nonlocal variable
        with lock:
            variable += x
        return variable + foo_bar

-- are quite elegant, and each of them would be a nice and useful
equivalent of:

    def _temp():
        variable = 1
        lock = Lock()
        def spam(x, some, arguments, foo_bar=997):
            nonlocal variable
            with lock:
                variable += x
            return variable + foo_bar
        return spam

    spam = _temp()
    del _temp

> Otherwise,
> this whole issue really isn't that important in the grand scheme of
> things (although a PEP to capture the current 'state of the art'
> thinking on the topic would still be nice - I believe Jan and Eric
> still plan to get to that once the discussion dies down again)

Yes, I already started creating such a summary, though this thread
changed a lot (at least in my mind -- about the subject). As I already
said, I cannot promise to do it quickly (because of other activities).

Cheers.
*j

From zuo at chopin.edu.pl  Tue Oct 4 02:09:29 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Tue, 4 Oct 2011 02:09:29 +0200
Subject: [Python-ideas] __contains__=None (was: __iter__ implies __contains__?)
In-Reply-To: 
References: <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz>
Message-ID: <20111004000929.GA3283@chopin.edu.pl>

Nick Coghlan dixit (2011-10-03, 18:00):

[...]
> duck-typed check for __contains__ would break for classes that
> explicitly raise TypeError, as is proposed for _BaseIO).

Maybe it should be possible to explicitly disallow the 'in' test by
setting __contains__ to None (similar to the already settled
__hash__=None for non-hashables).

Cheers.
*j

From ron3200 at gmail.com  Tue Oct 4 02:18:12 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 03 Oct 2011 19:18:12 -0500
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8A28E3.2090309@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4E88BBFC.7060705@pearwood.info> <20111002215743.68a3e1ab@pitrou.net> <4E89460F.5010601@canterbury.ac.nz> <1317660002.17299.18.camel@Gutsy> <4E8A28E3.2090309@canterbury.ac.nz>
Message-ID: <1317687492.19267.24.camel@Gutsy>

On Tue, 2011-10-04 at 10:28 +1300, Greg Ewing wrote:
> Ron Adam wrote:
>
> > Also, we currently don't have an InconclusiveException, which would
> > mean: it may be True or False, but I can't tell, so handle this
> > carefully.
>
> Um, what does that have to do with iterators?

It's in regard to the "in" operator, for the cases when "in" can't be
used on an iterator, either because it's infinite or because it would
cause undesirable side effects.

Also, in cases where an iterator is already partially consumed, a
failed test doesn't mean the value is not in the object being
iterated; it's just no longer in the iterator.

    >>> a = 'python'
    >>> 'p' in a
    True
    >>> 'p' in a
    True
    >>> b = iter(a)
    >>> 'p' in b
    True
    >>> 'p' in b
    False

Not really suggesting we have such an exception. For such a thing to
work, it would require adding more state information to iterators.
And of course there's nothing stopping anyone from writing their own
class, and exception, if that type of feature is useful for them.

Cheers,
Ron

From raymond.hettinger at gmail.com  Tue Oct 4 04:04:56 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 3 Oct 2011 22:04:56 -0400
Subject: [Python-ideas] getitem(obj, key, default) [was: Default return values to int and float]
In-Reply-To: <20111003232403.GA2262@chopin.edu.pl>
References: <20111003232403.GA2262@chopin.edu.pl>
Message-ID: <97D1FD2F-092F-47A3-9F85-799895091477@gmail.com>

On Oct 3, 2011, at 7:24 PM, Jan Kaliszewski wrote:
> Nick Coghlan dixit (2011-10-03, 13:03):
>
>> I'd sooner see a getitem/3 builtin
>> that could be used to ignore any LookupError the way dict.get/3 allows
>> KeyError to be ignored.
>
> +1.
>
> It's probably quite a common case.

How many times does this silly idea have to get shot down? Do you see
other languages implementing get defaults on sequences? Do you see
lots of Python users implementing this in a util module because it is
an important operation? Can you find examples of real-world code that
would be significantly improved with list.get() functionality? Does
this make any semantic sense to users (i.e. they specifically want the
i-th item of a sequence but don't even know how long the sequence is)?
Refuse to hypergeneralize dict.get() into a context where it doesn't
make sense (it does make sense for mappings, but not for sequences;
sequence indices are all about position, while mapping keys have a
deeper relationship to the corresponding values).

Raymond

From raymond.hettinger at gmail.com  Tue Oct 4 04:35:08 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 3 Oct 2011 22:35:08 -0400
Subject: [Python-ideas] __contains__=None (was: __iter__ implies __contains__?)
In-Reply-To: <20111004000929.GA3283@chopin.edu.pl>
References: <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz> <20111004000929.GA3283@chopin.edu.pl>
Message-ID: <7AD5B753-BF62-497E-9235-D4BCE64AC7C1@gmail.com>

On Oct 3, 2011, at 8:09 PM, Jan Kaliszewski wrote:

> Maybe it should be possible to explicitly disallow the 'in' test by
> setting __contains__ to None (similar to the already settled
> __hash__=None for non-hashables).

We need fewer arcane rules -- not more of them.

Raymond

From ncoghlan at gmail.com  Tue Oct 4 04:48:02 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Oct 2011 22:48:02 -0400
Subject: [Python-ideas] getitem(obj, key, default) [was: Default return values to int and float]
In-Reply-To: <97D1FD2F-092F-47A3-9F85-799895091477@gmail.com>
References: <20111003232403.GA2262@chopin.edu.pl> <97D1FD2F-092F-47A3-9F85-799895091477@gmail.com>
Message-ID: 

On Mon, Oct 3, 2011 at 10:04 PM, Raymond Hettinger wrote:
> Does this make any semantic sense to users (i.e. they specifically
> want the i-th item of a sequence but don't even know how long the
> sequence is)?

I still occasionally want to do it with sys.argv to implement optional
positional arguments before I remind myself to quit messing around
reinventing the wheel and just import argparse.

But yeah, being a better idea than the conversion function proposals
is a far cry from being a good idea :)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From raymond.hettinger at gmail.com  Tue Oct 4 04:54:02 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 3 Oct 2011 22:54:02 -0400
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8962A6.5020708@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz>
Message-ID: 

On Oct 3, 2011, at 3:22 AM, Greg Ewing wrote:

> Passing non-reiterable objects around is not something that
> I think should be encouraged.

Really? Passing around iterators is a basic design pattern
http://en.wikipedia.org/wiki/Iterator_pattern for lots of languages.
You may have a personal programming style that avoids iterators, but
that shouldn't be forced on the rest of the community.

The only way for a broad category of iterators to become reiterable
is for their outputs to be stored in memory (thus defeating the
just-in-time memory conserving property of many iterators).

Raymond
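For illustration of that memory tradeoff (a sketch added here for
context, not code from the thread): itertools.tee() is the standard
way to "replay" an iterator, and it works precisely by buffering the
iterator's output as it is consumed.

    import itertools

    it = iter(range(5))
    a, b = itertools.tee(it)   # the two copies share an internal buffer
    print(list(a))             # [0, 1, 2, 3, 4] -- buffered as 'a' consumes them
    print(list(b))             # [0, 1, 2, 3, 4] -- replayed from the buffer

Every item consumed by one copy but not yet by the other stays in the
buffer, so exhausting one copy before touching the other costs roughly
as much memory as list(it) would.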
From ncoghlan at gmail.com  Tue Oct 4 05:34:29 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 3 Oct 2011 23:34:29 -0400
Subject: [Python-ideas] __contains__=None (was: __iter__ implies __contains__?)
In-Reply-To: <7AD5B753-BF62-497E-9235-D4BCE64AC7C1@gmail.com>
References: <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8A2427.7070603@canterbury.ac.nz> <20111004000929.GA3283@chopin.edu.pl> <7AD5B753-BF62-497E-9235-D4BCE64AC7C1@gmail.com>
Message-ID: 

On Mon, Oct 3, 2011 at 10:35 PM, Raymond Hettinger wrote:
>
> On Oct 3, 2011, at 8:09 PM, Jan Kaliszewski wrote:
>
> > Maybe it should be possible to explicitly disallow the 'in' test by
> > setting __contains__ to None (similar to the already settled
> > __hash__=None for non-hashables).
>
> We need fewer arcane rules -- not more of them.

Indeed, __hash__ = None was a special case forced on us by the fact
that object defines __hash__ and the Hashable ABC was implemented to
use a duck typed instance check. The interpreter needed a way to
disable hashing when users defined a custom __eq__ implementation
without overriding __hash__ themselves.

I suspect isinstance(obj, collections.Container) is already an
adequate test to separate out "real" containers from mere iterators,
and ferreting out subtle distinctions like that where duck typing
isn't up to the task is one of the main reasons ABCs were added. (I
earlier indicated I didn't think that was the case yet, but I
subsequently realised that was due to my using isinstance() to check
things at the interactive prompt when I should have been using
issubclass())

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From cmjohnson.mailinglist at gmail.com  Tue Oct 4 06:00:01 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Mon, 3 Oct 2011 18:00:01 -1000
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com>

On Oct 3, 2011, at 7:03 AM, Nick Coghlan wrote:
>
>     def try_convert(target_type, obj, default, ignored=(TypeError,)):
>         try:
>             return target_type(obj)
>         except ignored:
>             return default

This reminds me of the string.index vs. string.find discussion we had
a while back. In basically any situation where an exception can be
raised, it's sometimes nice to return a None-like value and sometimes
nice to have an out-of-band exception. I have a certain amount of
admiration for the pattern in Go of returning (value, error) from
most functions that might have an error, but for Python as it is
today, there's no One Obvious Way to Do It yet, and there's probably
none forthcoming.

A slightly more generalized form of try_convert might be useful (see
below), but then again, we can't just pack every possible 5 line
function into the standard library?
> def catch(exception, f, *args, kwargs={}, default=None):
>     try:
>         return f(*args, **kwargs)
>     except exception:
>         return default

>>> catch(ValueError, "abc".index, "z", default="Not Found")
'Not Found'
>>> catch(ValueError, float, "zero", default=0.0)
0.0

From greg.ewing at canterbury.ac.nz  Tue Oct 4 06:59:26 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Oct 2011 17:59:26 +1300
Subject: [Python-ideas] getitem(obj, key, default) [was: Default return values to int and float]
In-Reply-To: <20111003232403.GA2262@chopin.edu.pl>
References: <20111003232403.GA2262@chopin.edu.pl>
Message-ID: <4E8A92AE.3030808@canterbury.ac.nz>

Jan Kaliszewski wrote:
> Nick Coghlan dixit (2011-10-03, 13:03):
>
>> I'd sooner see a getitem/3 builtin
>> that could be used to ignore any LookupError the way dict.get/3 allows
>> KeyError to be ignored.

Well, maybe. Some useful properties of dict.get() are that it avoids
the overhead of catching an exception, and there is no danger of
catching something coming from somewhere other than the indexing
operation itself. You wouldn't get those from a generic version.

--
Greg

From greg.ewing at canterbury.ac.nz  Tue Oct 4 08:30:32 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Oct 2011 19:30:32 +1300
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: 
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz>
Message-ID: <4E8AA808.9090308@canterbury.ac.nz>

Raymond Hettinger wrote:

> Really? Passing around iterators is a basic design pattern
> http://en.wikipedia.org/wiki/Iterator_pattern for lots of languages.

I'm not suggesting that we stop using iterators altogether, only that
reiterables are often preferable when there's a choice. Passing a
reiterable to a piece of third-party library code is safer and more
future-proof than passing an iterator, because it makes fewer
assumptions about what will be done to it.

There are certainly some objects that are inherently non-reiterable,
such as file objects reading from pipes or sockets, but there are many
others that *are* reiterable. Some of them, such as itertools.count(),
are currently only available as iterators, but could just as easily be
made available in a reiterable version. And the deiter() function
posted earlier shows that it's always possible to construct a
reiterable analogue of any iterator-algebra operator, provided you
have reiterable base objects to work with.

> You may have a personal programming style that avoids
> iterators, but that shouldn't be forced on the rest of the
> community.

I don't mean to force it, but to make it at least as easy to use a
reiterable-based style as an iterator-based one wherever it's
reasonably possible. Ideally, reiterables should be the most obvious
(in the Dutch sense) choice, with iterators being the
next-most-obvious choice for when you can't use reiterables.

> The only way for a broad category of iterators to become
> reiterable is for their outputs to be stored in memory

I'm not sure the category is as broad as all that. Note that it does
*not* include infinite iterables whose elements are generated by an
algorithm, such as itertools.count().
It doesn't even include disk files, however large they might be, since
you can in principle open another stream reading from the same file
(although the traditional way of manifesting files as objects doesn't
make that as straightforward as it could be).

--
Greg

From aquavitae69 at gmail.com  Tue Oct 4 08:58:31 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Tue, 4 Oct 2011 08:58:31 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

> def try_convert(target_type, obj, default, ignored=(TypeError,)):
>     try:
>         return target_type(obj)
>     except ignored:
>         return default

The problem with a general convert function is that to make it work,
you would need to account for several variations and the signature
gets rather clunky. Personally, I think that the try format:

    try:
        return float('some text')
    except ValueError:
        return 42

is more readable than

    try_convert('some text', float, 42, (ValueError,))

because it is clear what it does. The second form is shorter, but not
as descriptive. However,

    float('some text', default=42)

follows the existing syntax quite nicely, and is more readable than
either of the other options.

A generalised try_convert method would be useful, but I think I would
rather see a one-line version of the try statements, perhaps something
like this:

    x = try float('some text') else 42 if ValueError

From masklinn at masklinn.net  Tue Oct 4 09:37:29 2011
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 4 Oct 2011 09:37:29 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

On 2011-10-04, at 08:58 , David Townshend wrote:
> [...]
> A generalised try_convert method would be useful, but I think I would
> rather see a one-line version of the try statements, perhaps something
> like this:
>
>     x = try float('some text') else 42 if ValueError

That's basically what the function you've rejected does (you got the
argument order wrong):

    x = try_convert(float, 'some text', default=42, ignored=ValueError)

Just rename an argument or two and you have the exact same thing.
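For comparison, the behaviour David is proposing for float() can be
emulated today with a small wrapper; this is an illustrative sketch,
and float_or is an invented name, not an existing builtin.

    def float_or(text, default):
        """Return float(text), or default if the conversion fails."""
        try:
            return float(text)
        except (TypeError, ValueError):
            return default

    print(float_or('3.14', 0.0))            # 3.14
    print(float_or('cannot convert', 0.0))  # 0.0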
From aquavitae69 at gmail.com  Tue Oct 4 10:41:18 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Tue, 4 Oct 2011 10:41:18 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

On Tue, Oct 4, 2011 at 9:37 AM, Masklinn wrote:
> [...]
> That's basically what the function you've rejected does (you got the
> argument order wrong):
>
>     x = try_convert(float, 'some text', default=42, ignored=ValueError)
>
> Just rename an argument or two and you have the exact same thing.

Same functionality, but try_convert is a function with lots of
arguments whereas my alternative is an expression.  But to be honest,
I don't really like either.  In cases that require the level of
control that try_convert provides, the try statement is cleaner.  The
point I'm really trying to make is that my initial proposal was for a
specific but common use case (float and int), not a general-purpose
conversion tool.

From p.f.moore at gmail.com  Tue Oct 4 10:51:00 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 4 Oct 2011 09:51:00 +0100
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

On 4 October 2011 09:41, David Townshend wrote:
> Same functionality, but try_convert is a function with lots of arguments
> whereas my alternative is an expression.  But to be honest, I don't really
> like either.  In cases that require the level of control that try_convert
> provides, the try statement is cleaner.  The point I'm really trying to make
> is that my initial proposal was for a specific but common use case (float
> and int), not a general-purpose conversion tool.

I think the point you're missing is that most people here don't see
using a default in place of garbage input (as opposed to just for
empty input) as a "common" use case. Certainly not common enough to
warrant a language change rather than a private utility function...

Paul.
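Background for the next thread, as a minimal sketch of current CPython
semantics (added for illustration): closures created in a loop all see
the loop variable's final value, which is the behaviour the messages
below and the "default argument trick" they mention are about.

    funcs = [lambda: i for i in range(3)]
    print([f() for f in funcs])        # [2, 2, 2] -- all closures share one 'i'

    funcs = [lambda i=i: i for i in range(3)]
    print([f() for f in funcs])        # [0, 1, 2] -- defaults capture values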
From ronaldoussoren at mac.com  Tue Oct 4 14:22:45 2011
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 04 Oct 2011 14:22:45 +0200
Subject: [Python-ideas] Changing semantics of for-loop variable
In-Reply-To: 
References: <4E802202.4080009@canterbury.ac.nz> <4E8159AB.2030506@canterbury.ac.nz> <465D00FC-D144-4606-AF90-2F5014C4ED3A@gmail.com> <4E817549.7080107@canterbury.ac.nz> <4E84E3C7.2030102@canterbury.ac.nz>
Message-ID: 

On 30 Sep, 2011, at 16:17, Jim Jewett wrote:

> On Fri, Sep 30, 2011 at 1:06 AM, Terry Reedy wrote:
>> On 9/29/2011 5:31 PM, Greg Ewing wrote:
>
>>> If the loop variable is referenced from an inner scope,
>>> instead of replacing the contents of its cell, create
>>> a *new* cell on each iteration.
>
>> Since loop variables do not normally have cells, I really do not understand
>> from this what you are proposing. What I do understand is that you would
>> have the content of the body of a loop change the behavior of the loop.
>
> Not really. If the loop (or other) variable is not used in creating
> closures, then it doesn't really matter whether the variable is stored
> as a local or a cell -- it happens to be stored as a local for speed
> reasons.
>
> If a variable (including the loop variable) is used inside a closure,
> then it already creates a cell. This means that loop content already
> affects the way the loop is compiled, though again, it affects only
> efficiency, not semantics.
>
> The difference is that now the loop will create N separate cells
> instead of just one, so that each closure will see its own private
> variable, instead of all sharing the same one. That is a semantic
> difference, but if the closed-over variable is also a loop variable,
> it will normally be a bugfix.

What worries me with this proposal is that it only affects the loop
variable, and not other variables. This makes it easy to introduce
subtle bugs when you forget this. I often have code like this when
using closures in a loop:

    for val in sequence:
        other = lookup(val)
        result.append(lambda val=val, other=other: doit(val, other))

Greg's proposal means that 'val=val' would no longer be needed, but
you'd still need to use the default argument trick for other
variables. The difference between the behavior for the loop variable
and other variables is also relatively hard to explain.

Ronald

From jason.orendorff at gmail.com  Tue Oct 4 16:02:54 2011
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Tue, 4 Oct 2011 09:02:54 -0500
Subject: [Python-ideas] Changing semantics of for-loop variable
In-Reply-To: 
References: <4E802202.4080009@canterbury.ac.nz> <4E8159AB.2030506@canterbury.ac.nz> <465D00FC-D144-4606-AF90-2F5014C4ED3A@gmail.com> <4E817549.7080107@canterbury.ac.nz> <4E84E3C7.2030102@canterbury.ac.nz>
Message-ID: 

On Tue, Oct 4, 2011 at 7:22 AM, Ronald Oussoren wrote:
> What worries me with this proposal is that it only affects the loop
> variable, and not other variables.

Right. Another similar example:

    def f1(seq):
        for i in seq:
            yield lambda: i.name   # each lambda refers to a different i

    def f2(seq):
        for i in seq:
            x = i.name
            yield lambda: x        # each lambda refers to the same local x

Think of f2 as a refactoring of f1. I think the programmer has a right
to expect that to work, but it would change behavior in a really
surprising way.

Also: What if the loop variable is also used elsewhere in the function?
    def f(seq):
        i = None
        for i in seq:
            pass
        return i

Does the loop variable get a cell in this case? If so, I guess it must
be a different variable from the local i. So would f([1]) return None?
That would be a rather astonishing change in behavior! So perhaps the
loop variable cell would be kept in sync with the local variable
during the loop body? It all seems too weird.

-j

From ncoghlan at gmail.com  Tue Oct 4 16:35:02 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 4 Oct 2011 10:35:02 -0400
Subject: [Python-ideas] Changing semantics of for-loop variable
In-Reply-To: 
References: <4E802202.4080009@canterbury.ac.nz> <4E8159AB.2030506@canterbury.ac.nz> <465D00FC-D144-4606-AF90-2F5014C4ED3A@gmail.com> <4E817549.7080107@canterbury.ac.nz> <4E84E3C7.2030102@canterbury.ac.nz>
Message-ID: 

> What worries me with this proposal is that it only affects the loop
> variable, and not other variables. This makes it easy to introduce
> subtle bugs when you forget this. I often have code like this when
> using closures in a loop:
>
>     for val in sequence:
>         other = lookup(val)
>         result.append(lambda val=val, other=other: doit(val, other))
>
> Greg's proposal means that 'val=val' would no longer be needed, but
> you'd still need to use the default argument trick for other
> variables. The difference between the behavior for the loop variable
> and other variables is also relatively hard to explain.

The proposal actually wasn't clear whether it affected all name
bindings in loops or just the iteration variables in for loops. Either
way, the fact that unrolling the loops (even partially, to implement
loop-and-a-half without using break) fundamentally changed the name
binding semantics pretty much killed the idea from my point of view.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jxo6948 at rit.edu  Tue Oct 4 18:03:35 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Tue, 4 Oct 2011 12:03:35 -0400
Subject: [Python-ideas] Add peekline(), peeklines(n) and optional maxlines argument to readlines()
In-Reply-To: 
References: <20110930180019.466ebbdb@pitrou.net>
Message-ID: 

I thought of a somewhat elegant recipe for the readlines(maxlines) case:

f = open(...)itertools.islice(f, maxlines)

- John

From jxo6948 at rit.edu  Tue Oct 4 18:04:55 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Tue, 4 Oct 2011 12:04:55 -0400
Subject: [Python-ideas] Add peekline(), peeklines(n) and optional maxlines argument to readlines()
In-Reply-To: 
References: <20110930180019.466ebbdb@pitrou.net>
Message-ID: 

> f = open(...)itertools.islice(f, maxlines)

Formatting fail...

    f = open(...)
    itertools.islice(f, maxlines)

- John

From eliben at gmail.com  Tue Oct 4 19:59:18 2011
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 4 Oct 2011 19:59:18 +0200
Subject: [Python-ideas] Add peekline(), peeklines(n) and optional maxlines argument to readlines()
In-Reply-To: 
References: 
Message-ID: 

On Fri, Sep 30, 2011 at 13:28, Nick Coghlan wrote:
>
> On Fri, Sep 30, 2011 at 5:42 AM, Giampaolo Rodolà wrote:
> > ...or 2 in case you're not at the beginning of the file.
> > before = f.tell()
> > f.peeklines(10)
> > f.seek(before)
>
> A context manager to handle the tell()/seek() may be an interesting
> and more general purpose idea:
>
>     # In the io module
>     class _TellSeek:
>         def __init__(self, f):
>             self._f = f
>         def __enter__(self):
>             self._position = self._f.tell()
>         def __exit__(self, *args):
>             self._f.seek(self._position)
>
>     def restore_position(f):
>         return _TellSeek(f)
>
>     # Usage
>     with io.restore_position(f):
>         for i, line in enumerate(f, 1):
>             # Do stuff
>             if i == 10:
>                 break
>         else:
>             # Oops, didn't get as many lines as we wanted

This is useful, and made simpler by contextlib.contextmanager.
Actually I just posted this snippet on G+ a couple of weeks ago, since
I found it very useful for some stream-massaging code I was writing.
And yes, it only makes sense for seekable streams, of course.

So I'm -1 on the peeklines request, since it's easily implemented by
other means.

Eli

From ncoghlan at gmail.com  Tue Oct 4 20:06:26 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 4 Oct 2011 14:06:26 -0400
Subject: [Python-ideas] Add peekline(), peeklines(n) and optional maxlines argument to readlines()
In-Reply-To: 
References: 
Message-ID: 

On Tue, Oct 4, 2011 at 1:59 PM, Eli Bendersky wrote:
> This is useful, and made simpler by contextlib.contextmanager.

Yeah, the only reason I wrote it out by hand is that if it *did* go
into the io module, we wouldn't want to depend on contextlib for it.
However, as Antoine pointed out, it only works for seekable streams
and probably isn't general purpose enough to actually be included in
the io module.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Wed Oct 5 01:13:40 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 05 Oct 2011 10:13:40 +1100
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com>
References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com>
Message-ID: <4E8B9324.5040009@pearwood.info>

Carl Matthew Johnson wrote:

> This reminds me of the string.index vs. string.find discussion we had
> a while back. In basically any situation where an exception can be
> raised, it's sometimes nice to return a None-like value and sometimes
> nice to have an out-of-band exception. I have a certain amount of
> admiration for the pattern in Go of returning (value, error) from
> most functions that might have an error, but for Python as it is
> today, there's no One Obvious Way to Do It yet, and there's probably
> none forthcoming.

I beg to differ. Raising an exception *is* the One Obvious Way in
Python. But OOW does not mean "Only One Way", and the existence of
raise doesn't mean that there can't be a Second Not-So-Obvious Way,
such as returning a "not found" value.

However, returning None as re.match does is better than returning -1
as str.find does, as -1 can be mistaken for a valid result but None
can't be.

--
Steven

From greg.ewing at canterbury.ac.nz  Wed Oct 5 01:15:44 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Oct 2011 12:15:44 +1300
Subject: [Python-ideas] Changing semantics of for-loop variable: Alternative version
In-Reply-To: 
References: <4E802202.4080009@canterbury.ac.nz> <4E8159AB.2030506@canterbury.ac.nz> <465D00FC-D144-4606-AF90-2F5014C4ED3A@gmail.com> <4E817549.7080107@canterbury.ac.nz> <4E84E3C7.2030102@canterbury.ac.nz>
Message-ID: <4E8B93A0.5040302@canterbury.ac.nz>

Ronald Oussoren wrote:
> What worries me with this proposal is that it only affects
> the loop variable, and not other variables. This makes it easy
> to introduce subtle bugs when you forget this.
>
>     for val in sequence:
>         other = lookup(val)
>         result.append(lambda val=val, other=other: doit(val, other))

This is a good point, and I have an alternative version of the idea
that addresses it. Instead of the cell-replacement behaviour being
automatic, we provide a way to explicitly request it, such as

    for new i in stuff:
        ...

The advantage is that it can be applied to anything that binds a name,
e.g.

    for x in stuff:
        new i = 2 * x
        def f():
            print i
        store_away(f)

The disadvantage is that you need to be aware of it and remember to
use it when required. However that's no worse than the status quo, and
I think it would provide a nicer solution than the default argument
hack or any of its proposed variations.

--
Greg

From greg.ewing at canterbury.ac.nz  Wed Oct 5 01:20:35 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Oct 2011 12:20:35 +1300
Subject: [Python-ideas] Changing semantics of for-loop variable
In-Reply-To: 
References: <4E802202.4080009@canterbury.ac.nz> <4E8159AB.2030506@canterbury.ac.nz> <465D00FC-D144-4606-AF90-2F5014C4ED3A@gmail.com> <4E817549.7080107@canterbury.ac.nz> <4E84E3C7.2030102@canterbury.ac.nz>
Message-ID: <4E8B94C3.6000708@canterbury.ac.nz>

Jason Orendorff wrote:

> Also: What if the loop variable is also used elsewhere in the function?
>
>     def f(seq):
>         i = None
>         for i in seq:
>             pass
>         return i
>
> Does the loop variable get a cell in this case?

No, because i is not referenced from an inner function. There is no
change from the current semantics in this case.

If it *were* referenced from an inner function, then cell replacement
would occur, notwithstanding the fact that it's used elsewhere in the
function.

--
Greg

From steve at pearwood.info  Wed Oct 5 01:38:25 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 05 Oct 2011 10:38:25 +1100
Subject: [Python-ideas] __iter__ implies __contains__?
In-Reply-To: <4E8AA808.9090308@canterbury.ac.nz>
References: <20111001201320.567074bb@pitrou.net> <01DE1333-0FBD-472F-AEB3-5EF07C12A9EC@masklinn.net> <4E87BAD9.2070501@pearwood.info> <4D758E1A-9C84-4ABA-AEE7-6A9D7FA359B3@masklinn.net> <4E88DF0E.9080207@canterbury.ac.nz> <4E8962A6.5020708@canterbury.ac.nz> <4E8AA808.9090308@canterbury.ac.nz>
Message-ID: <4E8B98F1.2000906@pearwood.info>

Greg Ewing wrote:
> Raymond Hettinger wrote:
>
>> Really? Passing around iterators is a basic design pattern
>> http://en.wikipedia.org/wiki/Iterator_pattern for lots of languages.
>
> I'm not suggesting that we stop using iterators altogether,
> only that reiterables are often preferable when there's
> a choice. Passing a reiterable to a piece of third-party
> library code is safer and more future-proof than passing an
> iterator, because it makes fewer assumptions about what will
> be done to it.

I don't think it is up to the supplier of the data to try to guess
what the code will do. After all, there is no limit to what silly
things a called function *might* try. Why single out "iterate over an
iterator twice" for special consideration?

To put it another way, if a function is advertised as working on
iterables (either implicitly or explicitly), I would have no
compunction about passing a finite iterator and expecting it to work.
If it fails to work, the bug is in the function, not my code for using
an iterator.

(Infinite iterators are a special case... I wouldn't expect to be able
to use them in general, if for no other reason than most algorithms
expect to terminate.)
--
Steven

From greg.ewing at canterbury.ac.nz  Wed Oct 5 02:27:07 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Oct 2011 13:27:07 +1300
Subject: [Python-ideas] PEP 335: Another use case
Message-ID: <4E8BA45B.80906@canterbury.ac.nz>

I've received the following comments concerning another potential use
case for PEP 335.

-------- Original Message --------
Subject: PEP 335 and NumPy NA
Date: Tue, 4 Oct 2011 15:42:50 -0700
From: Mark Wiebe
To: Gregory Ewing

Hi Greg,

I took a glance at the python-ideas list, and saw that you're
proposing an update to PEP 335. I recently did a short internship at
Enthought where they asked me to look into the "missing data" problem
for NumPy, and I believe the work from that provides further
motivation for the PEP.

Basically, the approach to allow good handling of missing data is
emulating the R programming language by introducing a new value called
NA, similar to the built-in None. For an NA with a boolean type, you
get a three-valued logic, as described in
http://en.wikipedia.org/wiki/Three-valued_logic#Kleene_logic.

In the NA I added to NumPy, this truth table cannot be satisfied
because of the issue PEP 335 addresses:

    >>> import numpy as np
    >>> if True or np.NA: print "yes"
    ...
    yes
    >>> if np.NA or True: print "yes"
    ...
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: numpy.NA represents an unknown missing value, so its
    truth value cannot be determined

This can be worked around by using the bitwise operators, as mentioned
in the PEP for the array case:

    >>> if True | np.NA: print "yes"
    ...
    yes
    >>> if np.NA | True: print "yes"
    ...
    yes

Here are the documents with more information:

https://github.com/numpy/numpy/blob/master/doc/source/reference/arrays.maskna.rst
https://github.com/numpy/numpy/blob/master/doc/neps/missing-data.rst

Cheers,
Mark

From guido at python.org  Wed Oct 5 04:21:21 2011
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Oct 2011 19:21:21 -0700
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <4E8B9324.5040009@pearwood.info>
References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info>
Message-ID: 

On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote:
> Carl Matthew Johnson wrote:
>
>> This reminds me of the string.index vs. string.find discussion we had
>> a while back. [...]
>
> I beg to differ. Raising an exception *is* the One Obvious Way in
> Python. But OOW does not mean "Only One Way", and the existence of
> raise doesn't mean that there can't be a Second Not-So-Obvious Way,
> such as returning a "not found" value.
>
> However, returning None as re.match does is better than returning -1
> as str.find does, as -1 can be mistaken for a valid result but None
> can't be.

What works for re.match doesn't work for str.find. With re.match, the
result when cast to bool is true when there's a match and false when
there isn't. That's elegant.

But with str.find, 0 is a legitimate result, so if we were to return
None there'd be *two* outcomes mapping to false: no match, or a match
at the start of the string, which is no good.
Hence the -1: the intent was that people should write "if s.find(x)
>= 0" -- but clearly that didn't work out either, it's too easy to
forget the ">= 0" part. We also have str.index, which raises an
exception, but people dislike writing try/except blocks. We now have
"if x in s" for situations where you don't care where the match
occurred, but unfortunately if you need to check whether *and* where a
match occurred, your options are str.find (easy to forget the ">= 0"
part), str.index (cumbersome to write the try/except block), or "if x
in s: i = s.index(x); ..." which looks compact but does a redundant
second linear search. (It is also too attractive since it can be used
without introducing a variable.)

Other ideas: returning some more structured object than an integer
(like re.match does) feels like overkill, and returning an (index,
success) tuple is begging for lots of mysterious occurrences of [0]
or [1].

I'm out of ideas here. But of all these, str.find is probably still
the worst -- I've flagged bugs caused by it too many times to count.

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Wed Oct 5 04:48:03 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 4 Oct 2011 22:48:03 -0400
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info>
Message-ID: 

On Tue, Oct 4, 2011 at 10:21 PM, Guido van Rossum wrote:
> I'm out of ideas here. But of all these, str.find is probably still
> the worst -- I've flagged bugs caused by it too many times to count.

You're not the only one - there's a reason str.find/index discussions
always seem to devolve into attempts to find tolerable expression
syntaxes for converting a particular exception type into a default
value for the expression :P

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From bruce at leapyear.org  Wed Oct 5 04:55:00 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 4 Oct 2011 19:55:00 -0700
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: 
References: 
Message-ID: 

On Mon, Oct 3, 2011 at 11:58 PM, David Townshend wrote:
>
> A generalised try_convert method would be useful, but I think I would
> rather see a one-line version of the try statements, perhaps something like
> this:
>
>     x = try float('some text') else 42 if ValueError

For parallelism with the if/else operator I'd like

    float('some text') except ValueError then 42

which is equivalent to calling:

    def f():
        try:
            return float('some text')
        except ValueError:
            return 42

For example,

    float(foo) except ValueError then None if foo else 0

or equivalently:

    float(foo) if foo else 0 except ValueError then None

Of course this requires a new keyword so the chances of this being
added are slim.

--- Bruce
w00t! Gruyere security codelab graduated from Google Labs!
http://j.mp/googlelabs-gruyere
Not too late to give it a 5-star rating if you like it. :-)
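For illustration: str.partition, which Ron mentions below, already
gives a sentinel-free way to test whether *and* where a separator
occurs (a minimal sketch, not code from the thread):

    s = 'key=value'
    head, sep, tail = s.partition('=')
    if sep:                      # separator found
        print(head, tail)        # key value
    else:                        # not found: head == s, sep == tail == ''
        print('no separator')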
URL: From ron3200 at gmail.com Wed Oct 5 07:30:13 2011 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 05 Oct 2011 00:30:13 -0500 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: <1317792613.27651.114.camel@Gutsy> On Tue, 2011-10-04 at 19:21 -0700, Guido van Rossum wrote: > On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote: > > Carl Matthew Johnson wrote: > > > >> This reminds me of the string.index vs. string.find discussion we had > >> a while back. In basically any situation where an exception can be > >> raised, it's sometimes nice to return a None-like value and sometimes > >> nice to have an out-of-band exception. I have a certain amount of > >> admiration for the pattern in Go of returning (value, error) from > >> most functions that might have an error, but for Python as it is > >> today, there's no One Obvious Way to Do It yet, and there's probably > >> none forthcoming. > > > > I beg to differ. Raising an exception *is* the One Obvious Way in Python. > > But OOW does not mean "Only One Way", and the existence of raise doesn't > > mean that there can't be a Second Not-So-Obvious Way, such as returning a > > "not found" value. > > > > However, returning None as re.match does is better than returning -1 as > > str.find does, as -1 can be mistaken for a valid result but None can't be. > > What works for re.match doesn't work for str.find. With re.match, the > result when cast to bool is true when there's a match and false when > there isn't. That's elegant. > > But with str.find, 0 is a legitimate result, so if we were to return > None there'd be *two* outcomes mapping to false: no match, or a match > at the start of the string, which is no good. Hence the -1: the intent > was that people should write "if s.find(x) >= 0" -- but clearly that > didn't work out either, it's too easy to forget the ">= 0" part. We > also have str.index which raised an exception, but people dislike > writing try/except blocks. We now have "if x in s" for situations > where you don't care where the match occurred, but unfortunately if > you need to check whether *and* where a match occurred, your options > are str.find (easy to forget the ">= 0" part), str.index (cumbersome > to write the try/except block), or "if x in s: i = s.index(x); ..." > which looks compact but does a redundant second linear search. (It is > also too attractive since it can be used without introducing a > variable.) > > Other ideas: returning some more structured object than an integer > (like re.match does) feels like overkill, and returning an (index, > success) tuple is begging for lots of mysterious occurrences of [0] or > [1]. > > I'm out of ideas here. But of all these, str.find is probably still > the worst -- I've flagged bugs caused by it too many times to count. There is also the newer partition and rpartition methods, which I tend to forget about. I really don't like the '-1' for a not found case. They just get in the way. If len(s) was the not found case, you get a value that can be used in a slice without first checking the index, or catching an exception. >>> s[len(s):] '' Lets say we didn't have a split method and needed to write one. If s.find returned len(s) as the not found... def split(s, x): result = [] start = 0 while start < len(s): i = s.find(x, start) result.append(s[start:i]) # No check needed here. 
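        # (Why no check is needed, under the hypothetical semantics
        # above: if x is absent, find returns len(s), so this final
        # slice s[start:len(s)] is still valid, and start then moves
        # past len(s), ending the while loop cleanly.)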
start = i + len(x) return result Of course you could you this same pattern for other things. cheers, Ron From ethan at stoneleaf.us Wed Oct 5 07:38:59 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 04 Oct 2011 22:38:59 -0700 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: <4E8BED73.4030907@stoneleaf.us> Guido van Rossum wrote: > On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote: >> However, returning None as re.match does is better than returning -1 as >> str.find does, as -1 can be mistaken for a valid result but None can't be. > > What works for re.match doesn't work for str.find. With re.match, the > result when cast to bool is true when there's a match and false when > there isn't. That's elegant. [snip] > But with str.find, 0 is a legitimate result, so if we were to return > None there'd be *two* outcomes mapping to false: no match, or a match > at the start of the string, which is no good. [snip] > Other ideas: returning some more structured object than an integer > (like re.match does) feels like overkill [snip] > I'm out of ideas here. But of all these, str.find is probably still > the worst -- I've flagged bugs caused by it too many times to count. What's correct code worth? My contributions to other Open Source projects is so minor as to not register, but the first bug report/patch I ever submitted was a str.find issue. A structured object that behaved like an int /except/ for its boolean checks might do the trick here. Something like: class FindResult(int): def __bool__(self): return self != -1 Code that checks for -1 (like it should) will keep working, and code that doesn't will start working. ~Ethan~ From greg.ewing at canterbury.ac.nz Wed Oct 5 08:08:47 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Oct 2011 19:08:47 +1300 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: <4E8BF46F.2020404@canterbury.ac.nz> Guido van Rossum wrote: > I'm out of ideas here. But of all these, str.find is probably still > the worst -- I've flagged bugs caused by it too many times to count. Could a with-statement be used here somehow? with finding(x, s) as i: ... -- Greg From anacrolix at gmail.com Wed Oct 5 09:02:05 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Wed, 5 Oct 2011 18:02:05 +1100 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317792613.27651.114.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1317792613.27651.114.camel@Gutsy> Message-ID: -1 to this idea unless it gives significant performance boosts for one or more of the python implementations On Oct 5, 2011 4:30 PM, "Ron Adam" wrote: > On Tue, 2011-10-04 at 19:21 -0700, Guido van Rossum wrote: >> On Tue, Oct 4, 2011 at 4:13 PM, Steven D'Aprano wrote: >> > Carl Matthew Johnson wrote: >> > >> >> This reminds me of the string.index vs. string.find discussion we had >> >> a while back. In basically any situation where an exception can be >> >> raised, it's sometimes nice to return a None-like value and sometimes >> >> nice to have an out-of-band exception. 
I have a certain amount of >> >> admiration for the pattern in Go of returning (value, error) from >> >> most functions that might have an error, but for Python as it is >> >> today, there's no One Obvious Way to Do It yet, and there's probably >> >> none forthcoming. >> > >> > I beg to differ. Raising an exception *is* the One Obvious Way in Python. >> > But OOW does not mean "Only One Way", and the existence of raise doesn't >> > mean that there can't be a Second Not-So-Obvious Way, such as returning a >> > "not found" value. >> > >> > However, returning None as re.match does is better than returning -1 as >> > str.find does, as -1 can be mistaken for a valid result but None can't be. >> >> What works for re.match doesn't work for str.find. With re.match, the >> result when cast to bool is true when there's a match and false when >> there isn't. That's elegant. >> >> But with str.find, 0 is a legitimate result, so if we were to return >> None there'd be *two* outcomes mapping to false: no match, or a match >> at the start of the string, which is no good. Hence the -1: the intent >> was that people should write "if s.find(x) >= 0" -- but clearly that >> didn't work out either, it's too easy to forget the ">= 0" part. We >> also have str.index which raised an exception, but people dislike >> writing try/except blocks. We now have "if x in s" for situations >> where you don't care where the match occurred, but unfortunately if >> you need to check whether *and* where a match occurred, your options >> are str.find (easy to forget the ">= 0" part), str.index (cumbersome >> to write the try/except block), or "if x in s: i = s.index(x); ..." >> which looks compact but does a redundant second linear search. (It is >> also too attractive since it can be used without introducing a >> variable.) >> >> Other ideas: returning some more structured object than an integer >> (like re.match does) feels like overkill, and returning an (index, >> success) tuple is begging for lots of mysterious occurrences of [0] or >> [1]. >> >> I'm out of ideas here. But of all these, str.find is probably still >> the worst -- I've flagged bugs caused by it too many times to count. > > There is also the newer partition and rpartition methods, which I tend > to forget about. > > > I really don't like the '-1' for a not found case. They just get in the > way. > > > If len(s) was the not found case, you get a value that can be used in a > slice without first checking the index, or catching an exception. > >>>> s[len(s):] > '' > > > Lets say we didn't have a split method and needed to write one. > > If s.find returned len(s) as the not found... > > def split(s, x): > result = [] > start = 0 > while start < len(s): > i = s.find(x, start) > result.append(s[start:i]) # No check needed here. > start = i + len(x) > return result > > Of course you could you this same pattern for other things. > > cheers, > Ron > > > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ron3200 at gmail.com Wed Oct 5 14:42:51 2011 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 05 Oct 2011 07:42:51 -0500 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <4E8BF46F.2020404@canterbury.ac.nz> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> Message-ID: <1317818571.11419.1.camel@Gutsy> On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: > Guido van Rossum wrote: > > > I'm out of ideas here. But of all these, str.find is probably still > > the worst -- I've flagged bugs caused by it too many times to count. > > Could a with-statement be used here somehow? > > with finding(x, s) as i: > ... Or an iterator. for i in finding(x, s): ... From gerald.britton at gmail.com Wed Oct 5 15:13:20 2011 From: gerald.britton at gmail.com (Gerald Britton) Date: Wed, 5 Oct 2011 09:13:20 -0400 Subject: [Python-ideas] PEP 335: Another use case Message-ID: > I've received the following comments concerning another > potential use case for PEP 335. > > -------- Original Message -------- > Subject: PEP 335 and NumPy NA > Date: Tue, 4 Oct 2011 15:42:50 -0700 > From: Mark Wiebe > To: Gregory Ewing > > Hi Greg, > > I took a glance at the python-ideas list, and saw that you're proposing > an update to PEP 335. I recently did a short internship at Enthought > where they asked me to look into the "missing data" problem for NumPy, > and I believe the work from that provides further motivation for the > PEP. Basically, the approach to allow good handling of missing data is > emulating the R programming language by introducing a new value called > NA, similar to the built-in None. For an NA with a boolean type, you get > a three-valued logic, as described > in http://en.wikipedia.org/wiki/Three-valued_logic#Kleene_logic. In the > NA I added to NumPy, this truth table cannot be satisfied because of the > issue PEP 335 addresses: Interesting! Kinda like SQL, I think, which has True, False and Null (similar to NA in R). Weird to think SQL "solved" this problem 40 years ago! -- Gerald Britton From solipsis at pitrou.net Wed Oct 5 15:19:25 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Oct 2011 15:19:25 +0200 Subject: [Python-ideas] PEP 335: Another use case References: Message-ID: <20111005151925.3b934679@pitrou.net> On Wed, 5 Oct 2011 09:13:20 -0400 Gerald Britton wrote: > > I've received the following comments concerning another > > potential use case for PEP 335. > > > > -------- Original Message -------- > > Subject: PEP 335 and NumPy NA > > Date: Tue, 4 Oct 2011 15:42:50 -0700 > > From: Mark Wiebe > > To: Gregory Ewing > > > > Hi Greg, > > > > I took a glance at the python-ideas list, and saw that you're proposing > > an update to PEP 335. I recently did a short internship at Enthought > > where they asked me to look into the "missing data" problem for NumPy, > > and I believe the work from that provides further motivation for the > > PEP. Basically, the approach to allow good handling of missing data is > > emulating the R programming language by introducing a new value called > > NA, similar to the built-in None. For an NA with a boolean type, you get > > a three-valued logic, as described > > in http://en.wikipedia.org/wiki/Three-valued_logic#Kleene_logic. In the > > NA I added to NumPy, this truth table cannot be satisfied because of the > > issue PEP 335 addresses: > > Interesting! Kinda like SQL, I think, which has True, False and Null > (similar to NA in R). > Weird to think SQL "solved" this problem 40 years ago! I wouldn't agree SQL's NULL has solved anything. 
The rules around NULL calculations are a PITA to use (or work around) and they don't really make sense in any situation I've encountered. Python's None is much saner. I think that, if you want some alternative logic system, you generally use some rule engine of some sort. Mixing two different logics in the same programming language sounds like a recipe for confusion, because the semantics of control flow become context-dependant. Regards Antoine. From ethan at stoneleaf.us Wed Oct 5 16:31:36 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 05 Oct 2011 07:31:36 -0700 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317818571.11419.1.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> Message-ID: <4E8C6A48.9000507@stoneleaf.us> Ron Adam wrote: > On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: >> Guido van Rossum wrote: >> >>> I'm out of ideas here. But of all these, str.find is probably still >>> the worst -- I've flagged bugs caused by it too many times to count. >> Could a with-statement be used here somehow? >> >> with finding(x, s) as i: >> ... > > Or an iterator. > > for i in finding(x, s): > ... How would the case of not found be handled in either of these proposals? with finding(x, s) as i: ... if not i: # same problem as str.find, unless i is not a simple int for i in finding(x, s): ... else: # unless for loop has a break, this will happen... # not a problem until you want more than just the first # occurrence of s in x ~Ethan~ From ethan at stoneleaf.us Wed Oct 5 17:06:07 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 05 Oct 2011 08:06:07 -0700 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317792613.27651.114.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1317792613.27651.114.camel@Gutsy> Message-ID: <4E8C725F.10209@stoneleaf.us> Ron Adam wrote: > I really don't like the '-1' for a not found case. They just get in the > way. > > If len(s) was the not found case, you get a value that can be used in a > slice without first checking the index, or catching an exception. So every time we want to know if s.find() failed, we have to compare to len(s)? No thanks. ~Ethan~ From ncoghlan at gmail.com Wed Oct 5 19:25:52 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Oct 2011 13:25:52 -0400 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <4E8C6A48.9000507@stoneleaf.us> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> <4E8C6A48.9000507@stoneleaf.us> Message-ID: On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote: > > Ron Adam wrote: >> >> On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: >>> >>> Guido van Rossum wrote: >>> >>>> I'm out of ideas here. But of all these, str.find is probably still >>>> the worst -- I've flagged bugs caused by it too many times to count. >>> >>> Could a with-statement be used here somehow? >>> >>> with finding(x, s) as i: >>> ... >> >> >> Or an iterator. >> >> for i in finding(x, s): >> ... > > > How would the case of not found be handled in either of these proposals? By never executing the body of the loop. It's still a thoroughly unnatural API for the 0 or 1 case, though. 
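For reference, here is a rough sketch of what such a finding() generator could look like -- it is not in the stdlib, just the obvious way to get the semantics being discussed:

    def finding(x, s):
        # Yield each index at which x occurs in s; if there is no
        # match, nothing is yielded and the for body runs zero times.
        i = s.find(x)
        while i != -1:
            yield i
            i = s.find(x, i + 1)

    for i in finding('n', "I don't want to get into the cart!"):
        print(i)   # skipped entirely when 'n' never occurs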
-- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Oct 5 19:56:58 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 05 Oct 2011 10:56:58 -0700 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> <4E8C6A48.9000507@stoneleaf.us> Message-ID: <4E8C9A6A.6080203@stoneleaf.us> Nick Coghlan wrote: > On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote: >> Ron Adam wrote: >>> On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: >>>> Guido van Rossum wrote: >>>> >>>>> I'm out of ideas here. But of all these, str.find is probably still >>>>> the worst -- I've flagged bugs caused by it too many times to count. >>>> >>>> Could a with-statement be used here somehow? >>>> >>>> with finding(x, s) as i: >>>> ... >>> >>> >>> Or an iterator. >>> >>> for i in finding(x, s): >>> ... >> >> >> How would the case of not found be handled in either of these proposals? > > By never executing the body of the loop. It's still a thoroughly > unnatural API for the 0 or 1 case, though. Let me rephrase: found = "I don't want to get into the cart!".find('z') if found >= 0: # do stuff if found else: # do stuff if not found or found = "I don't want to get into the cart!".find('n') while found >= 0: # do stuff if found found = "I don't want to get into the cart!".find('n', found+1) if found == -1: break else: print('false branch') # do stuff if not found How would we reliably get the false branch with the above proposals? ~Ethan~ From ron3200 at gmail.com Wed Oct 5 19:58:41 2011 From: ron3200 at gmail.com (Ron Adam) Date: Wed, 05 Oct 2011 12:58:41 -0500 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <4E8C6A48.9000507@stoneleaf.us> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> <4E8C6A48.9000507@stoneleaf.us> Message-ID: <1317837521.19908.44.camel@Gutsy> On Wed, 2011-10-05 at 07:31 -0700, Ethan Furman wrote: > Ron Adam wrote: > > On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: > >> Guido van Rossum wrote: > >> > >>> I'm out of ideas here. But of all these, str.find is probably still > >>> the worst -- I've flagged bugs caused by it too many times to count. > >> Could a with-statement be used here somehow? > >> > >> with finding(x, s) as i: > >> ... > > > > Or an iterator. > > > > for i in finding(x, s): > > ... > > How would the case of not found be handled in either of these proposals? > > with finding(x, s) as i: > ... > if not i: # same problem as str.find, unless i is not a simple int I'll let Nick answer this one because I'm not sure about it. > for i in finding(x, s): > ... > else: > # unless for loop has a break, this will happen... 
> # not a problem until you want more than just the first > # occurrence of s in x for i in finding(x, s): if i > 25: break else: return result raise ValueError("string 's' had an 'x' after position 25") Cheers, Ron From ethan at stoneleaf.us Wed Oct 5 20:07:40 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 05 Oct 2011 11:07:40 -0700 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317837521.19908.44.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> <4E8C6A48.9000507@stoneleaf.us> <1317837521.19908.44.camel@Gutsy> Message-ID: <4E8C9CEC.7040002@stoneleaf.us> Ron Adam wrote: > On Wed, 2011-10-05 at 07:31 -0700, Ethan Furman wrote: >> for i in finding(x, s): >> ... >> else: >> # unless for loop has a break, this will happen... 
> >> # not a problem until you want more than just the first > >> # occurrence of s in x > > > > > > for i in finding(x, s): > > if i > 25: > > break > > > > else: > > return result > > raise ValueError("string 's' had an 'x' after position 25") > > And how did you decide on the magical number 25? I knew I should have used 42. ;-) Cheers, Ron From python at mrabarnett.plus.com Wed Oct 5 20:42:33 2011 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 05 Oct 2011 19:42:33 +0100 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <4E8C9A6A.6080203@stoneleaf.us> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <4E8BF46F.2020404@canterbury.ac.nz> <1317818571.11419.1.camel@Gutsy> <4E8C6A48.9000507@stoneleaf.us> <4E8C9A6A.6080203@stoneleaf.us> Message-ID: <4E8CA518.4030101@mrabarnett.plus.com> On 05/10/2011 18:56, Ethan Furman wrote: > Nick Coghlan wrote: >> On Oct 5, 2011 10:32 AM, "Ethan Furman" wrote: >>> Ron Adam wrote: >>>> On Wed, 2011-10-05 at 19:08 +1300, Greg Ewing wrote: >>>>> Guido van Rossum wrote: >>>>> >>>>>> I'm out of ideas here. But of all these, str.find is probably still >>>>>> the worst -- I've flagged bugs caused by it too many times to count. >>>>> >>>>> Could a with-statement be used here somehow? >>>>> >>>>> with finding(x, s) as i: >>>>> ... >>>> >>>> >>>> Or an iterator. >>>> >>>> for i in finding(x, s): >>>> ... >>> >>> >>> How would the case of not found be handled in either of these >>> proposals? >> >> By never executing the body of the loop. It's still a thoroughly >> unnatural API for the 0 or 1 case, though. > > > Let me rephrase: > > found = "I don't want to get into the cart!".find('z') > if found >= 0: > # do stuff if found > else: > # do stuff if not found > > or > > found = "I don't want to get into the cart!".find('n') > while found >= 0: > # do stuff if found > found = "I don't want to get into the cart!".find('n', found+1) > if found == -1: > break > else: > print('false branch') > # do stuff if not found > > How would we reliably get the false branch with the above proposals? > We've had the discussion before about how to handle the case when the body of the loop isn't executed at all. I had the thought that a possible syntax could be: found = "I don't want to get into the cart!".find('n') while found >= 0: # do stuff if found found = "I don't want to get into the cart!".find('n', found+1) or: print('false branch') # do stuff if not found but I think I'll leave it there. From python at mrabarnett.plus.com Wed Oct 5 20:48:05 2011 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 05 Oct 2011 19:48:05 +0100 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317839627.19908.76.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1317792613.27651.114.camel@Gutsy> <4E8C725F.10209@stoneleaf.us> <1317839627.19908.76.camel@Gutsy> Message-ID: <4E8CA665.6050604@mrabarnett.plus.com> On 05/10/2011 19:33, Ron Adam wrote: > On Wed, 2011-10-05 at 08:06 -0700, Ethan Furman wrote: >> Ron Adam wrote: >>> I really don't like the '-1' for a not found case. They just get in the >>> way. >>> >>> If len(s) was the not found case, you get a value that can be used in a >>> slice without first checking the index, or catching an exception. >> >> So every time we want to know if s.find() failed, we have to compare to >> len(s)? > > I think probably None would have been better than -1. 
At least then you > will get an error if you try to use it as an index. > None will be rejected as an index, but not as part of a slice: >>> s = "abcdef" >>> s[None : 4] 'abcd' >>> s[4 : None] 'ef' > The problem with len(s) as a failure, is when you consider rfind(). It > should return 1 before the beginning, which coincidentally it does, but > you still can't use it as a slice index because it will give you 1 from > the end instead. > > So until we find another way to do negative sequence indexing. It > won't quite work as nice as it should. > From tjreedy at udel.edu Wed Oct 5 23:17:45 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Oct 2011 17:17:45 -0400 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: On 10/4/2011 10:21 PM, Guido van Rossum wrote: > But with str.find, 0 is a legitimate result, so if we were to return > None there'd be *two* outcomes mapping to false: no match, or a match > at the start of the string, which is no good. People would have to test that the result 'is None' or 'is not None'. That is no worse than testing '== -1' or '>= 0'. I claim it is better because continuing to compute with None as if it were a number is more likely to raise an error quickly, whereas doing so with a '-1' that looks like a legitimate string position (the last), but does not really mean that, might never raise an error but lead to erroneous output. (I said 'more likely' because None is valid in slicings, same as -1.) Example: define char_before(s,c) as returning the character before the first occurrence of c in s. Ignoring the s.startswith(c) case: >>> s='abcde' >>> s[s.find('e')-1] 'd' # Great, it works >>> s[s.find('f')-1] 'd' # Whoops, not so great. s[None] fails, as it should. You usually try to avoid such easy bug bait. I cannot think of any other built-in function that returns such a valid but invalid result. > Hence the -1: the intent > was that people should write "if s.find(x)>= 0" -- but clearly that > didn't work out either, it's too easy to forget the ">= 0" part. As easy or easier than forgetting '== None' > We also have str.index which raised an exception, but people dislike > writing try/except blocks. Given that try/except blocks are routinely used for flow control in Python, and that some experts even advocate using them over if/else (leap first), I am tempted to ask why such people are using Python. I am curious, though, why this exception is more objectionable than all the others -- and why you apparently give such objections for this function more weight than for others. One could justify out-of-range IndexError on the basis that an in-range indexing could return any object, including None, so that the signal *must* not be a normal return (even of an exception object). However, Python comes with numerous, probably 100s of functions with restricted output ranges that raise exceptions (TypeError, ValueError, AttributeError, etc) instead of returning, for instance, None. For example, consider int('a'): why not None instead of ValueError? One reason is that s[:int('a')] would then return s instead of raising an error. I strongly suspect that if we did not have str.find now, we would not add it, and certainly not in its current form. > I'm out of ideas here. But of all these, str.find is probably still > the worst -- I've flagged bugs caused by it too many times to count. So let's deprecate it for eventual removal, maybe in Py4. 
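As an aside, str.partition (mentioned earlier in the thread) already allows a char_before without the -1 trap; a sketch, still ignoring the s.startswith(c) case:

    def char_before(s, c):
        # partition returns (head, sep, tail); sep is '' when c is
        # absent, so a failure can never masquerade as position -1.
        head, sep, _tail = s.partition(c)
        if not sep:
            raise ValueError('substring not found')
        return head[-1]

    # char_before('abcde', 'e') -> 'd'
    # char_before('abcde', 'f') -> ValueError, not a bogus 'd'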
-- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Thu Oct 6 01:05:33 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 06 Oct 2011 12:05:33 +1300 Subject: [Python-ideas] PEP 335: Another use case In-Reply-To: <20111005151925.3b934679@pitrou.net> References: <20111005151925.3b934679@pitrou.net> Message-ID: <4E8CE2BD.6070308@canterbury.ac.nz> Antoine Pitrou wrote: > Mixing two different logics in the > same programming language sounds like a recipe for confusion, because > the semantics of control flow become context-dependant. At least you would (or should) get an exception if you try to branch on an indeterminate truth value. That's better than choosing some arbitrary interpretation, which is what happens at the moment when comparing NaNs. -- Greg From zuo at chopin.edu.pl Thu Oct 6 17:47:22 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Thu, 6 Oct 2011 17:47:22 +0200 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: <20111006154722.GA3983@chopin.edu.pl> Guido van Rossum dixit (2011-10-04, 19:21): > Other ideas: returning some more structured object than an integer > (like re.match does) feels like overkill, and returning an (index, > success) tuple is begging for lots of mysterious occurrences of [0] or > [1]. A lightweight builtin type whose instances would have `index` attribute might do the job well (together with None as not-found). A naive pure-Python implementation: class Found(object): __slots__ = 'index', def __init__(self, index): self.index = index Example usage: found = s.find('foo') if found: # or more explicit: `if found is not None:` print('foo found at %d' % found.index) else: # found is None print('foo not found') Of course that would be probably a new method, not str.find(), say: str.search(). Then it could be possible to make it a bit more universal, accepting substring tuples (as startswith/endswith already do): Example usage: one_of = 'foo', 'bar', 'baz' found = s.search(one_of) if found: print('%s found at %d' % (found.substring, found.index)) else: print('None of %s found' % one_of) The 4th line could be respelled as: index, substring = found print('%s found at %d' % (substring, index)) A naive implementation of s.search() result type: class Found(object): __slots__ = 'index', 'substring' def __init__(self, index, substring): self.index = index self.substring = substring def __iter__(self): yield self.index yield self.substring Cheers, *j From ron3200 at gmail.com Thu Oct 6 19:42:34 2011 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 06 Oct 2011 12:42:34 -0500 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <20111006154722.GA3983@chopin.edu.pl> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <20111006154722.GA3983@chopin.edu.pl> Message-ID: <1317922955.25175.24.camel@Gutsy> On Thu, 2011-10-06 at 17:47 +0200, Jan Kaliszewski wrote: > A lightweight builtin type whose instances would have `index` > attribute might do the job well (together with None as not-found). > > A naive pure-Python implementation: > > class Found(object): > __slots__ = 'index', > def __init__(self, index): > self.index = index > ... > found = s.search(one_of) > ... It seems to me, the methods on the string object should be the lower level fast C methods that allow for efficient higher level functions to be built. A search class may be a good addition to string.py. 
It already has higher order stuff for composing strings, format and template, but nothing for decomposing strings. Cheers, Ron From jimjjewett at gmail.com Thu Oct 6 22:32:27 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 6 Oct 2011 16:32:27 -0400 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote: > On 10/4/2011 10:21 PM, Guido van Rossum wrote: >> We also have str.index which raised an exception, but people dislike >> writing try/except blocks. > Given that try/except blocks are routinely used for flow control in Python, > and that some experts even advocate using them over if/else (leap first), I > am tempted to ask why such people are using Python. I am curious, > though, why this exception is more objectionable than all the others str.index is a "little" method that it is tempting to use inline, even as part of a comprehension. There isn't a good way to handle exceptions without a full statement. -jJ From zuo at chopin.edu.pl Thu Oct 6 23:41:53 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Thu, 6 Oct 2011 23:41:53 +0200 Subject: [Python-ideas] Default return values to int and float In-Reply-To: <1317922955.25175.24.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <20111006154722.GA3983@chopin.edu.pl> <1317922955.25175.24.camel@Gutsy> Message-ID: <20111006214153.GB2291@chopin.edu.pl> Ron Adam dixit (2011-10-06, 12:42): > It seems to me, the methods on the string object should be the lower > level fast C methods that allow for efficient higher level functions to > be built. I suggest to make it as a C method and type. The Python implementation I gave is an illustration only ("naive implementation"). Cheers. *j From zuo at chopin.edu.pl Thu Oct 6 23:54:51 2011 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Thu, 6 Oct 2011 23:54:51 +0200 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: Message-ID: <20111006215450.GC2291@chopin.edu.pl> Michael Foord dixit (2011-10-03, 14:55): > http://msdn.microsoft.com/en-us/library/994c0zb1.aspx > > The pattern there is equivalent to returning an extra result as well as the > converted value - a boolean indicating whether or not the conversion > succeeded (with the "converted value" being 0.0 where conversion fails). A > Python version might look like: > > success, value = float.parse('thing') > if success: > ... Nice. +1 from me. *j From tjreedy at udel.edu Fri Oct 7 00:04:22 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 06 Oct 2011 18:04:22 -0400 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: On 10/6/2011 4:32 PM, Jim Jewett wrote: > On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote: >> On 10/4/2011 10:21 PM, Guido van Rossum wrote: >>> We also have str.index which raised an exception, but people dislike >>> writing try/except blocks. > >> Given that try/except blocks are routinely used for flow control in Python, >> and that some experts even advocate using them over if/else (leap first), I >> am tempted to ask why such people are using Python. 
I am curious, >> though, why this exception is more objectionable than all the others > > str.index is a "little" method that it is tempting to use inline, even > as part of a comprehension. That is an argument *for* raising an exception on error. If one uses .find or .index in a situation where success is certain, then it does not matter what would happen on failure. If failure is possible, and users are tempted to skip checking, then the interpreter should raise a fuss (exception), as it does for other 'little' methods like arithmetic and subscript operations. a.find(b) can raise an AttributeError or TypeError, so returning -1 instead of raising ValueError only partly avoids possible exceptions. > There isn't a good way to handle exceptions without a full statement. Neither is there a good way to handle error return values without a full statement. The try/except form may require fewer lines than the if/else form. -- Terry Jan Reedy From ericsnowcurrently at gmail.com Fri Oct 7 05:28:36 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 6 Oct 2011 21:28:36 -0600 Subject: [Python-ideas] A Protocol for Making an Object Immutable Message-ID: As I was working on some code today, I found myself in a situation where I wanted to "freeze" an object. This topic has come up a few times since I've been following these lists, so I figured it wasn't too crazy an idea. Really quickly I hacked out a simple bit of code that I am using for what I needed, and later I posted it as a recipe on the ActiveState cookbook[1]. The gist of it is a __freeze__/__unfreeze__ protocol, supported respectively by two ABCs and two functions (freeze and unfreeze). I've come to learn that it's a good idea to do some digging in the list archives before posting an idea, and I'm glad I did. Apparently there is a nearly identical PEP (351) to what I was thinking of [2]. And just my luck, it was shot down pretty hard. So I have a couple questions. First, from what I've read so far, the idea of frozen/immutable seems to be tied pretty closely to being hashable. However, this wasn't applicable to the use case that I had. Instead, I just wanted to make sure certain attributes of a class of mine were effectively read-only. So, is being hashable necessarily tied to being immutable? Second, I realize that PEP 351 failed to pass muster, so I'm not holding my breath here. Shoot, I'm not even really proposing we add this protocol. Mostly, I found a simple model that worked for me and seemed more broadly applicable; and I wanted to see if it would be suitable for further research. Now, in light of PEP 351, I'm less convinced that it's so simple. Still, I've seen some ideas come back from the proverbial grave and wanted to see if this one is not quite dead, sir (or if I should avenge it ). -eric [1] http://code.activestate.com/recipes/577895/ [2] http://www.python.org/dev/peps/pep-0351/ From ericsnowcurrently at gmail.com Fri Oct 7 05:48:52 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 6 Oct 2011 21:48:52 -0600 Subject: [Python-ideas] A Protocol for Making an Object Immutable In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 9:28 PM, Eric Snow wrote: > As I was working on some code today, I found myself in a situation > where I wanted to "freeze" an object. ?This topic has come up a few > times since I've been following these lists, so I figured it wasn't > too crazy an idea. 
> > Really quickly I hacked out a simple bit of code that I am using for > what I needed, and later I posted it as a recipe on the ActiveState > cookbook[1]. ?The gist of it is a __freeze__/__unfreeze__ protocol, > supported respectively by two ABCs and two functions (freeze and > unfreeze). > > I've come to learn that it's a good idea to do some digging in the > list archives before posting an idea, and I'm glad I did. ?Apparently > there is a nearly identical PEP (351) to what I was thinking of [2]. > And just my luck, it was shot down pretty hard. > > So I have a couple questions. ?First, from what I've read so far, the > idea of frozen/immutable seems to be tied pretty closely to being > hashable. ?However, this wasn't applicable to the use case that I had. > ?Instead, I just wanted to make sure certain attributes of a class of > mine were effectively read-only. ?So, is being hashable necessarily > tied to being immutable? > > Second, I realize that PEP 351 failed to pass muster, so I'm not > holding my breath here. ?Shoot, I'm not even really proposing we add > this protocol. ?Mostly, I found a simple model that worked for me and > seemed more broadly applicable; and I wanted to see if it would be > suitable for further research. ?Now, in light of PEP 351, I'm less > convinced that it's so simple. ?Still, I've seen some ideas come back > from the proverbial grave and wanted to see if this one is not quite > dead, sir (or if I should avenge it ). > > -eric > > > [1] http://code.activestate.com/recipes/577895/ > [2] http://www.python.org/dev/peps/pep-0351/ > Okay, it's like Raymond was reading my mind in some of what he says here: http://mail.python.org/pipermail/python-dev/2006-February/060802.html Now I'm gonna have to go back consider if there is a better way to do the read-only bit I mentioned earlier (which motivated this post). My only hope is that he is talking about something else. :) Regardless, I suppose the recipe will stand or fall on its own anyway. -eric p.s. Greg Ewing also makes a pretty good point: http://mail.python.org/pipermail/python-dev/2006-February/060822.html From ericsnowcurrently at gmail.com Fri Oct 7 05:51:53 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 6 Oct 2011 21:51:53 -0600 Subject: [Python-ideas] A Protocol for Making an Object Immutable In-Reply-To: References: Message-ID: On Thu, Oct 6, 2011 at 9:48 PM, Eric Snow wrote: > > Okay, it's like Raymond was reading my mind in some of what he says here: > > ?http://mail.python.org/pipermail/python-dev/2006-February/060802.html > > Now I'm gonna have to go back consider if there is a better way to do > the read-only bit I mentioned earlier (which motivated this post). ?My > only hope is that he is talking about something else. ?:) ?Regardless, > I suppose the recipe will stand or fall on its own anyway. > > -eric > > p.s. Greg Ewing also makes a pretty good point: > http://mail.python.org/pipermail/python-dev/2006-February/060822.html > And another one from Raymond: http://mail.python.org/pipermail/python-dev/2005-October/057586.html I'll stop now :) From greg.ewing at canterbury.ac.nz Fri Oct 7 07:06:25 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 07 Oct 2011 18:06:25 +1300 Subject: [Python-ideas] A Protocol for Making an Object Immutable In-Reply-To: References: Message-ID: <4E8E88D1.4010301@canterbury.ac.nz> Eric Snow wrote: > So, is being hashable necessarily > tied to being immutable? It's tied to immutability of those aspects involved in equality comparison. 
If a type is to behave predictably when used as a dict key, then two instances that compare equal must always compare equal, and the converse. Attributes that don't affect equality comparison are free to change, however. -- Greg From jkbbwr at gmail.com Fri Oct 7 12:24:05 2011 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 7 Oct 2011 11:24:05 +0100 Subject: [Python-ideas] Support multiplication for sets Message-ID: Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation. This would make set usage more intuitive for example; (assuming python3) a = set(["amy", "martin"]) b = set(["smith", "jones", "john"]) c = a * b print(c) set([('john', 'jones'), ('john', 'martin'), ('jones', 'john'), ('martin', 'amy'), ....]) This could be really easily achieved by giving a __mul__ method for sets. Currently trying to multiply sets gives a TypeError. Anyone got any views on this? Or am I barking up the wrong tree and saying something stupid. -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Fri Oct 7 12:35:54 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 7 Oct 2011 11:35:54 +0100 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: On 7 October 2011 11:24, Jakob Bowyer wrote: > Looking at this from a Math background, it seems that it would be nice for > the set type to support multiplication. This would allow for the > multiplication of sets to produce?Cartesian?products giving every single > permutation. This would make set usage more?intuitive?for example; > (assuming python3) > itertools.product does what you want already. >>> a = set((1,2,3)) >>> b = set((4,5,6)) >>> set(itertools.product(a,b)) {(2, 6), (1, 4), (1, 5), (1, 6), (3, 6), (2, 5), (3, 4), (2, 4), (3, 5)} I don't think an operator form is sufficiently valuable when the functionality is available and clear enough already. Paul. From jkbbwr at gmail.com Fri Oct 7 12:37:46 2011 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 7 Oct 2011 11:37:46 +0100 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: There is that but from a math point of view the syntax a * b does make sence. Its slightly clearer and makes more sense to people from outside of a programming background. On Fri, Oct 7, 2011 at 11:35 AM, Paul Moore wrote: > On 7 October 2011 11:24, Jakob Bowyer wrote: > > Looking at this from a Math background, it seems that it would be nice > for > > the set type to support multiplication. This would allow for the > > multiplication of sets to produce Cartesian products giving every single > > permutation. This would make set usage more intuitive for example; > > (assuming python3) > > > > itertools.product does what you want already. > > >>> a = set((1,2,3)) > >>> b = set((4,5,6)) > >>> set(itertools.product(a,b)) > {(2, 6), (1, 4), (1, 5), (1, 6), (3, 6), (2, 5), (3, 4), (2, 4), (3, 5)} > > I don't think an operator form is sufficiently valuable when the > functionality is available and clear enough already. > > Paul. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dirkjan at ochtman.nl Fri Oct 7 12:43:21 2011 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Fri, 7 Oct 2011 12:43:21 +0200 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 12:35, Paul Moore wrote: > I don't think an operator form is sufficiently valuable when the > functionality is available and clear enough already. I'm not sure; if set multiplication is highly unambiguous (i.e. the Cartesian product is the only logical outcome, and there is not some other common multiplication-like operation on sets), than it seems to me that supporting the multiplication operator for the Cartesian product of sets would be sensible. Cheers, Dirkjan From jkbbwr at gmail.com Fri Oct 7 12:46:34 2011 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 7 Oct 2011 11:46:34 +0100 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: As far as I know and from asking my lecturer, multiplication only produces Cartesian products. On Fri, Oct 7, 2011 at 11:43 AM, Dirkjan Ochtman wrote: > On Fri, Oct 7, 2011 at 12:35, Paul Moore wrote: > > I don't think an operator form is sufficiently valuable when the > > functionality is available and clear enough already. > > I'm not sure; if set multiplication is highly unambiguous (i.e. the > Cartesian product is the only logical outcome, and there is not some > other common multiplication-like operation on sets), than it seems to > me that supporting the multiplication operator for the Cartesian > product of sets would be sensible. > > Cheers, > > Dirkjan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Oct 7 13:07:48 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 7 Oct 2011 13:07:48 +0200 Subject: [Python-ideas] Support multiplication for sets References: Message-ID: <20111007130748.058fe66d@pitrou.net> On Fri, 7 Oct 2011 11:46:34 +0100 Jakob Bowyer wrote: > As far as I know and from asking my lecturer, multiplication only > produces Cartesian products. Given that multiplying a list or tuple repeats the sequence, there may be a certain amount of confusion. Also, I don't think itertools.product is common enough to warrant an operator. There's a very readable alternative: >>> a = {"amy", "martin"} >>> b = {"smith", "jones", "john"} >>> {(u, v) for u in a for v in b} {('amy', 'john'), ('amy', 'jones'), ('martin', 'jones'), ('martin', 'smith'), ('martin', 'john'), ('amy', 'smith')} Or, in the case where you only want to iterate, two nested loops will suffice and avoid building the container. Regards Antoine. From p.f.moore at gmail.com Fri Oct 7 13:20:57 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 7 Oct 2011 12:20:57 +0100 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: On 7 October 2011 11:37, Jakob Bowyer wrote: > There is that but from a math point of view the syntax a * b does make > sence. > Its slightly clearer and makes more?sense?to people from outside of a > programming background. I'm not sure I'd agree, even though I come from a maths background. Explicit is better than implicit and all that... Even if it is slightly clearer to some people, I bet there are others (not from a mathematical background) who would be confused by it. And in that case, itertools.product is easier to google for than "*"...) And that's ignoring the cost of implementing, testing, documenting the change. 
Actually, just to give a flavour of the sorts of design decisions that would need to be considered, consider this: >>> a = set((1,2)) >>> b = set((3,4)) >>> c = set((5,6)) >>> from itertools import product >>> def times(s1,s2): ... return set(product(s1,s2)) ... >>> times(a,times(b,c)) {(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))} >>> times(times(a,b),c) {((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)} >>> So your multiplication isn't commutative (the types of the elements in the 2 expressions above are different). That doesn't seem intuitive - so maybe a*b*c should be a set of 3-tuples. But how would that work? The problem very quickly becomes a lot larger than you first assume. Operator overloading is used much more sparingly in Python than in, say, C++. It's as much a language style issue as anything else. Sorry, but I still don't see enough benefit to justify this. Paul. From jkbbwr at gmail.com Fri Oct 7 13:41:52 2011 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 7 Oct 2011 12:41:52 +0100 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: Considering any multiplication action on a set is illegal. I don't think it will confuse anyone who doesn't know what a set is mathematically. On Fri, Oct 7, 2011 at 12:20 PM, Paul Moore wrote: > On 7 October 2011 11:37, Jakob Bowyer wrote: > > There is that but from a math point of view the syntax a * b does make > > sence. > > Its slightly clearer and makes more sense to people from outside of a > > programming background. > > I'm not sure I'd agree, even though I come from a maths background. > Explicit is better than implicit and all that... > > Even if it is slightly clearer to some people, I bet there are others > (not from a mathematical background) who would be confused by it. And > in that case, itertools.product is easier to google for than "*"...) > And that's ignoring the cost of implementing, testing, documenting the > change. > > Actually, just to give a flavour of the sorts of design decisions that > would need to be considered, consider this: > > >>> a = set((1,2)) > >>> b = set((3,4)) > >>> c = set((5,6)) > >>> from itertools import product > >>> def times(s1,s2): > ... return set(product(s1,s2)) > ... > >>> times(a,times(b,c)) > {(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, > (3, 6)), (2, (4, 6)), (1, (3, 5))} > >>> times(times(a,b),c) > {((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, > 3), 5), ((2, 4), 5), ((1, 3), 5)} > >>> > > So your multiplication isn't commutative (the types of the elements in > the 2 expressions above are different). That doesn't seem intuitive - > so maybe a*b*c should be a set of 3-tuples. But how would that work? > The problem very quickly becomes a lot larger than you first assume. > > Operator overloading is used much more sparingly in Python than in, > say, C++. It's as much a language style issue as anything else. > > Sorry, but I still don't see enough benefit to justify this. > > Paul. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri Oct 7 17:28:58 2011 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 7 Oct 2011 11:28:58 -0400 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: I don't think having itertools.product is a good reason for not overloading the operator. 
The same argument could be said against having listA + listB or listA * 10. After all, those can all be done with list comprehensions and itertools aswell. itertools.product(a, b) or a list comprehension work fine for 2 sets, but if you try doing that for any significant number (which i presume the OP is), maybe 5 set operations in one expression, it quickly becomes completely unreadable: itertools.product(itertools.product(seta, setb.union(setc), setd.difference(sete)) vs seta * (setb | setc) * (setd & sete) We already have operators overloaded for union |, intersect &, difference -, symmetric difference ^. Having an operator for product would fit in perfectly if it could be done properly; ensuring that it is commutative looks non-trivial. On Fri, Oct 7, 2011 at 7:41 AM, Jakob Bowyer wrote: > Considering any multiplication action on a set is illegal. I don't think it > will confuse anyone who doesn't know what a set is?mathematically. > On Fri, Oct 7, 2011 at 12:20 PM, Paul Moore wrote: >> >> On 7 October 2011 11:37, Jakob Bowyer wrote: >> > There is that but from a math point of view the syntax a * b does make >> > sence. >> > Its slightly clearer and makes more?sense?to people from outside of a >> > programming background. >> >> I'm not sure I'd agree, even though I come from a maths background. >> Explicit is better than implicit and all that... >> >> Even if it is slightly clearer to some people, I bet there are others >> (not from a mathematical background) who would be confused by it. And >> in that case, itertools.product is easier to google for than "*"...) >> And that's ignoring the cost of implementing, testing, documenting the >> change. >> >> Actually, just to give a flavour of the sorts of design decisions that >> would need to be considered, consider this: >> >> >>> a = set((1,2)) >> >>> b = set((3,4)) >> >>> c = set((5,6)) >> >>> from itertools import product >> >>> def times(s1,s2): >> ... ? ?return set(product(s1,s2)) >> ... >> >>> times(a,times(b,c)) >> {(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, >> (3, 6)), (2, (4, 6)), (1, (3, 5))} >> >>> times(times(a,b),c) >> {((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, >> 3), 5), ((2, 4), 5), ((1, 3), 5)} >> >>> >> >> So your multiplication isn't commutative (the types of the elements in >> the 2 expressions above are different). That doesn't seem intuitive - >> so maybe a*b*c should be a set of 3-tuples. But how would that work? >> The problem very quickly becomes a lot larger than you first assume. >> >> Operator overloading is used much more sparingly in Python than in, >> say, C++. It's as much a language style issue as anything else. >> >> Sorry, but I still don't see enough benefit to justify this. >> >> Paul. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From amauryfa at gmail.com Fri Oct 7 17:36:20 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Fri, 7 Oct 2011 17:36:20 +0200 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: 2011/10/7 Jakob Bowyer : > Looking at this from a Math background, it seems that it would be nice for > the set type to support multiplication. This would allow for the > multiplication of sets to produce?Cartesian?products giving every single > permutation. 
> (assuming python3)
>
> a = set(["amy", "martin"])
> b = set(["smith", "jones", "john"])
> c = a * b
> print(c)
> set([('john', 'jones'),
>      ('john', 'martin'),
>      ('jones', 'john'),
>      ('martin', 'amy'),
>      ....])

Is your example correct? It does not look like a cartesian product to me.

and what about writing it this way:
{(x,y) for x in a for y in b}

-- 
Amaury Forgeot d'Arc

From ironfroggy at gmail.com Fri Oct 7 17:55:32 2011
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Fri, 7 Oct 2011 11:55:32 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer wrote:
>
> There is that but from a math point of view the syntax a * b does make sence.
> Its slightly clearer and makes more sense to **people from outside of a programming background**.

(emphasis added)

These are not the only people writing and reading code, and decisions about syntax should favor improving the readability for coders across the board, not simply a single subset of them.

> On Fri, Oct 7, 2011 at 11:35 AM, Paul Moore wrote:
>>
>> On 7 October 2011 11:24, Jakob Bowyer wrote:
>> > Looking at this from a Math background, it seems that it would be nice for
>> > the set type to support multiplication. This would allow for the
>> > multiplication of sets to produce Cartesian products giving every single
>> > permutation. This would make set usage more intuitive for example;
>> > (assuming python3)
>> >
>>
>> itertools.product does what you want already.
>>
>> >>> a = set((1,2,3))
>> >>> b = set((4,5,6))
>> >>> set(itertools.product(a,b))
>> {(2, 6), (1, 4), (1, 5), (1, 6), (3, 6), (2, 5), (3, 4), (2, 4), (3, 5)}
>>
>> I don't think an operator form is sufficiently valuable when the
>> functionality is available and clear enough already.
>>
>> Paul.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From mwm at mired.org Fri Oct 7 18:08:05 2011
From: mwm at mired.org (Mike Meyer)
Date: Fri, 7 Oct 2011 09:08:05 -0700
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On Fri, Oct 7, 2011 at 8:55 AM, Calvin Spealman wrote:
> On Fri, Oct 7, 2011 at 6:37 AM, Jakob Bowyer wrote:
> >
> > There is that but from a math point of view the syntax a * b does make sence.
> > Its slightly clearer and makes more sense to **people from outside of a programming background**.
>
> (emphasis added)
>
> These are not the only people writing and reading code, and decisions
> about syntax should favor improving the readability
> for coders across the board, not simply a single subset of them.

True. But unless there's another common meaning for multiplying sets, there are only two groups of people to consider: Those who know it as the cross product, and those who have no idea what it might mean. The former will be surprised by the current situation when it doesn't work, the latter will have to look it up when they run into it. It's not really any worse than using + for string concatenation. Except for the associativity issue.
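For what it's worth, the naive overload being discussed is easy to sketch, and the sketch makes the associativity problem concrete (ProductSet below is purely hypothetical, not anyone's proposed API):

from itertools import product

class ProductSet(set):
    """Hypothetical set subclass: '*' as Cartesian product (sketch only)."""
    def __mul__(self, other):
        return ProductSet(product(self, other))

a = ProductSet({1, 2})
b = ProductSet({3, 4})
c = ProductSet({5, 6})

# (a * b) * c yields pairs whose first element is itself a pair, while
# a * (b * c) nests on the right instead -- the two results are unequal sets.
print((a * b) * c)   # elements look like ((1, 3), 5)
print(a * (b * c))   # elements look like (1, (3, 5))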
From mikegraham at gmail.com Fri Oct 7 19:08:12 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Fri, 7 Oct 2011 13:08:12 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On Fri, Oct 7, 2011 at 6:24 AM, Jakob Bowyer wrote:
> Looking at this from a Math background, it seems that it would be nice for
> the set type to support multiplication. This would allow for the
> multiplication of sets to produce Cartesian products giving every single
> permutation. This would make set usage more intuitive for example;
> (assuming python3)
>
> a = set(["amy", "martin"])
> b = set(["smith", "jones", "john"])
> c = a * b
> print(c)
>
> set([('john', 'jones'),
>      ('john', 'martin'),
>      ('jones', 'john'),
>      ('martin', 'amy'),
>      ....])
>
> This could be really easily achieved by giving a __mul__ method for sets.
> Currently trying to multiply sets gives a TypeError. Anyone got any views on
> this? Or am I barking up the wrong tree and saying something stupid.

This idea might be aesthetically pleasing from a mathematical viewpoint, but it does not help people write better programs. It does not provide anything better than the status quo. In fact, it adds an obscure behavior that needs to be maintained, taught, and understood, making Python ever-so-slightly worse.

-1

Mike
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From turnbull at sk.tsukuba.ac.jp Fri Oct 7 19:15:44 2011
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sat, 08 Oct 2011 02:15:44 +0900
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>

Paul Moore writes:

 > So your multiplication isn't commutative (the types of the elements in
 > the 2 expressions above are different).

No, set multiplication is not going to be commutative: {'a'} X {1} is quite different from {1} X {'a'}. The problem you're pointing out is much worse: the obvious implementation of Cartesian product isn't even associative.

-1 on an operator for this.

From raymond.hettinger at gmail.com Fri Oct 7 19:59:56 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 7 Oct 2011 13:59:56 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On Oct 7, 2011, at 6:24 AM, Jakob Bowyer wrote:
> Looking at this from a Math background, it seems that it would be nice for the set type to support multiplication. This would allow for the multiplication of sets to produce Cartesian products giving every single permutation.

-1

We already have multiple ways to do it (set comprehensions, itertools.product, ...). Also, it's much nicer to have an iterator than to fill memory with lots of little sets.

Also, it is unclear what s*s*s should do. Probably, the user would expect {(a,a,a), (a,a,b), ..} but the way you've proposed it, they would get {((a,a),a), ((a,a),b), ...} and have an unpleasant surprise.

Raymond

From p.f.moore at gmail.com Fri Oct 7 20:13:17 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 7 Oct 2011 19:13:17 +0100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On 7 October 2011 18:15, Stephen J. Turnbull wrote:
> Paul Moore writes:
>
>  > So your multiplication isn't commutative (the types of the elements in
>  > the 2 expressions above are different).
>
> No, set multiplication is not going to be commutative: {'a'} X {1} is
> quite different from {1} X {'a'}. The problem you're pointing out is
> much worse: the obvious implementation of Cartesian product isn't even
> associative.

Bah. For all my claims of having a mathematical background (OK, so it was 30 years ago :-)) I can't even remember the difference between associativity and commutativity.

I'm going for a lie down now... :-)
Paul.

From jimjjewett at gmail.com Fri Oct 7 21:18:51 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 7 Oct 2011 15:18:51 -0400
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info>
Message-ID:

On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:
> On 10/6/2011 4:32 PM, Jim Jewett wrote:
>> On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:
>>> On 10/4/2011 10:21 PM, Guido van Rossum wrote:

>>>> We also have str.index which raised an exception, but people dislike
>>>> writing try/except blocks.

>>> ... try/except blocks are routinely used for flow control in Python
>>> ... even advocate using them over if/else (leap first)

>> str.index is a "little" method that it is tempting to use inline, even
>> as part of a comprehension.

> That is an argument *for* raising an exception on error.

Only for something that is truly an unexpected error. Bad or missing data should not prevent the program from processing what it can.

When I want an inline catch, it always meets the following criteria:

(a) The "exception" is actually expected, at least occasionally.
(b) The exception is caused by (bad/missing/irrelevant...) input --
nothing is wrong with my computational environment.
(c) I do NOT need extra user input; I already know what to do with it.

Typically, I just filter it out, though I may replace it with a placeholder and/or echo it to another output stream instead.

(d) The algorithm SHOULD continue to process the remaining (mostly good) data.

Sometimes, the "bad" data is itself in a known format (like a "." instead of a number); but ... not always.

-jJ

From dmascialino at gmail.com Fri Oct 7 21:53:53 2011
From: dmascialino at gmail.com (Diego Mascialino)
Date: Fri, 7 Oct 2011 16:53:53 -0300
Subject: [Python-ideas] Proposal: Show an alert in traceback when the source file is newer than compiled code (issue8087)
Message-ID:

Hello, I opened this issue a long time ago, and Éric Araujo said that it should be discussed on this list.

This is my feature request message:

Example:

---------------- mod.py ----------------
def f():
    a,b,c = 1,2
    print b
----------------------------------------

If I do:

>>> import mod
>>> mod.f()

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f
    a,b,c = 1,2
ValueError: need more than 2 values to unpack

If I fix the source:

---------------- mod.py ----------------
def f():
    a,b,c = 1,2,3
    print b
----------------------------------------

And do:

>>> mod.f()

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f
    a,b,c = 1,2,3
ValueError: need more than 2 values to unpack

The problem is that the source shown is updated, but the executed code is old, because it wasn't reloaded.

Feature request:

If the source code shown was modified after import time and it wasn't reloaded, a warning message should be shown.

Example:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mod.py", line 2, in f    WARNING: Modified after import!
    a,b,c = 1,2,3
ValueError: need more than 2 values to unpack

or something like that.

Thanks,
Diego Mascialino

From jkbbwr at gmail.com Fri Oct 7 22:54:00 2011
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Fri, 7 Oct 2011 21:54:00 +0100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

Well it was fun watching the process that is python-ideas and the shooting down in flames that happens here. Can someone give me some advice about what to think about/do for the next time that an idea comes to mind?

On Fri, Oct 7, 2011 at 7:13 PM, Paul Moore wrote:
> On 7 October 2011 18:15, Stephen J. Turnbull wrote:
> > Paul Moore writes:
> >
> >  > So your multiplication isn't commutative (the types of the elements in
> >  > the 2 expressions above are different).
> >
> > No, set multiplication is not going to be commutative: {'a'} X {1} is
> > quite different from {1} X {'a'}. The problem you're pointing out is
> > much worse: the obvious implementation of Cartesian product isn't even
> > associative.
>
> Bah. For all my claims of having a mathematical background (OK, so it
> was 30 years ago :-)) I can't even remember the difference between
> associativity and commutativity.
>
> I'm going for a lie down now... :-)
> Paul.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ethan at stoneleaf.us Fri Oct 7 22:57:19 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 07 Oct 2011 13:57:19 -0700
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4E8F67AF.4070301@stoneleaf.us>

Jakob Bowyer wrote:
> Well it was fun watching the process that is python-ideas and the
> shooting down in flames that happens here. Can someone give me some
> advice about what to think about/do for the next time that an idea comes
> to mind?

Just what you did this time. As the tagline says, "In order for there to be good ideas, there must first be lots of ideas." (No, I don't remember who said it, or even whose tagline it is.)

~Ethan~

From p.f.moore at gmail.com Fri Oct 7 23:21:01 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 7 Oct 2011 22:21:01 +0100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On 7 October 2011 21:54, Jakob Bowyer wrote:
> Well it was fun watching the process that is python-ideas and the shooting
> down in flames that happens here. Can someone give me some advice about what
> to think about/do for the next time that an idea comes to mind?

About the same, but be prepared for not everyone to be as enthusiastic about your idea as you are, and be open to the possibility that some of the objections are valid...

(BTW, I thought I was offering you some insight into why ideas might not be as straightforward as the inventor thought. I apologise if it came across to you as me "shooting down in flames" your idea...)

Paul.

From jkbbwr at gmail.com Fri Oct 7 23:23:29 2011
From: jkbbwr at gmail.com (Jakob Bowyer)
Date: Fri, 7 Oct 2011 22:23:29 +0100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

Actually I like this style :) don't mistake my "shooting down in flames" to be aggressive dislike. I have taken your specific advice on board, I was asking in a more general context.
So I'm actually quite happy this worked out this way gives me a chance to learn On Fri, Oct 7, 2011 at 10:21 PM, Paul Moore wrote: > On 7 October 2011 21:54, Jakob Bowyer wrote: > > Well it was fun watching the process that is python-ideas and the > shooting > > down in flames that happens here. Can someone give me some advice about > what > > to think about/do for the next time that an idea comes to mind? > > About the same, but be prepared for not everyone to be as enthusiastic > about your idea as you are, and be open to the possibility that some > of the objections are valid... > > (BTW, I thought I was offering you some insight into why ideas might > not be as straightforward as the inventor thought. I apologise if it > came across to you as me "shooting down in flames" your idea...) > > Paul. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Oct 7 23:29:32 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Oct 2011 17:29:32 -0400 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore wrote: > On 7 October 2011 18:15, Stephen J. Turnbull wrote: >> Paul Moore writes: >> >> ?> So your multiplication isn't commutative (the types of the elements in >> ?> the 2 expressions above are different). >> >> No, set multiplication is not going to be commutative: {'a'} X {1} is >> quite different from {1} X {'a'}. ?The problem you're pointing out is >> much worse: the obvious implementation of Cartesian product isn't even >> associative. > > Bah. For all my claims of having a mathematical background (OK, so it > was 30 years ago :-)) I can't even remember the difference between > associativity and commutativity. Associative: A * (B * C) == (A * B) * C Commutative: A * B == B * A Set multiplication as the Cartesian product is neither (as others have pointed out). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mwm at mired.org Fri Oct 7 23:30:22 2011 From: mwm at mired.org (Mike Meyer) Date: Fri, 7 Oct 2011 14:30:22 -0700 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: <4E8F67AF.4070301@stoneleaf.us> References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> <4E8F67AF.4070301@stoneleaf.us> Message-ID: On Fri, Oct 7, 2011 at 1:57 PM, Ethan Furman wrote: > Jakob Bowyer wrote: > >> Well it was fun watching the process that is python-ideas and the shooting >> down in flames that happens here. Can someone give me some advice about what >> to think about/do for the next time that an idea comes to mind? >> > > Just what you did this time. As the tagline says, "In order for there to > be good ideas, there must first be lots of ideas." (No, I don't remember > who said it, or even who's tagline it is.) > Actually, that question makes me think that a meta-PEP on the process of vetting an idea before starting a real PEP might be worthwhile. In particular, that people should come up with real-world use cases and try and think of the worst abuses that might occur, etc. From greg.ewing at canterbury.ac.nz Fri Oct 7 23:33:05 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 08 Oct 2011 10:33:05 +1300 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: <4E8F7011.4060702@canterbury.ac.nz> Jakob Bowyer wrote: > There is that but from a math point of view the syntax a * b does make > sence. 
A problem with this is that it doesn't generalise smoothly to products of more than two sets. A mathematician would think of A x B x C as a set of 3-tuples, but in Python, A * B * C implemented the straightforward way would give you a set of 2-tuples, one element of which is another 2-tuple. Keeping it as a function allows products of arbitrarily many sets to be expressed naturally. -- Greg From ncoghlan at gmail.com Fri Oct 7 23:35:18 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Oct 2011 17:35:18 -0400 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: On Fri, Oct 7, 2011 at 11:28 AM, Haoyi Li wrote: > I don't think having itertools.product is a good reason for not > overloading the operator. The same argument could be said against > having listA + listB or listA * 10. After all, those can all be done > with list comprehensions and itertools aswell. > > itertools.product(a, b) or a list comprehension work fine for 2 sets, > but if you try doing that for any significant number (which i presume > the OP is), maybe 5 set operations in one expression, it quickly > becomes completely unreadable: > > itertools.product(itertools.product(seta, setb.union(setc), > setd.difference(sete)) > > vs > > seta * (setb | setc) * (setd & sete) I expect what you actually intended here was: cp = product(seta, setb|setc, setd&sete) If you actually did want two distinct cartesian products, then the two line version is significantly easier to read: cp_a = product(setb|setc, setd&sete) cp_b = product(seta, cp_a) The ambiguity of chained multiplication is more than enough to kill the idea (although it was definitely worth asking the question). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Oct 7 23:37:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Oct 2011 17:37:54 -0400 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> <4E8F67AF.4070301@stoneleaf.us> Message-ID: On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer wrote: > Actually, that question makes me think that a meta-PEP on the process of > vetting an idea before starting a real PEP might be worthwhile. In > particular, that people should come up with real-world use cases and try and > think of the worst abuses that might occur, etc. The process is basically 'ask on python-ideas if a web search doesn't show that it has already been proposed'. The next step will vary from "not going to happen" through "file a feature request on the tracker" to "write a PEP". Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mwm at mired.org Fri Oct 7 23:51:53 2011 From: mwm at mired.org (Mike Meyer) Date: Fri, 7 Oct 2011 14:51:53 -0700 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Oct 7, 2011 at 2:29 PM, Nick Coghlan wrote: > On Fri, Oct 7, 2011 at 2:13 PM, Paul Moore wrote: > > On 7 October 2011 18:15, Stephen J. Turnbull > wrote: > >> Paul Moore writes: > >> > >> > So your multiplication isn't commutative (the types of the elements > in > >> > the 2 expressions above are different). > >> > >> No, set multiplication is not going to be commutative: {'a'} X {1} is > >> quite different from {1} X {'a'}. The problem you're pointing out is > >> much worse: the obvious implementation of Cartesian product isn't even > >> associative. 
> > > > Bah. For all my claims of having a mathematical background (OK, so it > > was 30 years ago :-)) I can't even remember the difference between > > associativity and commutativity. > > Associative: A * (B * C) == (A * B) * C > Commutative: A * B == B * A > Associative is a problem. Especially because there are two reasonable interpretations of it ( (a, (b, c)) vs. (a, b, c)). Commutative, not so much. We already have a number of non-commutative operators (-, /, // and % on most types), some of which behave that way in the real world of mathematics. We also have operators that are commutative on integers but not on other things (+ on lists and string). From greg.ewing at canterbury.ac.nz Fri Oct 7 23:53:24 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 08 Oct 2011 10:53:24 +1300 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: <4E8F74D4.3000601@canterbury.ac.nz> Mike Meyer wrote: > But unless there's another common meaning for multiplying sets, While there might not be another common meaning for multiplying sets, it's not necessarily obvious that '*' applied to sets in a programming language means 'multiplication'. For example, Pascal uses '+' and '*' for set union and intersection, IIRC. -- Greg From mwm at mired.org Fri Oct 7 23:54:32 2011 From: mwm at mired.org (Mike Meyer) Date: Fri, 7 Oct 2011 14:54:32 -0700 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> <4E8F67AF.4070301@stoneleaf.us> Message-ID: On Fri, Oct 7, 2011 at 2:37 PM, Nick Coghlan wrote: > On Fri, Oct 7, 2011 at 5:30 PM, Mike Meyer wrote: > > Actually, that question makes me think that a meta-PEP on the process of > > vetting an idea before starting a real PEP might be worthwhile. In > > particular, that people should come up with real-world use cases and try > and > > think of the worst abuses that might occur, etc. > > The process is basically 'ask on python-ideas if a web search doesn't > show that it has already been proposed'. The next step will vary from > "not going to happen" through "file a feature request on the tracker" > to "write a PEP". > True, but not very useful. The idea would be to discuss what happens between posting and "the next step". People may well ask for use cases, look at abuses, do searches of the library for them (or ask you to do so), etc. From fuzzyman at gmail.com Fri Oct 7 23:54:40 2011 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 7 Oct 2011 22:54:40 +0100 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: On 7 October 2011 20:18, Jim Jewett wrote: > On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote: > > On 10/6/2011 4:32 PM, Jim Jewett wrote: > >> On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote: > >>> On 10/4/2011 10:21 PM, Guido van Rossum wrote: > > >>>> We also have str.index which raised an exception, but people dislike > >>>> writing try/except blocks. > > >>> ... try/except blocks are routinely used for flow control in Python > >>> ... even advocate using them over if/else (leap first) > > >> str.index is a "little" method that it is tempting to use inline, even > >> as part of a comprehension. > > > That is an argument *for* raising an exception on error. > > Only for something that is truly an unexpected error. Bad or missing > data should not prevent the program from processing what it can. 
> When I want an inline catch, it always meets the following criteria:
>
> (a) The "exception" is actually expected, at least occasionally.
> (b) The exception is caused by (bad/missing/irrelevant...) input --
> nothing is wrong with my computational environment.
> (c) I do NOT need extra user input; I already know what to do with it.
>
> Typically, I just filter it out, though I may replace it with a
> placeholder and/or echo it to another output stream instead.
>
> (d) The algorithm SHOULD continue to process the remaining (mostly good) data.
>
> Sometimes, the "bad" data is itself in a known format (like a "."
> instead of a number); but ... not always.

Yeah, I've quite often worked on data sets where you just need to process what you can and ignore (or replace with placeholders) what you can't.

Michael Foord

> -jJ
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From fuzzyman at gmail.com Sat Oct 8 00:59:42 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Fri, 7 Oct 2011 23:59:42 +0100
Subject: [Python-ideas] Proposal: Show an alert in traceback when the source file is newer than compiled code (issue8087)
In-Reply-To:
References:
Message-ID:

On 7 October 2011 20:53, Diego Mascialino wrote:
> Hello, I opened this issue a long time ago, and Éric Araujo said that
> it should be discussed on this list.
>
> This is my feature request message:
>
> Example:
>
> ---------------- mod.py ----------------
> def f():
>     a,b,c = 1,2
>     print b
> ----------------------------------------
>
> If I do:
>
> >>> import mod
> >>> mod.f()
>
> I get:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mod.py", line 2, in f
>     a,b,c = 1,2
> ValueError: need more than 2 values to unpack
>
> If I fix the source:
>
> ---------------- mod.py ----------------
> def f():
>     a,b,c = 1,2,3
>     print b
> ----------------------------------------
>
> And do:
>
> >>> mod.f()
>
> I get:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mod.py", line 2, in f
>     a,b,c = 1,2,3
> ValueError: need more than 2 values to unpack
>
> The problem is that the source shown is updated, but the executed code
> is old, because it wasn't reloaded.
>
> Feature request:
>
> If the source code shown was modified after import time and it wasn't
> reloaded, a warning message should be shown.
>
> Example:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "mod.py", line 2, in f    WARNING: Modified after import!
>     a,b,c = 1,2,3
> ValueError: need more than 2 values to unpack
>
> or something like that.

That would mean every function call would need to check the age of the source file and bytecode file on disk and compare!

Michael

>
> Thanks,
> Diego Mascialino
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cs at zip.com.au Sat Oct 8 01:04:25 2011
From: cs at zip.com.au (Cameron Simpson)
Date: Sat, 8 Oct 2011 10:04:25 +1100
Subject: [Python-ideas] Proposal: Show an alert in traceback when the source file is newer than compiled code (issue8087)
In-Reply-To:
References:
Message-ID: <20111007230425.GA10077@cskk.homeip.net>

On 07Oct2011 23:59, Michael Foord wrote:
| On 7 October 2011 20:53, Diego Mascialino wrote:
| > Hello, I opened this issue a long time ago, and Éric Araujo said that
| > it should be discussed on this list.
[...]
| > The problem is that the source shown is updated, but the executed code
| > is old, because it wasn't reloaded.
| >
| > Feature request:
| >
| > If the source code shown was modified after import time and it wasn't
| > reloaded, a warning message should be shown.
| >
| > Example:
| >
| > Traceback (most recent call last):
| >   File "<stdin>", line 1, in <module>
| >   File "mod.py", line 2, in f    WARNING: Modified after import!
| >     a,b,c = 1,2,3
| > ValueError: need more than 2 values to unpack
| >
| > or something like that.
|
| That would mean every function call would need to check the age of the
| source file and bytecode file on disk and compare!

Only when the traceback was being printed. You would need to stash the modtime at import for later reference.

Cheers,
-- 
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

Ed Campbell's pointers for long trips:
2. Figure out the most money you could possibly spend, and take at least double.

From steve at pearwood.info Sat Oct 8 02:23:18 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 08 Oct 2011 11:23:18 +1100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID: <4E8F97F6.6000709@pearwood.info>

Jakob Bowyer wrote:
> There is that but from a math point of view the syntax a * b does make
> sence.
> Its slightly clearer and makes more sense to people from outside of a
> programming background.

I realise that the consensus is that the lack of associativity is a fatal problem with a Cartesian product operator, but there are at least two other issues I haven't seen.

(1) "Using * for set product makes sense to mathematicians" -- maybe so, but those mathematicians already have to learn to use | instead of ∪ (union) and & instead of ∩ (intersection), so learning to use itertools.product() for Cartesian product is not a major burden for them.

(2) Cartesian product is potentially very expensive. The Cartesian product of a moderate-sized set and another moderate-sized set could turn out to be a HUGE set.

This is not a fatal objection, since other operations in Python are potentially expensive:

alist*10000000

but at least it looks expensive. You're multiplying by a big number, of course you're going to require a lot of memory. But set multiplication can very easily creep up on you:

aset*bset

will have size len(aset)*len(bset) which may be huge even if neither set on their own is. Better to keep it as a lazy iterator rather than try to generate a potentially huge set in one go.
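To put rough numbers on that, here is a quick sketch (illustration only, not a benchmark):

from itertools import product

aset = set(range(3000))
bset = set(range(3000))

# Creating the lazy iterator is cheap; no pairs exist yet.
lazy = product(aset, bset)
print(next(lazy))  # pairs are produced one at a time

# Materialising the whole product would build every pair up front:
print(len(aset) * len(bset))  # 9000000 tuples from two modest sets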
-- 
Steven

From python at mrabarnett.plus.com Sat Oct 8 02:39:33 2011
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 08 Oct 2011 01:39:33 +0100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: <4E8F97F6.6000709@pearwood.info>
References: <4E8F97F6.6000709@pearwood.info>
Message-ID: <4E8F9BC5.8080802@mrabarnett.plus.com>

On 08/10/2011 01:23, Steven D'Aprano wrote:
> Jakob Bowyer wrote:
>> There is that but from a math point of view the syntax a * b does make
>> sence.
>> Its slightly clearer and makes more sense to people from outside of a
>> programming background.
>
> I realise that the consensus is that the lack of associativity is a
> fatal problem with a Cartesian product operator, but there are at least
> two other issues I haven't seen.
>
> (1) "Using * for set product makes sense to mathematicians" -- maybe so,
> but those mathematicians already have to learn to use | instead of ∪
> (union) and & instead of ∩ (intersection), so learning to use
> itertools.product() for Cartesian product is not a major burden for them.
>
[snip]
Not to mention = and ==.

From tjreedy at udel.edu Sat Oct 8 02:57:33 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 07 Oct 2011 20:57:33 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On 10/7/2011 7:20 AM, Paul Moore wrote:
> On 7 October 2011 11:37, Jakob Bowyer wrote:
>> There is that but from a math point of view the syntax a * b does make
>> sence.
>> Its slightly clearer and makes more sense to people from outside of a
>> programming background.

Math used a different symbol, an 'X' without serifs, for cross-products. The result is a set of 'ordered pairs', which is different from a duple.

> I'm not sure I'd agree, even though I come from a maths background.
> Explicit is better than implicit and all that...
>
> Even if it is slightly clearer to some people, I bet there are others
> (not from a mathematical background) who would be confused by it. And
> in that case, itertools.product is easier to google for than "*"...)
> And that's ignoring the cost of implementing, testing, documenting the
> change.
>
> Actually, just to give a flavour of the sorts of design decisions that
> would need to be considered, consider this:
>
>>>> a = set((1,2))
>>>> b = set((3,4))
>>>> c = set((5,6))
>>>> from itertools import product
>>>> def times(s1,s2):
> ...     return set(product(s1,s2))
> ...
>>>> times(a,times(b,c))
> {(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))}
>>>> times(times(a,b),c)
> {((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)}
>>>>
>
> So your multiplication isn't commutative

It is not *associative* -- unless one special-cases tuples to add non-tuple elements to tuple elements.

> (the types of the elements in
> the 2 expressions above are different).

If the elements of A*B are sequences, then A*B is also not commutative. But it would be if the elements were sets instead of pairs.

> That doesn't seem intuitive -
> so maybe a*b*c should be a set of 3-tuples. But how would that work?

In math, *...* is a ternary operator with 3 args, like if...else in Python or ;...: in C, but it generalizes to an n-ary operator. From https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product

'''
The Cartesian product can be generalized to the n-ary Cartesian product over n sets X1, ..., Xn:

X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}.
It is a set of n-tuples. If tuples are defined as nested ordered pairs, it can be identified to (X1 × ... × Xn-1) × Xn.
'''

In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs. In Python, better to define XX explicitly. One can even write the n-fold generalization by simulating n nested for loops.

> The problem very quickly becomes a lot larger than you first assume.
>
> Operator overloading is used much more sparingly in Python than in,
> say, C++. It's as much a language style issue as anything else.
>
> Sorry, but I still don't see enough benefit to justify this.

In most situations, one really needs an iterator that produces the pairs one at a time rather than a complete collection. And when one does want a complete collection, one might often want it ordered as a list rather than unordered as a set. Itertools.product covers all these use cases. And even that does not cover the n-ary case.

For many combinatorial algorithms, one needs a 'cross-concatenation' operator defined on collections of sequences which adds the pair of sequences in each cross-product pair. The same 'x' symbol, possibly circled, has been used for that. Both are in unicode, but Python is currently restricted to a sparse set of ascii symbols.

-- 
Terry Jan Reedy

From guido at python.org Sat Oct 8 03:02:35 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 7 Oct 2011 18:02:35 -0700
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: <4E8F97F6.6000709@pearwood.info>
References: <4E8F97F6.6000709@pearwood.info>
Message-ID:

On Fri, Oct 7, 2011 at 5:23 PM, Steven D'Aprano wrote:
> (2) Cartesian product is potentially very expensive. The Cartesian product
> of a moderate-sized set and another moderate-sized set could turn out to be
> a HUGE set.
>
> This is not a fatal objection, since other operations in Python are
> potentially expensive:
>
> alist*10000000
>
> but at least it looks expensive. You're multiplying by a big number, of
> course you're going to require a lot of memory. But set multiplication can
> very easily creep up on you:
>
> aset*bset
>
> will have size len(aset)*len(bset) which may be huge even if neither set on
> their own is. Better to keep it as a lazy iterator rather than try to
> generate a potentially huge set in one go.

I'm not defending the Cartesian product proposal, but this argument is just silly. What if the first example was written

alist * n

? Does that look expensive?

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu Sat Oct 8 03:04:02 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 07 Oct 2011 21:04:02 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References:
Message-ID:

On 10/7/2011 8:57 PM, Terry Reedy wrote:
> On 10/7/2011 7:20 AM, Paul Moore wrote:
>> On 7 October 2011 11:37, Jakob Bowyer wrote:
>>> There is that but from a math point of view the syntax a * b does make
>>> sence.
>>> Its slightly clearer and makes more sense to people from outside of a
>>> programming background.
>
> Math used a different symbol, an 'X' without serifs, for cross-products.
> The result is a set of 'ordered pairs', which is different from a duple.
>
>> I'm not sure I'd agree, even though I come from a maths background.
>> Explicit is better than implicit and all that...
>>
>> Even if it is slightly clearer to some people, I bet there are others
>> (not from a mathematical background) who would be confused by it. And
>> in that case, itertools.product is easier to google for than "*"...)
>> And that's ignoring the cost of implementing, testing, documenting the
>> change.
>>
>> Actually, just to give a flavour of the sorts of design decisions that
>> would need to be considered, consider this:
>>
>>>>> a = set((1,2))
>>>>> b = set((3,4))
>>>>> c = set((5,6))
>>>>> from itertools import product
>>>>> def times(s1,s2):
>> ...     return set(product(s1,s2))
>> ...
>>>>> times(a,times(b,c))
>> {(1, (3, 6)), (2, (3, 5)), (2, (4, 5)), (1, (4, 6)), (1, (4, 5)), (2, (3, 6)), (2, (4, 6)), (1, (3, 5))}
>>>>> times(times(a,b),c)
>> {((2, 4), 6), ((1, 4), 5), ((1, 4), 6), ((2, 3), 6), ((1, 3), 6), ((2, 3), 5), ((2, 4), 5), ((1, 3), 5)}
>>>>>
>>
>> So your multiplication isn't commutative
>
> It is not *associative* -- unless one special-cases tuples to add
> non-tuple elements to tuple elements.
>
>> (the types of the elements in
>> the 2 expressions above are different).
>
> If the elements of A*B are sequences, then A*B is also not commutative.
> But it would be if the elements were sets instead of pairs.
>
>> That doesn't seem intuitive -
>> so maybe a*b*c should be a set of 3-tuples. But how would that work?
>
> In math, *...* is a ternary operator with 3 args, like if...else in
> Python or ;...: in C, but it generalizes to an n-ary operator. From
> https://secure.wikimedia.org/wikipedia/en/wiki/Cartesian_product
> '''
> The Cartesian product can be generalized to the n-ary Cartesian product
> over n sets X1, ..., Xn:
>
> X_1\times\cdots\times X_n = \{(x_1, \ldots, x_n) : x_i \in X_i \}.
>
> It is a set of n-tuples. If tuples are defined as nested ordered pairs,
> it can be identified to (X1 × ... × Xn-1) × Xn.
> '''
>
> In other words, think of aXbXc as XX(a,b,c) and similarly for more Xs.
> In Python, better to define XX explicitly. One can even write the n-fold
> generalization by simulating n nested for loops.

And itertools.product already does this.

>> The problem very quickly becomes a lot larger than you first assume.
>>
>> Operator overloading is used much more sparingly in Python than in,
>> say, C++. It's as much a language style issue as anything else.
>>
>> Sorry, but I still don't see enough benefit to justify this.

> In most situations, one really needs an iterator that produces the pairs
> one at a time rather than a complete collection. And when one does want
> a complete collection, one might often want it ordered as a list rather
> than unordered as a set. Itertools.product covers all these use cases.
> And even that does not cover the n-ary case.

And itertools.product *does* cover the n-ary case. Sorry for the apparent error.

> For many combinatorial algorithms, one needs a 'cross-concatenation'
> operator defined on collections of sequences which adds the pair of
> sequences in each cross-product pair. The same 'x' symbol, possibly
> circled, has been used for that. Both are in unicode, but Python is
> currently restricted to a sparse set of ascii symbols.
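For concreteness, that cross-concatenation can be sketched as a set comprehension (the names here are made up):

# Cross-concatenation of two collections of sequences: concatenate
# each cross-product pair instead of keeping it as a 2-tuple.
A = {(0,), (1,)}
B = {(0, 0), (1, 1)}
cross_concat = {x + y for x in A for y in B}
print(cross_concat)  # four 3-tuples: (0,0,0), (0,1,1), (1,0,0), (1,1,1)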
-- Terry Jan Reedy From tjreedy at udel.edu Sat Oct 8 03:11:49 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 07 Oct 2011 21:11:49 -0400 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: On 10/7/2011 3:18 PM, Jim Jewett wrote: > On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote: >> On 10/6/2011 4:32 PM, Jim Jewett wrote: >>> On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote: >>>> On 10/4/2011 10:21 PM, Guido van Rossum wrote: > >>>>> We also have str.index which raised an exception, but people dislike >>>>> writing try/except blocks. > >>>> ... try/except blocks are routinely used for flow control in Python >>>> ... even advocate using them over if/else (leap first) > >>> str.index is a "little" method that it is tempting to use inline, even >>> as part of a comprehension. > >> That is an argument *for* raising an exception on error. > > Only for something that is truly an unexpected error. Bad or missing > data should not prevent the program from processing what it can. There is nothing specific to finding the index of a substring in a string in the above statement. If one has a collection of strings, some of which represent ints and some not, the failure of int(item) on some of them is not unexpected. > When I want an inline catch, it always meets the following criteria: Please define 'inline catch' and show how to do it with str.find 'inline' without calling the function twice. > (a) The "exception" is actually expected, at least occasionally. > (b) The exception is caused by (bad/missing/irrelevant...) input -- > nothing is wrong with my computational environment. > (c) I do NOT need extra user input; I already know what to do with it. > > Typically, I just filter it out, though I may replace it with a > placeholder and/or echo it to another output stream instead. > > (d) The algorithm SHOULD continue to process the remaining (mostly good) data. Again, nothing specific to why finding a substring index should be rather unique in having near-duplicate functions, one of which is clearly badly designed by using an inappropriate unix/c-ism. -- Terry Jan Reedy From steve at pearwood.info Sat Oct 8 03:45:44 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 08 Oct 2011 12:45:44 +1100 Subject: [Python-ideas] Default return values to int and float In-Reply-To: References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> Message-ID: <4E8FAB48.6080500@pearwood.info> Terry Reedy wrote: > On 10/4/2011 10:21 PM, Guido van Rossum wrote: > >> But with str.find, 0 is a legitimate result, so if we were to return >> None there'd be *two* outcomes mapping to false: no match, or a match >> at the start of the string, which is no good. > > People would have to test that the result 'is None' or 'is not None'. > That is no worse than testing '== -1' or '>= 0'. I claim it is better > because continuing to compute with None as if it were a number will more > likely quickly raise an error, whereas doing so with a '-1' that looks > like a legitimate string position (the last), but does not really mean > that, might never raise an error but lead to erroneous output. (I said > 'more likely' because None is valid in slicings, same as -1.) Agreed. But... [...] >> I'm out of ideas here. But of all these, str.find is probably still >> the worst -- I've flagged bugs caused by it too many times to count. 
> So lets deprecate it for eventual removal, maybe in Py4.

Let's not.

Although I stand by my earlier claim that "raise an exception" is the Obvious Way to deal with error conditions in Python, for some reason that logic doesn't seem to apply to searching. Perhaps because "not found" is not an error, it just feels uncomfortable, to many people. Whenever I use list.index, I always find myself missing list.find.

Perhaps it is because using try...except requires more effort. It just feels wrong to write (for example):

try:
    n = s.index('spam')
except ValueError:
    pass
else:
    s = s[n:]

instead of:

n = s.find('spam')
if n >= 0:
    s = s[n:]

This is especially a factor when using the interactive interpreter.

(I also wonder about the performance hit of catching an exception vs. testing the return code. In a tight loop, catching the exceptions may be costly.)

I don't think there is any perfect solution here, but allowing people the choice between index and find seems like a good plan to me. Using -1 as the not-found sentinel seems to be a mistake though, None would have been better. That None is valid in slices is actually a point in its favour for at least two use-cases:

# extract everything after the substring (inclusive)
# or the entire string if not found
n = s.find('spam')
substr = s[n:]

# extract everything before the substring (exclusive)
# or the entire string if not found
n = s.find('spam')
substr = s[:n]

There are other cases, of course, but using None instead of -1 will generally give you an exception pretty quickly instead of silently doing the wrong thing.

-- 
Steven

From steve at pearwood.info Sat Oct 8 04:13:58 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 08 Oct 2011 13:13:58 +1100
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To:
References: <4E8F97F6.6000709@pearwood.info>
Message-ID: <4E8FB1E6.8010906@pearwood.info>

Guido van Rossum wrote:
> On Fri, Oct 7, 2011 at 5:23 PM, Steven D'Aprano wrote:
>> (2) Cartesian product is potentially very expensive. The Cartesian product
>> of a moderate-sized set and another moderate-sized set could turn out to be
>> a HUGE set.
>>
>> This is not a fatal objection, since other operations in Python are
>> potentially expensive:
>>
>> alist*10000000
>>
>> but at least it looks expensive. You're multiplying by a big number, of
>> course you're going to require a lot of memory. But set multiplication can
>> very easily creep up on you:
>>
>> aset*bset
>>
>> will have size len(aset)*len(bset) which may be huge even if neither set on
>> their own is. Better to keep it as a lazy iterator rather than try to
>> generate a potentially huge set in one go.
>
> I'm not defending the Cartesian product proposal, but this argument is
> just silly. What if the first example was written
>
> alist * n
>
> ? Does that look expensive?

I didn't say it was a *good* argument. I already acknowledged that there are expensive operations in Python, and some of them are done by operators. Perhaps I'm just over-sensitive to the risk of large Cartesian products, having locked up my desktop with a foolish list(product(seta, setb)) in exactly the circumstances I described above: both sets were moderate sizes, and it never dawned on me until my PC ground to a halt that the product would be so much bigger. (I blame myself for this error: I should know better than to carelessly pass an iterator to list without thinking, which is exactly what I did.)
In my experience, most uses of list multiplication look something like this: [0]*len(arg) which is not huge except in the extreme case that arg is already huge. But the typical use of set multiplication is surely going to be something like: arg1*arg2 which may be huge even if neither of the args are. I don't think Cartesian product is important enough, or fundamental enough, to justify making it easier to inadvertently generate a huge set by mistake. That was all I tried to say. -- Steven From stephen at xemacs.org Sat Oct 8 04:30:04 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 08 Oct 2011 11:30:04 +0900 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87ipo0qkpf.fsf@uwakimon.sk.tsukuba.ac.jp> Jakob Bowyer writes: > Well it was fun watching the process that is python-ideas and the > shooting down in flames that happens here. Can someone give me some > advice about what to think about/do for the next time that an idea > comes to mind? 0. Be familiar with the Zen (both the official "python -m this" and Apocrypha, such as "not every 3-line function needs to be a builtin"). Try to see how they apply to discussions you read even when not explicitly mentioned. 1. Do check the archives, of this list and python-dev. There are some amazingly good teachers here. 2. If you're worried that the question might stupid or "obvious to the Dutch", you might float your trial balloon on python-list (aka comp.lang.python) first. 3. Make sure you know what the earlier problems with similar ideas were. At least that way you can often manage a soft landing. :-) 4. Don't let the experience stop you from trying again. There are no stupid questions -- except the unasked ones. (But see (2); maybe there's a more appropriate venue to ask the first time.) From stephen at xemacs.org Sat Oct 8 04:58:38 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 08 Oct 2011 11:58:38 +0900 Subject: [Python-ideas] Support multiplication for sets In-Reply-To: References: Message-ID: <87hb3kqjdt.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > If the elements of A*B are sequences, then A*B is also not commutative. > But it would be if the elements were sets instead of pairs. But that's not a Cartesian product. By definition in a Cartesian product order of element components matters. I don't think I've ever seen a set product like that, and have trouble imagining applications for it unmodified (typically when squaring a set the diagonal would cause problems). > In math, *...* is a ternary operator with 3 args, like if...else in > Python or ;...: in C, but it generalizes to an n-ary operator. A better analogy is to the comma or string concatenation. I don't know if that would lead to an associative implementation, though. > In most situations, one really needs an iterator that produces the > pairs one at a time rather than a complete collection. I don't see why you couldn't have an operator on two iterables that produces an iterator. But of course comprehension notation is hard to beat for that. > And when one does want a complete collection, one might often want > it ordered as a list rather than unordered as a set. I don't understand this. Sets are unordered; any order you impose on the product would be arbitrary. So iterate the product as a set, what else might be (commonly) wanted? 
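A quick sketch of such an operator on iterables (LazyProduct is an invented name, purely for illustration):

from itertools import product

class LazyProduct:
    """Sketch: wrap an iterable so that '*' yields a lazy iterator of pairs."""
    def __init__(self, iterable):
        self.iterable = iterable
    def __mul__(self, other):
        return product(self.iterable, other.iterable)

pairs = LazyProduct({1, 2}) * LazyProduct({'a', 'b'})
print(next(pairs))  # pairs come one at a time; nothing is materialised up front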
From jss.bulk at gmail.com Sat Oct 8 05:14:44 2011
From: jss.bulk at gmail.com (Jeffrey)
Date: Fri, 07 Oct 2011 21:14:44 -0600
Subject: [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type
Message-ID: <4E8FC024.9000009@gmail.com>

I would like to suggest adding an integer presentation type for base 36 to PEP 3101. I can't imagine that it would be a whole lot more difficult than the existing types.

Python's built-in long integers provide a nice way to prototype and demonstrate cryptographic operations, especially with asymmetric cryptography. (Alice and Bob stuff.) Built-in functions provide modular reduction, modular exponentiation, and lots of nice number theory stuff that supports a variety of protocols and algorithms.

A frequent need is to represent a message by a number. Base 36 provides a way to represent all 26 letters in a semi-standard way, and simple string transformations can efficiently make zeros into spaces or vice versa. long() can already take a radix argument of 36 to parse base 36 numbers. If a base 36 presentation type (say, 'z' or something) was available, it would be possible to use base 36 numbers as a simple way to interpret a message as a number in number theoretic/asymmetric cryptographic applications.

The standard answer (see, for example, the Wikipedia Base 36 article) is to code up a quick routine to loop around and generate each digit, but:

- The kinds of numbers used in cryptographic applications are big, as in thousands of digits. Providing some sort of direct support would nicely facilitate computational efficiency as it would avoid going around an interpreted loop thousands of times for each conversion.

- All the examples I've seen in a quick Google search use the horrible anti-idiom of building up a string by repeated concatenation. Obviously, doing it efficiently is not trivial. It would be nice to get it right and be done with it.

- Yes, one could write his own routine to do it, but this would serve as an unnecessary distraction where the purpose of the code is to illustrate a cryptographic algorithm or protocol

- Yes, one could write his own module, define his own type, and override everything, but it's a fair amount of work to cover everything off (longs are a part of the language's core syntax) and even then, you still have the clutter of manually invoking constructors to make your new type instead of the built-in type. It's also kind of a trivial thing to justify promulgating a standard module for, and without a standard module, it's one more preliminary that has to be done before you can get to the main algorithm.

- "Batteries included" is kind of a nice philosophy.

Is it practical to add a base36 integer presentation type (say, 'z' or 'Z' similar to hexadecimal's 'x' or 'X') to the existing integer presentation types list in PEP 3101 (or do I need to raise a brand new PEP for this near-triviality)? It would be a very parallel thing to the hexadecimal one which is doing almost all the same things. I can't imagine it breaking anything because it defines something new that previously wasn't defined as anything useful at all.

Jeffrey
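For reference, the quick divmod routine being described can at least avoid the repeated-concatenation anti-idiom by accumulating digits in a list (a sketch only, not a proposed stdlib API):

DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

def to_base36(n):
    """Sketch: render a non-negative int in base 36 without
    building the result by repeated string concatenation."""
    if n == 0:
        return '0'
    digits = []
    while n:
        n, r = divmod(n, 36)
        digits.append(DIGITS[r])
    return ''.join(reversed(digits))

assert int(to_base36(123456789), 36) == 123456789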
From pyideas at rebertia.com Sat Oct 8 05:36:54 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Fri, 7 Oct 2011 20:36:54 -0700
Subject: [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type
In-Reply-To: <4E8FC024.9000009@gmail.com>
References: <4E8FC024.9000009@gmail.com>
Message-ID:

On Fri, Oct 7, 2011 at 8:14 PM, Jeffrey wrote:
> I would like to suggest adding an integer presentation type for base 36 to
> PEP 3101. I can't imagine that it would be a whole lot more difficult than
> the existing types.
> Is it practical to add a base36 integer presentation type (say, 'z' or 'Z'
> similar to hexadecimal's 'x' or 'X') to the existing integer presentation
> types list in PEP 3101 (or do I need to raise a brand new PEP for this
> near-triviality)? It would be a very parallel thing to the hexadecimal one
> which is doing almost all the same things.

Related past discussions, albeit focused on arbitrary-radix solutions:

http://mail.python.org/pipermail/python-ideas/2009-August/005611.html
http://mail.python.org/pipermail/python-ideas/2009-September/005727.html
http://mail.python.org/pipermail/python-dev/2006-January/059789.html
http://bugs.python.org/issue6783

Cheers,
Chris
--
http://rebertia.com

From ron3200 at gmail.com Sat Oct 8 05:52:28 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Fri, 07 Oct 2011 22:52:28 -0500
Subject: [Python-ideas] Default return values to int and float
In-Reply-To:
References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info>
Message-ID: <1318045948.8799.27.camel@Gutsy>

On Fri, 2011-10-07 at 15:18 -0400, Jim Jewett wrote:
> On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote:
> > On 10/6/2011 4:32 PM, Jim Jewett wrote:
> >> On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy wrote:
> >>> On 10/4/2011 10:21 PM, Guido van Rossum wrote:
>
> >>>> We also have str.index which raised an exception, but people dislike
> >>>> writing try/except blocks.
>
> >>> ... try/except blocks are routinely used for flow control in Python
> >>> ... even advocate using them over if/else (leap first)
>
> >> str.index is a "little" method that it is tempting to use inline, even
> >> as part of a comprehension.
>
> > That is an argument *for* raising an exception on error.
>
> Only for something that is truly an unexpected error. Bad or missing
> data should not prevent the program from processing what it can.
>
> When I want an inline catch, it always meets the following criteria:
>
> (a) The "exception" is actually expected, at least occasionally.

Sometime I feel exceptions are overly general. Ok, so I got a ValueError exception from some block of code... But is it the one I expected, or is it one from a routine in a library I imported and wasn't caught or handled correctly. (i.e. my routine called a function in another module someone else wrote.)

One answer to that is to put the try except around the fewest lines of code possible so that it doesn't catch exceptions that aren't related to some specific condition. That leads to possibly quite a few more try-except blocks, and possibly more nested try-except blocks. At some point, it may start to seem like it's a better idea to avoid them rather than use them.

What if you can catch an exception specifically from a particular function or method, but let other unexpected "like" exceptions bubble through...

try:
    ...
    i = s.index('bar')
    ...
except ValueError from s.index as exc:

In this case, the only ValueError the except will catch is one originating in s.index.
So instead of creating more exception types to handle ever increasing circumstances, we increase the ability to detect them depending on the context. So then I can put a larger block of code inside a try-except and put as many excepts after the try block to detect various exceptions of the same type (or different types), raised from possibly different sub parts within that block of code. And if need be, let them bubble out, or handle them. Just a thought... Cheers, Ron

From greg.ewing at canterbury.ac.nz Sat Oct 8 05:58:28 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 08 Oct 2011 16:58:28 +1300
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: References: Message-ID: <4E8FCA64.3080107@canterbury.ac.nz>

Nick Coghlan wrote:
> I expect what you actually intended here was:
>
>     cp = product(seta, setb|setc, setd&sete)
>
> If you actually did want two distinct cartesian products, then the two
> line version is significantly easier to read:
>
>     cp_a = product(setb|setc, setd&sete)
>     cp_b = product(seta, cp_a)

This is another area where the comprehension syntax is very useful. It lets you specify exactly what structure you want for the results in a very natural and readable way. -- Greg

From g.brandl at gmx.net Sat Oct 8 09:42:46 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 08 Oct 2011 09:42:46 +0200
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: References: Message-ID:

On 08.10.2011 02:57, Terry Reedy wrote: > On 10/7/2011 7:20 AM, Paul Moore wrote: >> On 7 October 2011 11:37, Jakob Bowyer wrote: >>> There is that but from a math point of view the syntax a * b does make >>> sense. >>> It's slightly clearer and makes more sense to people from outside of a >>> programming background. > > Math used a different symbol, an 'X' without serifs, for cross-products. > The result is a set of 'ordered pairs', which is different from a duple.

While I understand the rest of your post, this made me wonder: what is the difference between an ordered pair and a 2-tuple? Georg

From aquavitae69 at gmail.com Sat Oct 8 12:28:21 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sat, 8 Oct 2011 12:28:21 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <1318045948.8799.27.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1318045948.8799.27.camel@Gutsy>
Message-ID:

Many of these issues might be solved by providing a one line alternative to the rather unwieldy try statement:

    try:
        return function()
    except ValueError:
        return default

I can't settle on a good syntax though. Two suggestions:

    return function() except(ValueError) default

    return default if except(ValueError) else function()

The second is more readable, but seems a bit backwards, like it's handling the exception before it occurs. Is this idea worth pursuing if we can find the right syntax? On Oct 8, 2011 5:52 AM, "Ron Adam" wrote: > On Fri, 2011-10-07 at 15:18 -0400, Jim Jewett wrote: > > On Thu, Oct 6, 2011 at 6:04 PM, Terry Reedy wrote: > > > On 10/6/2011 4:32 PM, Jim Jewett wrote: > > >> On Wed, Oct 5, 2011 at 5:17 PM, Terry Reedy > > wrote: > > >>> On 10/4/2011 10:21 PM, Guido van Rossum wrote: > > > > >>>> We also have str.index which raised an exception, but people > > dislike > > >>>> writing try/except blocks. > > > > >>> ... try/except blocks are routinely used for flow control in > > Python > > >>> ...
even advocate using them over if/else (leap first) > > > > >> str.index is a "little" method that it is tempting to use inline, > > even > > >> as part of a comprehension. > > > > > That is an argument *for* raising an exception on error. > > > > Only for something that is truly an unexpected error. Bad or missing > > data should not prevent the program from processing what it can. > > > > When I want an inline catch, it always meets the following criteria: > > > > (a) The "exception" is actually expected, at least occasionally. > > Sometimes I feel exceptions are overly general. Ok, so I got a > ValueError exception from some block of code... But is it the one I > expected, or is it one from a routine in a library I imported and wasn't > caught or handled correctly? (i.e. my routine called a function in > another module someone else wrote.) > > One answer to that is to put the try except around the fewest lines of > code possible so that it doesn't catch exceptions that aren't related > to some specific condition. That leads to possibly quite a few more > try-except blocks, and possibly more nested try-except blocks. At some > point, it may start to seem like it's a better idea to avoid them rather > than use them. > > What if you can catch an exception specifically from a particular > function or method, but let other unexpected "like" exceptions bubble > through... > >
> try:
>     ...
>     i = s.index('bar')
>     ...
> except ValueError from s.index as exc:
> > > In this case, the only ValueError the except will catch is one > originating in s.index. > > So instead of creating more exception types to handle ever increasing > circumstances, we increase the ability to detect them depending on the > context. > > So then I can put a larger block of code inside a try-except and put as > many excepts after the try block to detect various exceptions of the > same type (or different types), raised from possibly different sub > parts within that block of code. And if need be, let them bubble out, > or handle them. > > Just a thought... > > Cheers, > Ron > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL:

From steve at pearwood.info Sat Oct 8 13:23:01 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 08 Oct 2011 22:23:01 +1100
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <1318045948.8799.27.camel@Gutsy> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1318045948.8799.27.camel@Gutsy>
Message-ID: <4E903295.2010800@pearwood.info>

Ron Adam wrote: > Sometimes I feel exceptions are overly general. Ok, so I got a > ValueError exception from some block of code... But is it the one I > expected, or is it one from a routine in a library I imported and wasn't > caught or handled correctly? (i.e. my routine called a function in > another module someone else wrote.)

I can't say I've very often cared where the exception comes from. But that's because I generally wrap the smallest amount of code possible in a try...except block, so there's only a limited number of places it could come from.

> One answer to that is to put the try except around the fewest lines of > code possible so that it doesn't catch exceptions that aren't related > to some specific condition.

Exactly.
> That leads to possibly quite a few more > try-except blocks, and possibly more nested try-except blocks. At some > point, it may start to seem like it's a better idea to avoid them rather > than use them.

Or refactor parts of your code into a function.

> What if you can catch an exception specifically from a particular > function or method, but let other unexpected "like" exceptions bubble > through... > >
> try:
>     ...
>     i = s.index('bar')
>     ...
> except ValueError from s.index as exc:
>

I can't imagine that this would even be *possible*, but even if it is, I would say it certainly isn't *desirable*.

(1) You're repeating code you expect to fail: you write s.index twice, even though it only gets called once.

(2) The semantics are messy and unclear. Suppose you have this:

    try:
        ...
        i = s.index(a)
        j = s.index(b) + s.index(c)
        t = s
        k = t.index(d)
        method = s.index
        l = method(e)
        ...
    except ValueError from s.index:
        ...

Which potential s.index exceptions will get caught? All of them? Some of them? Only i and j? What if you want to catch only some but not others? How will this construct apply if s or s.index is rebound, or deleted, inside the try block?

    s += "spam"
    m = s.index(f)

What about this?

    alist = ['spam', 'ham', s, 'eggs']
    results = [x.index('cheese') for x in alist]

Should your proposal catch an exception in the list comp? What if you call a function which happens to call s.index? Will that be caught?

    x = some_function(spam, ham, s, eggs)  # happens to call s.index

-- Steven

From dmascialino at gmail.com Sat Oct 8 14:37:37 2011
From: dmascialino at gmail.com (Diego Mascialino)
Date: Sat, 8 Oct 2011 09:37:37 -0300
Subject: [Python-ideas] Proposal: Show an alert in traceback when the source file is newer than compiled code (issue8087)
In-Reply-To: <20111007230425.GA10077@cskk.homeip.net> References: <20111007230425.GA10077@cskk.homeip.net>
Message-ID:

On Fri, Oct 7, 2011 at 8:04 PM, Cameron Simpson wrote: > > Only when the traceback was being printed. You would need to stash the > modtime at import for later reference.

Actually, the source file modtime is stored in the compiled file (.pyc). Import uses this information to know if it is necessary to compile the source again. In my patch, when a traceback is going to be printed, I compare the actual modtime with the modtime stored in the pyc. Regards, Diego

From solipsis at pitrou.net Sat Oct 8 15:03:36 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 8 Oct 2011 15:03:36 +0200
Subject: [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type
References: <4E8FC024.9000009@gmail.com>
Message-ID: <20111008150336.3839a98c@pitrou.net>

On Fri, 07 Oct 2011 21:14:44 -0600 Jeffrey wrote: > I would like to suggest adding an integer presentation type for base 36 > to PEP 3101. I can't imagine that it would be a whole lot more > difficult than the existing types. Python's built-in long integers > provide a nice way to prototype and demonstrate cryptographic > operations, especially with asymmetric cryptography. (Alice and Bob > stuff.) Built-in functions provide modular reduction, modular > exponentiation, and lots of nice number theory stuff that supports a > variety of protocols and algorithms. A frequent need is to represent a > message by a number. Base 36 provides a way to represent all 26 letters > in a semi-standard way, and simple string transformations can > efficiently make zeros into spaces or vice versa.

Why base 36 rather than, say, base 64 or even base 80?
From ethan at stoneleaf.us Sat Oct 8 17:44:52 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 08 Oct 2011 08:44:52 -0700
Subject: [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type
In-Reply-To: <20111008150336.3839a98c@pitrou.net> References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net>
Message-ID: <4E906FF4.6070003@stoneleaf.us>

Antoine Pitrou wrote: > On Fri, 07 Oct 2011 21:14:44 -0600 > Jeffrey wrote: >> I would like to suggest adding an integer presentation type for base 36 >> to PEP 3101. I can't imagine that it would be a whole lot more >> difficult than the existing types. Python's built-in long integers >> provide a nice way to prototype and demonstrate cryptographic >> operations, especially with asymmetric cryptography. (Alice and Bob >> stuff.) Built-in functions provide modular reduction, modular >> exponentiation, and lots of nice number theory stuff that supports a >> variety of protocols and algorithms. A frequent need is to represent a >> message by a number. Base 36 provides a way to represent all 26 letters >> in a semi-standard way, and simple string transformations can >> efficiently make zeros into spaces or vice versa. > > Why base 36 rather than, say, base 64 or even base 80?

Because there are 26 ascii letters and 10 ascii digits? ~Ethan~

From solipsis at pitrou.net Sat Oct 8 17:53:12 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 8 Oct 2011 17:53:12 +0200
Subject: [Python-ideas] PEP 3101 (Advanced string formatting) base 36 integer presentation type
References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net> <4E906FF4.6070003@stoneleaf.us>
Message-ID: <20111008175312.4f197406@pitrou.net>

On Sat, 08 Oct 2011 08:44:52 -0700 Ethan Furman wrote: > Antoine Pitrou wrote: > > On Fri, 07 Oct 2011 21:14:44 -0600 > > Jeffrey wrote: > >> I would like to suggest adding an integer presentation type for base 36 > >> to PEP 3101. I can't imagine that it would be a whole lot more > >> difficult than the existing types. Python's built-in long integers > >> provide a nice way to prototype and demonstrate cryptographic > >> operations, especially with asymmetric cryptography. (Alice and Bob > >> stuff.) Built-in functions provide modular reduction, modular > >> exponentiation, and lots of nice number theory stuff that supports a > >> variety of protocols and algorithms. A frequent need is to represent a > >> message by a number. Base 36 provides a way to represent all 26 letters > >> in a semi-standard way, and simple string transformations can > >> efficiently make zeros into spaces or vice versa. > > > > Why base 36 rather than, say, base 64 or even base 80? > > Because there are 26 ascii letters and 10 ascii digits?

That's not really answering the question. Are people used to reading and manipulating numbers in base 36? If not, why not use the most compact representation? (if you are not interested in the most compact representation, then you can simply use base 10 or 16) Regards Antoine.

From yoch.melka at gmail.com Sat Oct 8 19:48:48 2011
From: yoch.melka at gmail.com (yoch melka)
Date: Sat, 8 Oct 2011 19:48:48 +0200
Subject: [Python-ideas] back-references in comprehensions
Message-ID:

Hi, I would like to use backreferences in list comprehensions (or other comprehensions), such as:

    [[elt for elt in lst if elt] for lst in matrix if \{1}]
    # \{1} is back-reference to filter result of the
    # first comprehension ([elt for elt in lst if elt])

Would it be possible to do this?
Thanks

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jh at improva.dk Sat Oct 8 21:02:35 2011
From: jh at improva.dk (Jacob Holm)
Date: Sat, 08 Oct 2011 21:02:35 +0200
Subject: [Python-ideas] back-references in comprehensions
In-Reply-To: References: Message-ID: <4E909E4B.4030703@improva.dk>

On 2011-10-08 19:48, yoch melka wrote:
> I would like to use backreferences in list comprehensions (or other
> comprehensions), such as:
>
>     [[elt for elt in lst if elt] for lst in matrix if \{1}]
>     # \{1} is back-reference to filter result of the
>     # first comprehension ([elt for elt in lst if elt])
>
> Would it be possible to do this?

I don't think the syntax you propose is going to fly, but a similar feature has been discussed before on this list. I prefer this version:

    [[elt for elt in lst if elt] as r for lst in matrix if r]

But I don't have much hope that it will ever get in. FWIW, the example can already be written in several ways in current python, e.g.:

    1) [r for lst in matrix for r in ([elt for elt in lst if elt],) if r]
    2) filter(None, ([elt for elt in lst if elt] for lst in matrix))

HTH - Jacob

From zuo at chopin.edu.pl Sat Oct 8 22:17:13 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Sat, 8 Oct 2011 22:17:13 +0200
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <4E903295.2010800@pearwood.info> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1318045948.8799.27.camel@Gutsy> <4E903295.2010800@pearwood.info>
Message-ID: <20111008201713.GA2286@chopin.edu.pl>

Steven D'Aprano dixit (2011-10-08, 22:23): > Ron Adam wrote: [snip] > >What if you can catch an exception specifically from a particular > >function or method, but let other unexpected "like" exceptions bubble > >through... > >
> > try:
> >     ...
> >     i = s.index('bar')
> >     ...
> > except ValueError from s.index as exc:
> >
> I can't imagine that this would even be *possible*, but even if it > is, I would say it certainly isn't *desirable*. > > (1) You're repeating code you expect to fail: you write s.index > twice, even though it only gets called once. > > (2) The semantics are messy and unclear. Suppose you have this: >
> try:
>     ...
>     i = s.index(a)
>     j = s.index(b) + s.index(c)
>     t = s
>     k = t.index(d)
>     method = s.index
>     l = method(e)
>     ...
> except ValueError from s.index:
>     ...
> > Which potential s.index exceptions will get caught? All of them? Some > of them? Only i and j? What if you want to catch only some but not > others? [snip]

Maybe labeling interesting lines of code could be more suitable? E.g.:

    try:
        ...
        'risky one'
        i = s.index('bar')
        ...
    except ValueError from 'risky one' as exc:

Cheers. *j

From tjreedy at udel.edu Sun Oct 9 07:20:22 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 09 Oct 2011 01:20:22 -0400
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: References: Message-ID:

On 10/8/2011 3:42 AM, Georg Brandl wrote: > On 08.10.2011 02:57, Terry Reedy wrote: >> On 10/7/2011 7:20 AM, Paul Moore wrote: >>> On 7 October 2011 11:37, Jakob Bowyer wrote: >>>> There is that but from a math point of view the syntax a * b does make >>>> sense. >>>> It's slightly clearer and makes more sense to people from outside of a >>>> programming background. >> >> Math used a different symbol, an 'X' without serifs, for cross-products. >> The result is a set of 'ordered pairs', which is different from a duple.
> > While I understand the rest of your post, this made me wonder: what is the > > difference between an ordered pair and a 2-tuple?

I could just say that they are different but nearly synonymous words used in different fields. A more precise answer depends on the meaning chosen for 'ordered pair' and '2-tuple'. If one takes 'ordered pair' as an undefined primitive (and generic) concept, then '2-tuple' is one specialization of the concept. https://secure.wikimedia.org/wikipedia/en/wiki/Ordered_pair says "Ordered pairs are also called 2-tuples, 2-dimensional vectors, or sequences of length 2." I take that as meaning that the latter three are specializations, as 'tuple' is definitely not the same as 'vector'. If one takes 'ordered pair' as a specialized set, then they differ for a different reason. Tuple is not a subclass of set, at least not in Python. In practice, the two classes often have different interfaces. The two members of ordered pairs are the first and second. They are extracted by two different functions. Lisp cons cells with car (first) and cdr (rest) functions are an example. The two members of 2-tuples are also the 0-1 or 1-2 members and are usually extracted by indexing, which is one function taking two parameters. Python duples with 0-1 indexing are an example. -- Terry Jan Reedy

From ron3200 at gmail.com Sun Oct 9 08:31:07 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sun, 09 Oct 2011 01:31:07 -0500
Subject: [Python-ideas] Default return values to int and float
In-Reply-To: <4E903295.2010800@pearwood.info> References: <6154E94B-E55B-425F-98DD-9D1BC4DCB87D@gmail.com> <4E8B9324.5040009@pearwood.info> <1318045948.8799.27.camel@Gutsy> <4E903295.2010800@pearwood.info>
Message-ID: <1318141867.12211.100.camel@Gutsy>

On Sat, 2011-10-08 at 22:23 +1100, Steven D'Aprano wrote: > Ron Adam wrote: > > That leads to possibly quite a few more > > try-except blocks, and possibly more nested try-except blocks. At some > > point, it may start to seem like it's a better idea to avoid them rather > > than use them. > > Or refactor parts of your code into a function.

Of course refactoring a bit of code so as to be sensitive to the context would help, but it's not always as straightforward as it seems. It's not a tool you would use everywhere.

> > What if you can catch an exception specifically from a particular > > function or method, but let other unexpected "like" exceptions bubble > > through... > >
> > try:
> >     ...
> >     i = s.index('bar')
> >     ...
> > except ValueError from s.index as exc:
> >
> I can't imagine that this would even be *possible*, but even if it is, I > would say it certainly isn't *desirable*. > > (1) You're repeating code you expect to fail: you write s.index twice, > even though it only gets called once.

Right, and if all you are interested in is just that, you would just wrap that part in regular try except and not do it this way.

> (2) The semantics are messy and unclear. Suppose you have this: >
> try:
>     ...
>     i = s.index(a)
>     j = s.index(b) + s.index(c)
>     t = s
>     k = t.index(d)
>     method = s.index
>     l = method(e)
>     ...
> except ValueError from s.index:
>     ...
> > Which potential s.index exceptions will get caught? All of them? Some of > them? Only i and j? What if you want to catch only some but not others?

Any of them you put inside the try-except block. But it would not catch a ValueError caused by some other function or method in the '...' part of the example.

> How will this construct apply if s or s.index is rebound, or deleted, > inside the try block?
> > s += "spam" > m = s.index(f) The rebinding of the name doesn't matter as it is an object comparison. It may work more like... try: ... s += "spam" m = s.index(f) ... except ValueError as e: if e.__cause__ is str.index: ... raise > What about this? > > alist = ['spam', 'ham', s, 'eggs'] > results = [x.index('cheese') for x in alist] > > Should your proposal catch an exception in the list comp? Sure, why not? > What if you call a function which happens to call s.index? Will that be > caught? > > x = some_function(spam, ham, s, eggs) # happens to call s.index The str.index method would be the source of the exception unless some_function() catches it. It could raise a new exception and then it would be reported as the cause. Cheers, Ron From victor.stinner at haypocalc.com Sun Oct 9 11:50:35 2011 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 09 Oct 2011 11:50:35 +0200 Subject: [Python-ideas] [Python-Dev] PEP 3101 (Advanced string formatting) base 36 integer presentation type In-Reply-To: <4E9068EF.3050800@haypocalc.com> References: <4E8FC024.9000009@gmail.com> <20111008150336.3839a98c@pitrou.net> <4E9068EF.3050800@haypocalc.com> Message-ID: <4E916E6B.3080506@haypocalc.com> Le 08/10/2011 17:14, Victor Stinner a ?crit : > Le 08/10/2011 15:03, Antoine Pitrou a ?crit : >> On Fri, 07 Oct 2011 21:14:44 -0600 >> Jeffrey wrote: >>> I would like to suggest adding an integer presentation type for base 36 >>> to PEP 3101. I can't imagine that it would be a whole lot more >>> difficult than the existing types. Python's built-in long integers >>> provide a nice way to prototype and demonstrate cryptographic >>> operations, especially with asymmetric cryptography. (Alice and Bob >>> stuff.) Built-in functions provide modular reduction, modular >>> exponentiation, and lots of nice number theory stuff that supports a >>> variety of protocols and algorithms. A frequent need is to represent a >>> message by a number. Base 36 provides a way to represent all 26 letters >>> in a semi-standard way, and simple string transformations can >>> efficiently make zeros into spaces or vice versa. >> >> Why base 36 rather than, say, base 64 or even base 80? > > Base 85 is the most efficient base to format IPv6 addresses! > > http://tools.ietf.org/html/rfc1924 > > And Python doesn't provide builtin function for this base! > > Victor Oops, I answered to the wrong mailing list. Victor From karthick.sankarachary at gmail.com Sun Oct 9 22:25:16 2011 From: karthick.sankarachary at gmail.com (Karthick Sankarachary) Date: Sun, 9 Oct 2011 13:25:16 -0700 Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries Message-ID: Hello Python Ideas, Currently, to check whether a single key is in a dictionary, we use the "in" keyword. However, there is no built-in support for checking if a key-value pair belongs in a dictionary. Currently, we presuppose that the object being checked has the same type as that of the key. What if we allowed the "in" operator to accept a tuple that denotes a (mapped) key-value pair? Let us consider how that might work using the canonical example given in the tutorial: >>> tel = {'jack': 4098, 'sape': 4139} >>> ('jack', 4098) in tel True >>> ('jack', 4000) in tel False >>> 'jack' in tel True As you can see, the "in" operator would interpret the object as either a key or a key-value pair depending on the actual types of the object, key and value. 
If the key itself happens to be a tuple, then the key-value membership test would involve a nested tuple, whose first item is a tuple denoting the key.

Best Regards, Karthick Sankarachary

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jxo6948 at rit.edu Sun Oct 9 22:54:16 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Sun, 9 Oct 2011 16:54:16 -0400
Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries
In-Reply-To: References: Message-ID:

You can do the same check using a default argument in dict.get such as...

    >>> tel.get('jack', None) == 4098
    True
    >>> tel.get('jack', None) == 4000
    False
    >>> tel.get('jill', None) == 4000
    False

- John

On Sun, Oct 9, 2011 at 4:25 PM, Karthick Sankarachary < karthick.sankarachary at gmail.com> wrote: > Hello Python Ideas, > > Currently, to check whether a single key is in a dictionary, we use the "in" > keyword. However, there is no built-in support for checking if a key-value > pair belongs in a dictionary. > > Currently, we presuppose that the object being checked has the same type as > that of the key. What if we allowed the "in" operator to accept a tuple that > denotes a (mapped) key-value pair? > > Let us consider how that might work using the canonical example given in the > tutorial: >
> >>> tel = {'jack': 4098, 'sape': 4139}
> >>> ('jack', 4098) in tel
> True
> >>> ('jack', 4000) in tel
> False
> >>> 'jack' in tel
> True
> > As you can see, the "in" operator would interpret the object as either a key > or a key-value pair depending on the actual types of the object, key and > value. If the key itself happens to be a tuple, then the key-value > membership test would involve a nested tuple, whose first item is a tuple > denoting the key. > > Best Regards, > Karthick Sankarachary > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > >

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From karthick.sankarachary at gmail.com Sun Oct 9 22:57:16 2011
From: karthick.sankarachary at gmail.com (Karthick Sankarachary)
Date: Sun, 9 Oct 2011 13:57:16 -0700
Subject: [Python-ideas] Tuple Comprehensions
Message-ID:

Hello Python Ideas,

To quote the tutorial, "List comprehensions provide a concise way to create lists without resorting to use of map(), filter() and/or lambda.". There are a couple of constraints in the above definition that I feel could be overcome. For one thing, list comprehensions don't let you create immutable sequences (such as a tuple) by design. For another, it does not provide a concise alternative to the reduce() function. To address both of the above, I'd like to introduce tuple comprehensions, which would work like so:

    >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
    >>> (weapon.strip() for weapon in freshfruit)
    ('banana', 'loganberry', 'passion fruit')
    >>> import operator
    >>> (operator.concat(_, weapon.strip()) for weapon in freshfruit)
    'bananaloganberrypassion fruit'

As you can see, we use parentheses instead of square brackets around the comprehension. In the first tuple comprehension, we create a true tuple with multiple items. This might be a tad more efficient, not to mention less verbose, than applying the "tuple" function on top of a list comprehension. In the second tuple comprehension, we use a reduce() function (specifically operator.concat) to concatenate all of the fruit names.
In particular, we use the "_" variable (for lack of a better name) to track the running outcome of the reduce() function.

Best Regards, Karthick Sankarachary

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sven at marnach.net Sun Oct 9 23:06:32 2011
From: sven at marnach.net (Sven Marnach)
Date: Sun, 9 Oct 2011 22:06:32 +0100
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: References: Message-ID: <20111009210632.GB9230@pantoffel-wg.de>

Karthick Sankarachary wrote on Sun, 09 Oct 2011, at 13:57:16 -0700:
> >>> (weapon.strip() for weapon in freshfruit)
> ('banana', 'loganberry', 'passion fruit')

This syntax is already taken for generator expressions. You can construct a tuple from the generator expression using

    tuple(weapon.strip() for weapon in freshfruit)

Cheers, Sven

From ben+python at benfinney.id.au Sun Oct 9 23:21:19 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 10 Oct 2011 08:21:19 +1100
Subject: [Python-ideas] Tuple Comprehensions
References: Message-ID: <8762jxzws0.fsf@benfinney.id.au>

Karthick Sankarachary writes: > To address both of the above, I'd like to introduce tuple > comprehensions, which would work like so: >
> >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
> >>> (weapon.strip() for weapon in freshfruit)
> ('banana', 'loganberry', 'passion fruit')
> >>> import operator
> >>> (operator.concat(_, weapon.strip()) for weapon in freshfruit)
> 'bananaloganberrypassion fruit'
> > As you can see, we use parentheses instead of square brackets around > the comprehension.

That's already valid syntax. The parentheses are used around generator expressions in existing code. Since that's the case, why not just:

    tuple(weapon.strip() for weapon in freshfruit)

if you want the generator expression turned into a tuple?

-- \ “When people believe that they have absolute knowledge, with no | `\ test in reality, this [the Auschwitz crematorium] is how they | _o__) behave.” -- Jacob Bronowski, _The Ascent of Man_, 1973 | Ben Finney

From jh at improva.dk Sun Oct 9 23:21:21 2011
From: jh at improva.dk (Jacob Holm)
Date: Sun, 09 Oct 2011 23:21:21 +0200
Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries
In-Reply-To: References: Message-ID: <4E921051.1040606@improva.dk>

Hi

On 2011-10-09 22:25, Karthick Sankarachary wrote: > Currently, to check whether a single key is in a dictionary, we use the "in" > keyword. However, there is no built-in support for checking if a key-value > pair belongs in a dictionary.

Yes there is. You use the "viewitems()" method in 2.7 or the "items()" method in 3.x to get a set-like view on the dictionary, then test membership of that. So for your example:

    >>> tel = {'jack': 4098, 'sape': 4139}
    >>> ('jack', 4098) in tel.items()  # use viewitems() in 2.7
    True
    >>> ('jack', 4000) in tel.items()  # use viewitems() in 2.7
    False
    >>> 'jack' in tel
    True

The above code *works* in all versions of python (AFAIK), but for large dictionaries it can be quite inefficient in all versions before 3.0. From 3.0 forward this is the "one obvious way to do it", and is fast no matter the size of the dictionary (expected O(1) time).

HTH - Jacob

From sven at marnach.net Sun Oct 9 23:03:02 2011
From: sven at marnach.net (Sven Marnach)
Date: Sun, 9 Oct 2011 22:03:02 +0100
Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries
In-Reply-To: References: Message-ID: <20111009210302.GA9230@pantoffel-wg.de>

Karthick Sankarachary wrote on Sun, 09
Oct 2011, at 13:25:16 -0700:
> >>> tel = {'jack': 4098, 'sape': 4139}
> >>> ('jack', 4098) in tel
> True
> >>> ('jack', 4000) in tel
> False
> >>> 'jack' in tel
> True

These semantics are inconsistent:

    >>> d = {"a": 1, ("a", 2): 3}
    >>> ("a", 2) in d

What result do you expect now? False, because the value for "a" is 1, or True, because ("a", 2) occurs as a key? Neither answer seems to be useful. Moreover, the functionality you want is easily available as something like

    tel.get("jack") == 4098

Cheers, Sven

From tjreedy at udel.edu Mon Oct 10 00:19:04 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 09 Oct 2011 18:19:04 -0400
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: References: Message-ID:

On 10/9/2011 4:57 PM, Karthick Sankarachary wrote: > There are a couple of constraints in the above definition that I feel > could be overcome. For one thing, list comprehensions don't let you > create immutable sequences (such as a tuple) by design.

To create an 'immutable' collection, one has to have all the items present from the beginning, along with their count. If the items are generated one at a time, they must be collected in a temporary mutable internal structure. So '(' comprehension ')' syntax like

    (weapon.strip() for weapon in freshfruit)

was used to define and call an anonymous generator function, thus resulting in a generator. This is more flexible than making the result a tuple and in line with the Python 3 shift of emphasis toward iterators.

> In the first tuple comprehension, we create a true tuple with multiple > items. This might be a tad more efficient, not to mention less verbose, > than applying the "tuple" function on top of a list comprehension.

tuple(weapon.strip() for weapon in freshfruit) does little more than would have to be done anyway to produce a tuple. It is also a fairly rare need. A comprehension by its nature produces a homogeneous sequence as all items are produced from the same expression. In real use, tuples tend to be short and often heterogeneous. -- Terry Jan Reedy

From tjreedy at udel.edu Mon Oct 10 00:29:39 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 09 Oct 2011 18:29:39 -0400
Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries
In-Reply-To: References: Message-ID:

> Currently, to check whether a single key is in a dictionary, we use the > "in" keyword. However, there is no built-in support for checking if a > key-value pair belongs in a dictionary.

Python is developed according to practical need rather than theoretical consistency or symmetry. In Python 3, 'key in dict' is the same as 'key in dict.keys()'. Similarly, 'for item in dict' is the same as 'for item in dict.keys()'. One can look for or iterate over key-value items by replacing 'keys' with 'items' in either construction above. Dict keys get the special treatment of being the default because they are the most commonly used for search and iteration. -- Terry Jan Reedy

From steve at pearwood.info Mon Oct 10 03:25:23 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 10 Oct 2011 12:25:23 +1100
Subject: [Python-ideas] Testing Key-Value Membership In Dictionaries
In-Reply-To: References: Message-ID: <4E924983.8070905@pearwood.info>

Karthick Sankarachary wrote: > Hello Python Ideas, > > Currently, to check whether a single key is in a dictionary, we use the "in" > keyword. However, there is no built-in support for checking if a key-value > pair belongs in a dictionary.

Of course there is.

    (key, value) in d.items()

is explicit and simple.
Or if you prefer to avoid an O(N) search through a list, the following is only a tiny bit less convenient, but it has the advantage of being effectively O(1):

    key in d and d[key] == value

Fast, efficient, simple, obvious, explicit and avoids "magic" behaviour.

> Currently, we presuppose that the object being checked has the same type as > that of the key.

No we don't.

    d = {1: 2}
    "spam" in d

works perfectly. There's no need to presuppose that the key being tested for has the same type as the actual keys in the dict. The only thing we suppose is that the given key is hashable. Other than the assumption of hashability, there are no assumptions made about the keys. You want to change that, by assuming that keys aren't tuples.

> What if we allowed the "in" operator to accept a tuple that > denotes a (mapped) key-value pair?

Having a single function (or in this case, operator) perform actions with different semantics depending on the value of the argument is rarely a good idea, especially not for something as basic and fundamental as containment tests. Let me give you an analogy: suppose somebody proposed that `substring in string` should do something different depending on whether substring was made up of digits or not:

    "x" in "a12-50"   => returns False, simple substring test
    "2" in "a12-50"   => returns False, numeric test 2 in the range 12-50?
    "20" in "a12-50"  => returns True, because 20 is in the range 12-50

And then they propose a hack to avoid the magic test and fall back on the standard substring test:

    "^20" in "a12-50" => returns False, simple substring test
    "^20" in "a12034" => returns True

and a way to escape the magic hack:

    "^^20" in "a12034"  => returns False
    "^^20" in "a1^2034" => returns True

When you understand why this is a terrible idea, you will understand why overloading the in operator to magically decide whether you want a key test or a key/value test is also a terrible idea. The fact that in your proposal, you are checking the class of the argument, but in mine I am checking the value of the argument, is irrelevant. One of the problems with the proposal is that in the event that the argument is not a literal, you can't tell what the code will do:

    x in some_dict

With your proposal, you simply can't tell: it might test for a key, or it might test for a key/value test, and you can't tell which until runtime when x has its value. But normally you want one or the other: the two tests have different meanings: you either want to test for a key, or for a key/value. I can't think of any case where you don't care which you get. So to write a function that includes a test for a key, you are forced to write complicated code with an inconvenient type-check:

    def function(d, key):
        # I only want to test for a key
        if ((key,) in d if isinstance(key, tuple) else key in d):
            print("key detected")

To say nothing of the code that you will break -- this is a major backwards incompatible change, changing the semantics of existing code. Even if it were a good idea, it would be a bad idea. -- Steven

From grosser.meister.morti at gmx.net Mon Oct 10 04:51:01 2011
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Mon, 10 Oct 2011 04:51:01 +0200
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: References: Message-ID: <4E925D95.5040402@gmx.net>

On 10/09/2011 10:57 PM, Karthick Sankarachary wrote: > Hello Python Ideas, > > To quote the tutorial, "List comprehensions provide a concise way to create lists without resorting > to use of map(), filter() and/or lambda.".
> > There are a couple of constraints in the above definition that I feel could be overcome. For one > thing, list comprehensions don't let you create immutable sequences (such as a tuple) by design. For > another, it does not provide a concise alternative to the reduce() function. > > To address both of the above, I'd like to introduce tuple comprehensions, which would work like so: >
> >>> freshfruit = [' banana', ' loganberry', 'passion fruit']
> >>> (weapon.strip() for weapon in freshfruit)
> ('banana','loganberry','passion fruit')
> >>> import operator
> >>> (operator.concat(_, weapon.strip()) for weapon in freshfruit)
> 'bananaloganberrypassion fruit'
>

In addition to all previous remarks about the generator expression, what's wrong with the following?

    >>> import operator
    >>> xs = ['banana', 'loganberry', 'passion fruit']
    >>> reduce(operator.concat, (x.strip() for x in xs))

> > As you can see, we use parentheses instead of square brackets around the comprehension. > > In the first tuple comprehension, we create a true tuple with multiple items. This might be a tad more > efficient, not to mention less verbose, than applying the "tuple" function on top of a list > comprehension. > > In the second tuple comprehension, we use a reduce() function (specifically operator.concat) to > concatenate all of the fruit names. In particular, we use the "_" variable (for lack of a better > name) to track the running outcome of the reduce() function. > > Best Regards, > Karthick Sankarachary > >

From masklinn at masklinn.net Mon Oct 10 08:07:32 2011
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 10 Oct 2011 08:07:32 +0200
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <4E925D95.5040402@gmx.net> References: <4E925D95.5040402@gmx.net>
Message-ID: <22831DA1-CF31-4245-96CF-90FD360B33FA@masklinn.net>

On 2011-10-10, at 04:51, Mathias Panzenböck wrote:
> >> ('banana','loganberry','passion fruit')
> >>> import operator
> >>> (operator.concat(_, weapon.strip()) for weapon in freshfruit)
> >> 'bananaloganberrypassion fruit'
>
> In addition to all previous remarks about the generator expression, what's wrong with the following?
>
> >>> import operator
> >>> xs = ['banana', 'loganberry', 'passion fruit']
> >>> reduce(operator.concat, (x.strip() for x in xs))

I'm guessing the issues (from the original mail) are that it does not benefit from the "concision" ascribed to comprehension (though that concision really is mostly when there isn't a mapping and/or filtering method readily available), and that `reduce` moved to `functools` in Python 3. And of course, for this precise expression you could just do this:

    >>> ''.join(x.strip() for x in xs)

From wuwei23 at gmail.com Mon Oct 10 08:23:12 2011
From: wuwei23 at gmail.com (alex23)
Date: Sun, 9 Oct 2011 23:23:12 -0700 (PDT)
Subject: [Python-ideas] Support multiplication for sets
In-Reply-To: References: <87sjn4radb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <2169ec9e-5782-4b39-90e7-ad8813c42712@y12g2000prh.googlegroups.com>

On Oct 8, 6:54 am, Jakob Bowyer wrote: > Well it was fun watching the process that is python-ideas and the shooting > down in flames that happens here. Can someone give me some advice about what > to think about/do for the next time that an idea comes to mind?

An implementation is always helpful. It doesn't need to be complete but it will certainly help you iron out the idea and give everyone else something concrete to discuss.
If it's possible/relevant to the suggestion, putting it up as a package on PyPI is also a good way to gauge usefulness.

From ncoghlan at gmail.com Mon Oct 10 09:54:06 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 10 Oct 2011 17:54:06 +1000
Subject: [Python-ideas] Proposal: Show an alert in traceback when the source file is newer than compiled code (issue8087)
In-Reply-To: References: <20111007230425.GA10077@cskk.homeip.net>
Message-ID:

On Sat, Oct 8, 2011 at 10:37 PM, Diego Mascialino wrote: > On Fri, Oct 7, 2011 at 8:04 PM, Cameron Simpson wrote: >> >> Only when the traceback was being printed. You would need to stash the >> modtime at import for later reference. > > > Actually, the source file modtime is stored in the compiled file (.pyc). > Import uses this information to know if it is necessary to compile the source again. > > In my patch, when a traceback is going to be printed, I compare the > actual modtime > with the modtime stored in the pyc.

Having been caught by this myself on occasion, a warning in the traceback certainly sounds like a reasonable hint to provide. I posted a comment to the tracker to that effect (also noting that even with this hint, reloading still has its own problems with stale references to the previous incarnation of the module). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Mon Oct 10 23:26:15 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Oct 2011 10:26:15 +1300
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: References: Message-ID: <4E9362F7.9020105@canterbury.ac.nz>

Karthick Sankarachary wrote: > For one thing, list comprehensions don't let you > create immutable sequences (such as a tuple) by design. For another, it > does not provide a concise alternative to the reduce() function.

There was a big discussion a while back about a syntax for reduction operations. The conclusion reached was that the concept of reduction has so many degrees of freedom that, beyond the very simple cases covered by the reduce() function, nothing is clearer than simply writing out the loop explicitly. -- Greg

From karthick.sankarachary at gmail.com Tue Oct 11 01:36:06 2011
From: karthick.sankarachary at gmail.com (Karthick Sankarachary)
Date: Mon, 10 Oct 2011 16:36:06 -0700
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <4E9362F7.9020105@canterbury.ac.nz> References: <4E9362F7.9020105@canterbury.ac.nz>
Message-ID:

Hi, All,

As Masklinn pointed out, the issue is that there's no concise way to accumulate or compress an iterable without resorting to the use of the reduce() function. To answer Greg's point, it's not like list comprehensions cover all the cases allowed by the concept of mapping. There are cases, such as when the construction rule is too complicated, where comprehensions cannot be applied. Given that, why not explore whether and how we can make (simple) reductions more readable? To me, the tricky part was resolving the ambiguity between a map and reduce expression in the comprehension. Given that the first argument to the reduce() function is the accumulated value, which is something that the map() function doesn't need, I felt that we could use that to make the distinction between the two in the comprehension. In particular, if the expression in the comprehension refers to the accumulated value (through the "_" variable), then it could treat it as a reduction as opposed to a mapping.
Here's another example to illustrate the desired behavior, which basically adds a list of numbers (without resorting to the use of a reduce()):

    >>> numbers = [1, 2, 3, 4, 5]
    >>> [_+number for number in numbers]
    15

Just to clarify, we're just talking about some syntactic sugar here, nothing more, nothing less.

Best Regards, Karthick Sankarachary

On Mon, Oct 10, 2011 at 2:26 PM, Greg Ewing wrote: > Karthick Sankarachary wrote: > >> For one thing, list comprehensions don't let you create immutable >> sequences (such as a tuple) by design. For another, it does not provide a >> concise alternative to the reduce() function. >> > > There was a big discussion a while back about a syntax for > reduction operations. The conclusion reached was that the > concept of reduction has so many degrees of freedom that, > beyond the very simple cases covered by the reduce() function, > nothing is clearer than simply writing out the loop explicitly. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stayvoid at gmail.com Tue Oct 11 01:48:15 2011
From: stayvoid at gmail.com (Stayvoid)
Date: Tue, 11 Oct 2011 03:48:15 +0400
Subject: [Python-ideas] Help mode improvement.
Message-ID:

Hi there!

I want to make an improvement connected with the interactive help mode.

Example: You want to check some keys in the dictionary, but you can't remember the syntax of the command. If you type something like this: help(key), the interpreter will output an error, because help(key) is just a plain expression, and it tries to evaluate key first before even calling help(). Maybe help(*key*) could make it work? In my opinion it will be very helpful for newcomers if the interpreter could search for similar commands and output them all.

Code listing:

    help(*key*)
    -
    [1] D.has_key()
    [2] D.keys()
    ...

To learn more you should type the number. I think the game is worth the candle if the suggested feature could work as fast as the current helper. Kind regards.

From cmjohnson.mailinglist at gmail.com Tue Oct 11 02:42:03 2011
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Mon, 10 Oct 2011 14:42:03 -1000
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: References: <4E9362F7.9020105@canterbury.ac.nz>
Message-ID: <4B756925-F59A-4A46-83FC-445D708709B5@gmail.com>

On Oct 10, 2011, at 1:36 PM, Karthick Sankarachary wrote: > Hi, All, > > As Masklinn pointed out, the issue is that there's no concise way to accumulate or compress an iterable without resorting to the use of the reduce() function. > > To answer Greg's point, it's not like list comprehensions cover all the cases allowed by the concept of mapping. There are cases, such as when the construction rule is too complicated, where comprehensions cannot be applied. Given that, why not explore whether and how we can make (simple) reductions more readable. [snip] > Here's another example to illustrate the desired behavior, which basically adds a list of numbers (without resorting to the use of a reduce()): >
> >>> numbers = [1, 2, 3, 4, 5]
> >>> [_+number for number in numbers]
> 15

Here's an even more readable version:

> >>> numbers = [1, 2, 3, 4, 5]
> >>> total = sum(numbers)

I think the simple reductions (addition, concatenation) are already well covered by existing Python syntax. There's no need to try to make sugar for anything more complicated than those two. Use a loop!
-- Carl

From steve at pearwood.info Tue Oct 11 03:39:31 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 11 Oct 2011 12:39:31 +1100
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: References: Message-ID: <4E939E53.6060702@pearwood.info>

Stayvoid wrote: > Hi there! > > I want to make an improvement connected with the interactive help mode. > > Example: > You want to check some keys in the dictionary, but you can't remember > the syntax of the command. > > If you type something like this: help(key), the interpreter will > output an error, because help(key) is just a plain > expression, and it tries to evaluate key first before even calling > help(). Maybe help(*key*) could make it work?

Remember that the help() function is just a regular function, like any other. It isn't magic, and it can't accept anything that isn't a valid Python object. That won't change. help(math) won't work unless you have imported the math module first. help(key) won't work unless you actually have a name `key`, in which case it will work but it may not do what you are expecting. help(*key*) can't work because *key* is a SyntaxError. However, help('key') would work. It currently says:

    >>> help('key')
    no Python documentation found for 'key'

    >>>

Perhaps help should be more aggressive at trying to find something useful before giving up. Or you could just encourage the beginner to use help interactively, by calling help() with no arguments, then following the prompts.

> In my opinion it will be very helpful for newcomers if the interpreter > could search for similar commands and output them all.

A keyword search facility, familiar to Linux users as `man -k key` or `apropos key`, might be useful. Or it might also complicate the interactive help for no good reason. Beware of trying to make help() do too much. Remember also that help() is not a substitute for the Python documentation and tutorials. It is not aimed at teaching beginners the basics. You actually need a reasonably good grasp of Python skills to get the most from the interactive help. -- Steven

From ubershmekel at gmail.com Tue Oct 11 03:51:49 2011
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Mon, 10 Oct 2011 21:51:49 -0400
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: References: Message-ID:

On Oct 10, 2011 7:48 PM, "Stayvoid" wrote: > > Hi there! > > I want to make an improvement connected with the interactive help mode. > > Example: > You want to check some keys in the dictionary, but you can't remember > the syntax of the command. > > If you type something like this: help(key), the interpreter will > output an error, because help(key) is just a plain > expression, and it tries to evaluate key first before even calling > help(). Maybe help(*key*) could make it work? > > In my opinion it will be very helpful for newcomers if the interpreter > could search for similar commands and output them all. > > Code listing: > help(*key*) > - > [1] D.has_key() > [2] D.keys() > ... > > To learn more you should type the number. > > I think the game is worth the candle if the suggested feature could > work as fast as the current helper. > > Kind regards. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas

I think a lot of things can be improved in help() but I wouldn't change the Python language for that. Once in help(), that kind of builtin/types/modules/methods/docs searching should be more easily accessible.
Right now you have to specifically know what you're looking for to find most things.

--Yuval

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ben+python at benfinney.id.au Tue Oct 11 06:03:57 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 11 Oct 2011 15:03:57 +1100
Subject: [Python-ideas] Help mode improvement.
References: Message-ID: <87k48cxjgy.fsf@benfinney.id.au>

Stayvoid writes: > I want to make an improvement connected with the interactive help mode. > > Example: > You want to check some keys in the dictionary, but you can't remember > the syntax of the command.

In those cases, the obvious thing to do is to get help on the type, with "help(dict)". Your output pager has a search command; if it doesn't, get a better output pager.

-- \ “I went to the museum where they had all the heads and arms | `\ from the statues that are in all the other museums.” | _o__) -- Steven Wright | Ben Finney

From mwm at mired.org Tue Oct 11 06:06:04 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 10 Oct 2011 21:06:04 -0700
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: <4E939E53.6060702@pearwood.info> References: <4E939E53.6060702@pearwood.info>
Message-ID: <20111010210604.0b234bbc@bhuda.mired.org>

On Tue, 11 Oct 2011 12:39:31 +1100 Steven D'Aprano wrote: > Stayvoid wrote: > > Hi there! > > > > I want to make an improvement connected with the interactive help mode. > > > > Example: > > You want to check some keys in the dictionary, but you can't remember > > the syntax of the command. > > > > If you type something like this: help(key), the interpreter will > > output an error, because help(key) is just a plain > > expression, and it tries to evaluate key first before even calling > > help(). Maybe help(*key*) could make it work?
> >>> help('key')
> no Python documentation found for 'key'
>
> >>>
> > Perhaps help should be more aggressive at trying to find something > useful before giving up.

Having help do a search if passed a string would qualify as "more aggressive", would be useful, and wouldn't break existing usage. It should be easy to tweak the help function so that if it's passed an instance of a basestring, it compiles that into a regular expression and uses that to try and find things. The question is - what should it be searching?
top-level names in sys.modules? Docstrings in sys.modules? Something else entirely? Is there anything that can be searched here that would both be more useful than the existing interactive facility and not take an inordinate amount of time to search?

http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

From tjreedy at udel.edu Tue Oct 11 06:35:27 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 11 Oct 2011 00:35:27 -0400
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: <20111010210604.0b234bbc@bhuda.mired.org> References: <4E939E53.6060702@pearwood.info> <20111010210604.0b234bbc@bhuda.mired.org>
Message-ID:

On 10/11/2011 12:06 AM, Mike Meyer wrote: > On Tue, 11 Oct 2011 12:39:31 +1100 > Steven D'Aprano wrote: >> Perhaps help should be more aggressive at trying to find something >> useful before giving up. > > Having help do a search if passed a string would qualify as "more > aggressive", would be useful, and wouldn't break existing usage.

This strikes me as reinventing the wheel. We already have docs with both indexes and a search facility. I usually have them open when writing Python code. -- Terry Jan Reedy

From sven at marnach.net Tue Oct 11 13:34:27 2011
From: sven at marnach.net (Sven Marnach)
Date: Tue, 11 Oct 2011 12:34:27 +0100
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: <4E939E53.6060702@pearwood.info> References: <4E939E53.6060702@pearwood.info>
Message-ID: <20111011113427.GC9230@pantoffel-wg.de>

Steven D'Aprano wrote on Tue, 11 Oct 2011, at 12:39:31 +1100: > A keyword search facility, familiar to Linux users as `man -k key` > or `apropos key`, might be useful.

There already is "pydoc -k", also known as

    pydoc.help.listmodules(key="whatever")

And there are some special topics like

    help("DICTIONARIES")

for an introduction to dictionaries. I wonder if anybody *ever* found that documentation from the interactive prompt. There would certainly be a point in advertising this documentation somehow when entering the interactive help prompt. Cheers, Sven

From sven at marnach.net Tue Oct 11 14:23:25 2011
From: sven at marnach.net (Sven Marnach)
Date: Tue, 11 Oct 2011 13:23:25 +0100
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: <20111011113427.GC9230@pantoffel-wg.de> References: <4E939E53.6060702@pearwood.info> <20111011113427.GC9230@pantoffel-wg.de>
Message-ID: <20111011122325.GD9230@pantoffel-wg.de>

Sven Marnach wrote on Tue, 11 Oct 2011, at 12:34:27 +0100: > And there are some special topics like > > help("DICTIONARIES") > > for an introduction to dictionaries. I wonder if anybody *ever* found > that documentation from the interactive prompt. There would certainly > be a point in advertising this documentation somehow when entering the > interactive help prompt.

I just noticed that these topics *are* advertised in the interactive help prompt, and can be listed by entering "topics". Apparently I never read the introduction text carefully enough. There is also keyword-based module searching by entering

    modules [keyword]

at the help prompt. Sorry for the noise, Sven

From rwwfjchuws at snkmail.com Tue Oct 11 16:33:55 2011
From: rwwfjchuws at snkmail.com (Lucas Malor)
Date: Tue, 11 Oct 2011 14:33:55 +0000
Subject: [Python-ideas] Another idea for a switch statement
Message-ID: <31759-1318343635-271856@sneakemail.com>

Hello all. I read PEP 275 but I don't like the syntax of the examples. What do you think about something like this?
    if x
        case value1 :
            [...]
        case value2 :
            [...]
        case value3 :
            [...]
        else :
            [...]

From ron3200 at gmail.com  Tue Oct 11 16:58:35 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Tue, 11 Oct 2011 09:58:35 -0500
Subject: [Python-ideas] Help mode improvement.
In-Reply-To: <20111011122325.GD9230@pantoffel-wg.de>
References: <4E939E53.6060702@pearwood.info> <20111011113427.GC9230@pantoffel-wg.de> <20111011122325.GD9230@pantoffel-wg.de>
Message-ID: <1318345115.24055.5.camel@Gutsy>

On Tue, 2011-10-11 at 13:23 +0100, Sven Marnach wrote:
> Sven Marnach schrieb am Di, 11. Okt 2011, um 12:34:27 +0100:
> > And there are some special topics like
> >
> >     help("DICTIONARIES")
> >
> > for an introduction to dictionaries. I wonder if anybody *ever* found
> > that documentation from the interactive prompt. There would certainly
> > be a point in advertising this documentation somehow when entering the
> > interactive help prompt.
>
> I just noticed that these topics *are* advertised in the interactive
> help prompt, and can be listed by entering "topics". Apparently I
> never read the introduction text carefully enough.
>
> There is also keyword-based module searching by entering
>
>     modules [keyword]

The help function lives in the pydoc module. Try this if you are using
python 3.1 or later. It's a very nice way to look through Python's
library.

    >>> import pydoc
    >>> pydoc.browse()

Cheers,
Ron

> at the help prompt.
>
> Sorry for the noise,
> Sven

From pyideas at rebertia.com  Wed Oct 12 00:15:17 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Tue, 11 Oct 2011 15:15:17 -0700
Subject: [Python-ideas] Another idea for a switch statement
In-Reply-To: <31759-1318343635-271856@sneakemail.com>
References: <31759-1318343635-271856@sneakemail.com>
Message-ID:

On Tue, Oct 11, 2011 at 7:33 AM, Lucas Malor wrote:
> Hello all. I read PEP 275 but I don't like the syntax of examples.

I doubt PEP 275 or PEP 3103 were rejected based just on syntax grounds.

Cheers,
Chris

From haoyi.sg at gmail.com  Wed Oct 12 06:09:38 2011
From: haoyi.sg at gmail.com (Haoyi Li)
Date: Wed, 12 Oct 2011 00:09:38 -0400
Subject: [Python-ideas] Another idea for a switch statement
In-Reply-To:
References: <31759-1318343635-271856@sneakemail.com>
Message-ID:

Generally, I don't think much of switch statements as they appear in
C/C++/Java. Those sorts of things can only switch on trivial data (i.e.
ints, chars, strings in C#) and go against the OO grain of the language:
They encourage people to store special "behavioral flags" inside objects
to modify their behavior with switches, rather than using proper OO
inheritance/subclassing.

Furthermore, they don't really provide much of a syntactic advantage over
chained elifs, or using a delegate dict (which has the advantage of being
programmatically modifiable). Both of those are far more flexible and
powerful.

Unless we're going all the way to F#/Haskell/Scala style 'match'
statements (which are completely awesome), there really isn't much point
in them.

-Haoyi

On Tue, Oct 11, 2011 at 6:15 PM, Chris Rebert wrote:
> On Tue, Oct 11, 2011 at 7:33 AM, Lucas Malor wrote:
>> Hello all. I read PEP 275 but I don't like the syntax of examples.
>
> I doubt PEP 275 or PEP 3103 were rejected based just on syntax grounds.
>
> Cheers,
> Chris

From ncoghlan at gmail.com  Wed Oct 12 08:28:42 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Oct 2011 16:28:42 +1000
Subject: [Python-ideas] Another idea for a switch statement
In-Reply-To:
References: <31759-1318343635-271856@sneakemail.com>
Message-ID:

On Wed, Oct 12, 2011 at 8:15 AM, Chris Rebert wrote:
> On Tue, Oct 11, 2011 at 7:33 AM, Lucas Malor wrote:
>> Hello all. I read PEP 275 but I don't like the syntax of examples.
>
> I doubt PEP 275 or PEP 3103 were rejected based just on syntax grounds.

Indeed, they were not. PEP 3103 elaborates on the wide range of design
decisions that need to be made in crafting a switch statement for
Python. For most of them, all of the available options have some
significant downsides. Add in the fact that the desire to use a long
if/elif chain in Python often suggests a deeper architectural problem
in the relevant code (usually something like "when using Python, write
Python, not C"), and the idea of adding a new and complicated construct
to the language for such a niche use case doesn't garner much support.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From sven at marnach.net  Wed Oct 12 18:31:44 2011
From: sven at marnach.net (Sven Marnach)
Date: Wed, 12 Oct 2011 17:31:44 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
Message-ID: <20111012163144.GB6393@pantoffel-wg.de>

There are circumstances, for example in unit testing, when it might be
useful to check if two range objects describe the same range.
Currently, this can't be done using the '==' operator:

    >>> range(5) == range(5)
    False

To get a useful comparison, you would either need to realise both
range objects as lists or use a function like

    def ranges_equal(r0, r1):
        if not r0:
            return not r1
        return len(r0) == len(r1) and r0[0] == r1[0] and r0[-1] == r1[-1]

All other built-in sequence types (that is bytearray, bytes, list,
str, and tuple) define equality by "all items of the sequence are
equal". I think it would be both more consistent and more useful if
range objects would pick up the same semantics.

When implementing '==' and '!=' for range objects, it would be natural
to implement the other comparison operators, too (lexicographically,
as for all other sequence types).

This change would be backwards incompatible, but I very much doubt
there is much code out there relying on the current behaviour of
considering two ranges as unequal just because they aren't the same
object (and this code could be easily fixed by using 'is' instead of
'==').

Opinions?

-- Sven

From ron3200 at gmail.com  Wed Oct 12 18:50:23 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Wed, 12 Oct 2011 11:50:23 -0500
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012163144.GB6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de>
Message-ID: <1318438223.27708.5.camel@Gutsy>

On Wed, 2011-10-12 at 17:31 +0100, Sven Marnach wrote:
> There are circumstances, for example in unit testing, when it might be
> useful to check if two range objects describe the same range.
> Currently, this can't be done using the '==' operator:
>
>     >>> range(5) == range(5)
>     False

Would comparing the repr of each one work?
>>> r1 = range(5)
>>> r2 = range(5)
>>> r1 == r2
False
>>> repr(r1), repr(r2)
('range(0, 5)', 'range(0, 5)')
>>> repr(r1) == repr(r2)
True

Cheers,
Ron

From steve at pearwood.info  Wed Oct 12 19:33:49 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 13 Oct 2011 04:33:49 +1100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012163144.GB6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de>
Message-ID: <4E95CF7D.6000000@pearwood.info>

Sven Marnach wrote:
> There are circumstances, for example in unit testing, when it might be
> useful to check if two range objects describe the same range.
> Currently, this can't be done using the '==' operator:
>
>     >>> range(5) == range(5)
>     False
[...]
> All other built-in sequence types (that is bytearray, bytes, list,
> str, and tuple) define equality by "all items of the sequence are
> equal". I think it would be both more consistent and more useful if
> range objects would pick up the same semantics.

I can see how that would be useful and straightforward, and certainly
more useful than identity based equality. +1

> When implementing '==' and '!=' for range objects, it would be natural
> to implement the other comparison operators, too (lexicographically,
> as for all other sequence types).

I don't agree. Equality makes sense for ranges: two ranges are equal
if they have the same start, stop and step values. But order
comparisons don't have any sensible meaning: range objects are numeric
ranges, integer-valued intervals, not generic lists, and it is
meaningless to say that one range is less than or greater than another.

Which is greater?

    range(1, 1000000, 1000)
    range(1000, 10000)

The question makes no sense, and should be treated as an error, just
as it is for complex numbers.

-1 on adding order comparison operators.

Aside:

I'm astonished to see that range objects have a count method! What's
the purpose of that? Any value's count will either be 0 or 1, and a
more appropriate test would be `value in range`:

>>> 17 in range(2, 30, 3)  # like r.count(17) => 1
True
>>> 18 in range(2, 30, 3)  # like r.count(18) => 0
False

--
Steven

From guido at python.org  Wed Oct 12 19:41:09 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2011 10:41:09 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E95CF7D.6000000@pearwood.info>
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info>
Message-ID:

On Wed, Oct 12, 2011 at 10:33 AM, Steven D'Aprano wrote:
> Sven Marnach wrote:
>>
>> There are circumstances, for example in unit testing, when it might be
>> useful to check if two range objects describe the same range.
>> Currently, this can't be done using the '==' operator:
>>
>>     >>> range(5) == range(5)
>>     False
>
> [...]
>>
>> All other built-in sequence types (that is bytearray, bytes, list,
>> str, and tuple) define equality by "all items of the sequence are
>> equal". I think it would be both more consistent and more useful if
>> range objects would pick up the same semantics.
>
> I can see how that would be useful and straightforward, and certainly more
> useful than identity based equality. +1

+1

>> When implementing '==' and '!=' for range objects, it would be natural
>> to implement the other comparison operators, too (lexicographically,
>> as for all other sequence types).
>
> I don't agree. Equality makes sense for ranges: two ranges are equal if they
> have the same start, stop and step values.
> But order comparisons don't have
> any sensible meaning: range objects are numeric ranges, integer-valued
> intervals, not generic lists, and it is meaningless to say that one range is
> less than or greater than another.
>
> Which is greater?
>
>     range(1, 1000000, 1000)
>     range(1000, 10000)
>
> The question makes no sense, and should be treated as an error, just as it
> is for complex numbers.
>
> -1 on adding order comparison operators.

Agreed.

> Aside:
>
> I'm astonished to see that range objects have a count method! What's the
> purpose of that? Any value's count will either be 0 or 1, and a more
> appropriate test would be `value in range`:
>
>>>> 17 in range(2, 30, 3)  # like r.count(17) => 1
> True
>>>> 18 in range(2, 30, 3)  # like r.count(18) => 0
> False

Probably because some ABC has it.

--
--Guido van Rossum (python.org/~guido)

From g.brandl at gmx.net  Wed Oct 12 19:49:15 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 12 Oct 2011 19:49:15 +0200
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E95CF7D.6000000@pearwood.info>
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info>
Message-ID:

Am 12.10.2011 19:33, schrieb Steven D'Aprano:
> Aside:
>
> I'm astonished to see that range objects have a count method! What's the
> purpose of that? Any value's count will either be 0 or 1

A method that works on any Sequence doesn't know that.

Georg

From dickinsm at gmail.com  Wed Oct 12 19:58:25 2011
From: dickinsm at gmail.com (Mark Dickinson)
Date: Wed, 12 Oct 2011 18:58:25 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E95CF7D.6000000@pearwood.info>
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info>
Message-ID:

On Wed, Oct 12, 2011 at 6:33 PM, Steven D'Aprano wrote:
> I don't agree. Equality makes sense for ranges: two ranges are equal if they
> have the same start, stop and step values.

Hmm. I'm not sure that it's that clear cut. The other possible
definition is that two ranges are equal if they're equal as lists.
Should range(0, 10, 2) and range(0, 9, 2) be considered equal, or not?

Agreed that it makes more sense to implement equality for ranges than
the order comparisons.

Mark

From alexander.belopolsky at gmail.com  Wed Oct 12 20:04:27 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 12 Oct 2011 14:04:27 -0400
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info>
Message-ID:

On Wed, Oct 12, 2011 at 1:58 PM, Mark Dickinson wrote:
> ..
>> Should range(0, 10, 2) and range(0, 9, 2) be considered equal, or not?
>
> I was going to ask the same question. I think ranges r1 and r2 should
> be considered equal iff list(r1) == list(r2). This is slightly harder
> to implement than just naively comparing (start, stop, step) tuples,
> but the advantage is that people won't run into surprises when they
> port 2.x code where result of range() is a list.

Definitely, yes.

Georg

From bruce at leapyear.org  Wed Oct 12 20:30:17 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 12 Oct 2011 11:30:17 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012163144.GB6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de>
Message-ID:

On Wed, Oct 12, 2011 at 9:31 AM, Sven Marnach wrote:

> There are circumstances, for example in unit testing, when it might be
> useful to check if two range objects describe the same range.

Other than unit testing, what are the use cases? If I was writing a
unit test, I'd be inclined to be very explicit about what I meant:

    r1 is r2
    repr(r1) == repr(r2)
    list(r1) == list(r2)

Absent another use case, -1

--- Bruce
w00t! Gruyere security codelab graduated from Google Labs!
http://j.mp/googlelabs-gruyere
Not too late to give it a 5-star rating if you like it. :-)

From sven at marnach.net  Wed Oct 12 21:36:50 2011
From: sven at marnach.net (Sven Marnach)
Date: Wed, 12 Oct 2011 20:36:50 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E95CF7D.6000000@pearwood.info>
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info>
Message-ID: <20111012193650.GD6393@pantoffel-wg.de>

Steven D'Aprano schrieb am Do, 13. Okt 2011, um 04:33:49 +1100:
> >When implementing '==' and '!=' for range objects, it would be natural
> >to implement the other comparison operators, too (lexicographically,
> >as for all other sequence types).
>
> I don't agree. Equality makes sense for ranges: two ranges are equal
> if they have the same start, stop and step values.

No, two ranges should be equal if they represent the same sequence,
i.e. if they compare equal when converted to a list:

    range(0) == range(4, 4, 4)
    range(5, 10, 3) == range(5, 11, 3)
    range(3, 6, 3) == range(3, 4)

> But order
> comparisons don't have any sensible meaning: range objects are
> numeric ranges, integer-valued intervals, not generic lists, and it
> is meaningless to say that one range is less than or greater than
> another.

Well, it's meaningless unless you define what it means. Range objects
are equal if they compare equal after converting to a list. You could
define '<' or '>' the same way. All built-in sequence types support
lexicographical comparison, so I thought it would be natural to bring
the only one that behaves differently in line. (Special cases aren't
special enough...)

This is just to explain my thoughts, I don't have a strong opinion on
this one.

I'll try and prepare a patch for '==' and '!=' and add it to the issue
tracker.

Cheers,
Sven

From sven at marnach.net  Wed Oct 12 22:33:26 2011
From: sven at marnach.net (Sven Marnach)
Date: Wed, 12 Oct 2011 21:33:26 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de>
Message-ID: <20111012203326.GE6393@pantoffel-wg.de>

Bruce Leban schrieb am Mi, 12.
Okt 2011, um 11:30:17 -0700:
> Other than unit testing, what are the use cases? If I was writing a unit
> test, I'd be inclined to be very explicit about what I meant:
>
>     r1 is r2
>     repr(r1) == repr(r2)
>     list(r1) == list(r2)

Even with a useful '==' operator defined, you could still use 'r1 ==
r2' or 'r1 is r2', depending on the intended semantics, just as with
every other data type. You just wouldn't need to expand the range to
a list.

Comparing the representations doesn't ever seem useful, though. The
only way to access the original start, stop and step values is by
parsing the representation, and these values don't affect the
behaviour of the range object in any other way. Moreover, they might
even change implicitly:

    >>> range(5, 10, 3)
    range(5, 10, 3)
    >>> range(5, 10, 3)[:]
    range(5, 11, 3)

I can't imagine any situation in which I would like to consider the
above two ranges different.

Cheers,
Sven

From sven at marnach.net  Wed Oct 12 22:38:11 2011
From: sven at marnach.net (Sven Marnach)
Date: Wed, 12 Oct 2011 21:38:11 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012203326.GE6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de>
Message-ID: <20111012203811.GF6393@pantoffel-wg.de>

Sven Marnach schrieb am Mi, 12. Okt 2011, um 21:33:26 +0100:
> The only way to access the original start, stop and step values is
> by parsing the representation, and these values don't affect the
> behaviour of the range object in any other way.

start, stop and step of course *do* affect the behaviour of the range
object. What I meant is that the only way to tell the difference
between two range objects defining the same sequence but created with
different values of start, stop and step is by looking at the
representation.

-- Sven

From bruce at leapyear.org  Wed Oct 12 22:55:21 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 12 Oct 2011 13:55:21 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012203326.GE6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de>
Message-ID:

On Wed, Oct 12, 2011 at 1:33 PM, Sven Marnach wrote:

> Comparing the representations doesn't ever seem useful, though. The
> only way to access the original start, stop and step values is by
> parsing the representation, and these values don't affect the
> behaviour of the range object in any other way. Moreover, they might
> even change implicitly:
>
>     >>> range(5, 10, 3)
>     range(5, 10, 3)
>     >>> range(5, 10, 3)[:]
>     range(5, 11, 3)
>
> I can't imagine any situation in which I would like to consider the
> above two ranges different.

    def test_copy_range(self):
        """Make sure that every time we call copy_range we get a new
        identical copy of the range."""
        a = range(5, 10, 3)
        b = copy_range(a)
        c = copy_range(a)
        self.assertTrue(a is not b)
        self.assertTrue(a is not c)
        self.assertTrue(b is not c)
        self.assertTrue(repr(a) == repr(b))
        self.assertTrue(repr(a) == repr(c))

Anyway, my thought is that if you think this change should be made, it
would be helpful to have a use case other than unit tests, as for those
purposes explicit list() or repr() is more clear and performance is not
typically an issue. Why would you normally be comparing ranges at all?

--- Bruce
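To make the unit-testing use case concrete, this is roughly what such a
test looks like under the identity-comparison semantics being discussed;
a minimal runnable sketch, where make_evens is a hypothetical function
under test:

    import unittest

    def make_evens(n):
        # Hypothetical function under test.
        return range(0, 2 * n, 2)

    class TestMakeEvens(unittest.TestCase):
        def test_values(self):
            r = make_evens(5)
            # In Python 3.2, r == range(0, 10, 2) is False even though
            # the sequences match (identity comparison), so the test
            # has to spell out the intended semantics via list().
            self.assertEqual(list(r), [0, 2, 4, 6, 8])

    if __name__ == "__main__":
        unittest.main()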
From ethan at stoneleaf.us  Wed Oct 12 23:12:28 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 12 Oct 2011 14:12:28 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012203326.GE6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de>
Message-ID: <4E9602BC.40206@stoneleaf.us>

Sven Marnach wrote:
> Bruce Leban schrieb am Mi, 12. Okt 2011, um 11:30:17 -0700:
>> Other than unit testing, what are the use cases? If I was writing a unit
>> test, I'd be inclined to be very explicit about what I meant:
>>
>>     r1 is r2
>>     repr(r1) == repr(r2)
>>     list(r1) == list(r2)
>
> Even with a useful '==' operator defined, you could still use 'r1 ==
> r2' or 'r1 is r2', depending on the intended semantics, just as with
> every other data type. You just wouldn't need to expand the range to
> a list.
>
> Comparing the representations doesn't ever seem useful, though.

Agreed -- comparing repr()s seems like a horrible way to do it.

As far as comparing for equality, there's an excellent answer on
StackOverflow -- http://stackoverflow.com/questions/7740796

def ranges_equal(a, b):
  return len(a)==len(b) and (len(a)==0 or a[0]==b[0] and a[-1]==b[-1])

~Ethan~

From guido at python.org  Thu Oct 13 00:38:38 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2011 15:38:38 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E9602BC.40206@stoneleaf.us>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

I beg to differ with all those who want range(0, 10, 2) == range(0,
11, 2). After all the repr() shows the end point that was requested,
not the end point after "normalization" (or whatever you'd call it) so
the three parameters that went in should be considered state.

OTOH range(10) == range(0, 10) == range(0, 10, 1).

--
--Guido van Rossum (python.org/~guido)

From p.f.moore at gmail.com  Thu Oct 13 00:48:47 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 12 Oct 2011 23:48:47 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E9602BC.40206@stoneleaf.us>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

On 12 October 2011 22:12, Ethan Furman wrote:
> Agreed -- comparing repr()s seems like a horrible way to do it.
>
> As far as comparing for equality, there's an excellent answer on
> StackOverflow -- http://stackoverflow.com/questions/7740796
>
> def ranges_equal(a, b):
>  return len(a)==len(b) and (len(a)==0 or a[0]==b[0] and a[-1]==b[-1])

While I'm agnostic on the question of whether range(0,9,2) and
range(0,10,2) are the same, I'd point out that ranges_equal is
straightforward to write and says they are equal. But if you're in the
camp of saying they are not equal, you appear to have no way of
determining that *except* by comparing reprs, as range objects don't
seem to expose their start, step and end values as attributes - unless
I've missed something.
>>> r = range(0,10,2)
>>> dir(r)
['__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
'__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__',
'__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__',
'__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
'__reversed__', '__setattr__', '__sizeof__', '__str__',
'__subclasshook__', 'count', 'index']

Rather than worrying about supporting equality operators on ranges,
I'd suggest exposing the start, step and end attributes and then
leaving people who want them to roll their own equality functions.

Paul.

From ethan at stoneleaf.us  Thu Oct 13 00:57:53 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 12 Oct 2011 15:57:53 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID: <4E961B71.70707@stoneleaf.us>

Guido van Rossum wrote:
> I beg to differ with all those who want range(0, 10, 2) == range(0,
> 11, 2).

I think practicality should beat purity here -- if the same results
will be generated, then the ranges are the same and should be equal,
no matter which exact parameters were used to create them.

> After all the repr() shows the end point that was requested,
> not the end point after "normalization" (or whatever you'd call it) so
> the three parameters that went in should be considered state.
>
> OTOH range(10) == range(0, 10) == range(0, 10, 1).

Exactly.

~Ethan~

From guido at python.org  Thu Oct 13 02:19:31 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2011 17:19:31 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <4E961B71.70707@stoneleaf.us>
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> <4E961B71.70707@stoneleaf.us>
Message-ID:

On Wed, Oct 12, 2011 at 3:57 PM, Ethan Furman wrote:
> Guido van Rossum wrote:
>>
>> I beg to differ with all those who want range(0, 10, 2) == range(0,
>> 11, 2).
>
> I think practicality should beat purity here -- if the same results will be
> generated, then the ranges are the same and should be equal, no matter which
> exact parameters were used to create them.

Then we'd be forever stuck with not exporting the start/stop/step
values. I'd much rather export those (like the slice() object does).
(IOW I find the lack of exported start/stop/step values an omission,
not a feature, and would like to fix that too.)

>> After all the repr() shows the end point that was requested,
>> not the end point after "normalization" (or whatever you'd call it) so
>> the three parameters that went in should be considered state.
>>
>> OTOH range(10) == range(0, 10) == range(0, 10, 1).
>
> Exactly.

Because their repr() is the same: "range(0, 10)", thus proving that
the internal state is the same. For range objects, I believe that the
internal state represents what they really "mean", and the sequence
of values generated by iterating merely follows.

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Oct 13 02:22:02 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 10:22:02 +1000
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
Message-ID:

After some interesting conversations at PyCodeConf, I'm killing PEP
3150 (Statement Local Namespaces).
It's too big, too unwieldy, too confusing and too hard to implement to
ever be a good idea.

PEP 403 is a far simpler idea, that looks to decorators (and Ruby
blocks) for inspiration. It's still a far from perfect idea, but it
has a lot more going for it than PEP 3150 ever did.

The basic question to ask yourself is this: What if we had a syntax
that allowed us to replace the final "name = obj" step that is
implicit in function and class definitions with an alternative
statement, and had a symbol that allowed us to refer to the function
or class in that statement without having to repeat the name?

The new PEP is included below and is also available online:
http://www.python.org/dev/peps/pep-0403/

I would *love* for people to dig through their callback based code
(and any other examples of "single use" functions and classes) to see
if this idea would help them.

Cheers,
Nick.

PEP: 403
Title: Statement local classes and functions
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan
Status: Deferred
Type: Standards Track
Content-Type: text/x-rst
Created: 2011-10-13
Python-Version: 3.x
Post-History: 2011-10-13
Resolution: TBD

Abstract
========

This PEP proposes the addition of ':' as a new class and function
prefix syntax (analogous to decorators) that permits a statement local
function or class definition to be appended to any Python statement
that currently does not have an associated suite.

In addition, the new syntax would allow the '@' symbol to be used to
refer to the statement local function or class without needing to
repeat the name.

When the ':' prefix syntax is used, the associated statement would be
executed *instead of* the normal local name binding currently implicit
in function and class definitions.

This PEP is based heavily on many of the ideas in PEP 3150 (Statement
Local Namespaces) so some elements of the rationale will be familiar
to readers of that PEP. That PEP has now been withdrawn in favour of
this one.

PEP Deferral
============

Like PEP 3150, this PEP currently exists in a deferred state. Unlike
PEP 3150, this isn't because I suspect it might be a terrible idea or
see nasty problems lurking in the implementation (aside from one
potential parsing issue).

Instead, it's because I think fleshing out the concept, exploring
syntax variants, creating a reference implementation and generally
championing the idea is going to require more time than I can give it
in the 3.3 time frame.

So, it's deferred. If anyone wants to step forward to drive the PEP
for 3.3, let me know and I can add you as co-author and move it to
Draft status.

Basic Examples
==============

Before diving into the long history of this problem and the detailed
rationale for this specific proposed solution, here are a few simple
examples of the kind of code it is designed to simplify.

As a trivial example, weakref callbacks could be defined as follows::

    :x = weakref.ref(obj, @)
    def report_destruction(obj):
        print("{} is being destroyed".format(obj))

This contrasts with the current repetitive "out of order" syntax for
this operation::

    def report_destruction(obj):
        print("{} is being destroyed".format(obj))
    x = weakref.ref(obj, report_destruction)

That structure is OK when you're using the callable multiple times,
but it's irritating to be forced into it for one-off operations.
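A side note on the weakref API these examples build on: the callback
fires when the referent is being finalized and receives the weak
reference object itself, not the original object. A minimal runnable
sketch of the current-syntax form, with an illustrative Tracked class
standing in for obj's type::

    import weakref

    class Tracked:
        pass

    def report_destruction(ref):
        # The callback receives the (now dead) weak reference itself.
        print("{} is being destroyed".format(ref))

    obj = Tracked()
    x = weakref.ref(obj, report_destruction)
    del obj  # in CPython, dropping the last reference fires the callback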
Similarly, singleton classes could now be defined as::

    :instance = @()
    class OnlyOneInstance:
        pass

Rather than::

    class OnlyOneInstance:
        pass
    instance = OnlyOneInstance()

And the infamous accumulator example could become::

    def counter():
        x = 0
        :return @
        def increment():
            nonlocal x
            x += 1
            return x

Proposal
========

This PEP proposes the addition of an optional block prefix clause to
the syntax for function and class definitions.

This block prefix would be introduced by a leading ``:`` and would be
allowed to contain any simple statement (including those that don't
make any sense in that context - while such code would be legal,
there wouldn't be any point in writing it).

The decorator symbol ``@`` would be repurposed inside the block prefix
to refer to the function or class being defined.

When a block prefix is provided, it *replaces* the standard local name
binding otherwise implicit in a class or function definition.

Background
==========

The question of "multi-line lambdas" has been a vexing one for many
Python users for a very long time, and it took an exploration of
Ruby's block functionality for me to finally understand why this bugs
people so much: Python's demand that the function be named and
introduced before the operation that needs it breaks the developer's
flow of thought. They get to a point where they go "I need a one-shot
operation that does <X>", and instead of being able to just *say*
that, they instead have to back up, name a function to do <X>, then
call that function from the operation they actually wanted to do in
the first place. Lambda expressions can help sometimes, but they're no
substitute for being able to use a full suite.

Ruby's block syntax also heavily inspired the style of the solution in
this PEP, by making it clear that even when limited to *one* anonymous
function per statement, anonymous functions could still be incredibly
useful. Consider how many constructs Python has where one expression
is responsible for the bulk of the heavy lifting:

* comprehensions, generator expressions, map(), filter()
* key arguments to sorted(), min(), max()
* partial function application
* provision of callbacks (e.g. for weak references)
* array broadcast operations in NumPy

However, adopting Ruby's block syntax directly won't work for Python,
since the effectiveness of Ruby's blocks relies heavily on various
conventions in the way functions are *defined* (specifically, Ruby's
``yield`` syntax to call blocks directly and the ``&arg`` mechanism to
accept a block as a function's final argument).

Since Python has relied on named functions for so long, the signatures
of APIs that accept callbacks are far more diverse, thus requiring a
solution that allows anonymous functions to be slotted in at the
appropriate location.

Relation to PEP 3150
====================

PEP 3150 (Statement Local Namespaces) described its primary motivation
as being to elevate ordinary assignment statements to be on par with
``class`` and ``def`` statements where the name of the item to be
defined is presented to the reader in advance of the details of how
the value of that item is calculated. This PEP achieves the same goal
in a different way, by allowing the simple name binding of a standard
function definition to be replaced with something else (like assigning
the result of the function to a value).

This PEP also achieves most of the other effects described in PEP 3150
without introducing a new brainbending kind of scope.
All of the complex scoping rules in PEP 3150 are replaced in this PEP
with the simple ``@`` reference to the statement local function or
class definition.

Symbol Choice
=============

The ':' symbol was chosen due to its existing presence in Python and
its association with 'functions in expressions' via ``lambda``
expressions. The past Simple Implicit Lambda proposal (PEP ???) was
also a factor.

The proposal definitely requires *some* kind of prefix to avoid
parsing ambiguity and backwards compatibility problems and ':' at
least has the virtue of brevity. There's no obvious alternative symbol
that offers a clear improvement.

Introducing a new keyword is another possibility, but I haven't come
up with one that really has anything to offer over the leading colon.

Syntax Change
=============

Current::

    atom: ('(' [yield_expr|testlist_comp] ')' |
           '[' [testlist_comp] ']' |
           '{' [dictorsetmaker] '}' |
           NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False')

Changed::

    atom: ('(' [yield_expr|testlist_comp] ')' |
           '[' [testlist_comp] ']' |
           '{' [dictorsetmaker] '}' |
           NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False' | '@')

New::

    blockprefix: ':' simple_stmt
    block: blockprefix (decorated | classdef | funcdef)

The above is the general idea, but I suspect that change to the 'atom'
definition would cause an ambiguity problem in the parser when it
comes to detecting decorator lines. So the actual implementation would
be more complex than that.

Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar

Possible Implementation Strategy
================================

This proposal has one titanic advantage over PEP 3150: implementation
should be relatively straightforward.

Both the class and function definition statements emit code to perform
the local name binding for their defined name. Implementing this PEP
should just require intercepting that code generation and replacing it
with the code in the block prefix.

The one potentially tricky part is working out how to allow the dual
use of '@' without rewriting half the grammar definition.

More Examples
=============

Calculating attributes without polluting the local namespace (from
os.py)::

    # Current Python (manual namespace cleanup)
    def _createenviron():
        ... # 27 line function

    environ = _createenviron()
    del _createenviron

    # Becomes:
    :environ = @()
    def _createenviron():
        ... # 27 line function

Loop early binding::

    # Current Python (default argument hack)
    funcs = [(lambda x, i=i: x + i) for i in range(10)]

    # Becomes:
    :funcs = [@(i) for i in range(10)]
    def make_incrementor(i):
        return lambda x: x + i

    # Or even:
    :funcs = [@(i) for i in range(10)]
    def make_incrementor(i):
        :return @
        def incrementor(x):
            return x + i

Reference Implementation
========================

None as yet.

TO DO
=====

Sort out links and references to everything :)

Acknowledgements
================

Huge thanks to Gary Bernhardt for being blunt in pointing out that I
had no idea what I was talking about in criticising Ruby's blocks,
kicking off a rather enlightening process of investigation.

References
==========

TBD

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From aquavitae69 at gmail.com  Thu Oct 13 02:22:09 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Thu, 13 Oct 2011 02:22:09 +0200
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
Message-ID:

On Oct 12, 2011 9:37 PM, "Sven Marnach" wrote:
>
> Steven D'Aprano schrieb am Do, 13. Okt 2011, um 04:33:49 +1100:
> > >When implementing '==' and '!=' for range objects, it would be natural
> > >to implement the other comparison operators, too (lexicographically,
> > >as for all other sequence types).
> >
> > I don't agree. Equality makes sense for ranges: two ranges are equal
> > if they have the same start, stop and step values.
>
> No, two ranges should be equal if they represent the same sequence,
> i.e. if they compare equal when converted to a list:
>
>     range(0) == range(4, 4, 4)
>     range(5, 10, 3) == range(5, 11, 3)
>     range(3, 6, 3) == range(3, 4)
>
> > But order
> > comparisons don't have any sensible meaning: range objects are
> > numeric ranges, integer-valued intervals, not generic lists, and it
> > is meaningless to say that one range is less than or greater than
> > another.
>
> Well, it's meaningless unless you define what it means. Range objects
> are equal if they compare equal after converting to a list. You could
> define '<' or '>' the same way. All built-in sequence types support
> lexicographical comparison, so I thought it would be natural to bring
> the only one that behaves differently in line. (Special cases aren't
> special enough...)
>
> This is just to explain my thoughts, I don't have a strong opinion on
> this one.
>
> I'll try and prepare a patch for '==' and '!=' and add it to the issue
> tracker.
>
> Cheers,
> Sven

If you consider a range to represent a special type of set, which it is
since it always contains unique values, then comparison operators do make
sense. E.g. range(4,8) < range(2,9) is a subset comparison.

+1 for equality checks

David

From ncoghlan at gmail.com  Thu Oct 13 02:39:45 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 10:39:45 +1000
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
Message-ID:

I had a chance to speak to Travis Oliphant (NumPy core dev) at
PyCodeConf and asked him his opinion of PEP 335. His answer was that
he didn't really care about overloading boolean operations in general
(the bitwise operation overloads with the appropriate objects in the
arrays were adequate for most purposes), but the fact that chained
comparisons don't work for NumPy arrays was genuinely annoying.

That is, if you have a NumPy array, you cannot write:

    x = A < B < C

Since, under the covers, that translates to:

    x = A < B and B < C

and the result of the first operation will be an array and hence
always true, so 'x' receives the value 'True' rather than an array
with the broadcast chained comparison.
Instead, you have to write out the chained comparison explicitly,
including the repetition of the middle expression and the extra
parentheses to avoid the precedence problems with the bitwise
operators:

    x = (A < B) & (B < C)

PEP 335 would allow NumPy to fix that by overriding the logical 'and'
operation that is implicit in chained comparisons to force evaluation
of the RHS and return the rich result.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Thu Oct 13 03:06:54 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2011 18:06:54 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
Message-ID:

On Wed, Oct 12, 2011 at 5:22 PM, David Townshend wrote:
>
> On Oct 12, 2011 9:37 PM, "Sven Marnach" wrote:
>>
>> Steven D'Aprano schrieb am Do, 13. Okt 2011, um 04:33:49 +1100:
>> > >When implementing '==' and '!=' for range objects, it would be natural
>> > >to implement the other comparison operators, too (lexicographically,
>> > >as for all other sequence types).
>> >
>> > I don't agree. Equality makes sense for ranges: two ranges are equal
>> > if they have the same start, stop and step values.
>>
>> No, two ranges should be equal if they represent the same sequence,
>> i.e. if they compare equal when converted to a list:
>>
>>     range(0) == range(4, 4, 4)
>>     range(5, 10, 3) == range(5, 11, 3)
>>     range(3, 6, 3) == range(3, 4)
>>
>> > But order
>> > comparisons don't have any sensible meaning: range objects are
>> > numeric ranges, integer-valued intervals, not generic lists, and it
>> > is meaningless to say that one range is less than or greater than
>> > another.
>>
>> Well, it's meaningless unless you define what it means. Range objects
>> are equal if they compare equal after converting to a list. You could
>> define '<' or '>' the same way. All built-in sequence types support
>> lexicographical comparison, so I thought it would be natural to bring
>> the only one that behaves differently in line. (Special cases aren't
>> special enough...)
>>
>> This is just to explain my thoughts, I don't have a strong opinion on
>> this one.
>>
>> I'll try and prepare a patch for '==' and '!=' and add it to the issue
>> tracker.
>>
>> Cheers,
>> Sven
>
> If you consider a range to represent a special type of set, which it is
> since it always contains unique values, then comparison operators do make
> sense. E.g. range(4,8) < range(2,9) is a subset comparison.

But it's *not* a set. It's got a definite order. range(10) !=
range(9, -1, -1) even though they contain the same values.

> +1 for equality checks

Yeah, we're down to bikeshedding about whether range(0, 10, 2) ==
range(0, 11, 2).
--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Oct 13 03:43:53 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 11:43:53 +1000
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
Message-ID:

On Thu, Oct 13, 2011 at 11:06 AM, Guido van Rossum wrote:
> Yeah, we're down to bikeshedding about whether range(0, 10, 2) ==
> range(0, 11, 2).

I'll weigh in on the "compare like a sequence" side, even if the
specific range definitions are different. It's the way range
comparisons work in Python 2 and I'd like range() objects to be as
close to a computationally defined immutable list as we can get them.
It may even make sense to make them hashable in those terms.

I see it as similar to the fact that "Decimal('1') == Decimal('1.0')"
even though those two objects carry additional state regarding
significant digits that the definition of equivalence ignores.

But not exposing start/stop/step is a definite oversight - I actually
thought we *did* expose them, but I was thinking of slice objects.
With those attributes exposed, anyone that wants a more restrictive
form of equality can easily implement it for themselves.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ericsnowcurrently at gmail.com  Thu Oct 13 03:45:01 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 12 Oct 2011 19:45:01 -0600
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

On Wed, Oct 12, 2011 at 4:48 PM, Paul Moore wrote:
> On 12 October 2011 22:12, Ethan Furman wrote:
>> Agreed -- comparing repr()s seems like a horrible way to do it.
>>
>> As far as comparing for equality, there's an excellent answer on
>> StackOverflow -- http://stackoverflow.com/questions/7740796
>>
>> def ranges_equal(a, b):
>>  return len(a)==len(b) and (len(a)==0 or a[0]==b[0] and a[-1]==b[-1])
>
> While I'm agnostic on the question of whether range(0,9,2) and
> range(0,10,2) are the same, I'd point out that ranges_equal is
> straightforward to write and says they are equal. But if you're in the
> camp of saying they are not equal, you appear to have no way of
> determining that *except* by comparing reprs, as range objects don't
> seem to expose their start, step and end values as attributes - unless
> I've missed something.
>
>>>> r = range(0,10,2)
>>>> dir(r)
> ['__class__', '__contains__', '__delattr__', '__doc__', '__eq__',
> '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__',
> '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__',
> '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__',
> '__reversed__', '__setattr__', '__sizeof__', '__str__',
> '__subclasshook__', 'count', 'index']
>
> Rather than worrying about supporting equality operators on ranges,
> I'd suggest exposing the start, step and end attributes and then
> leaving people who want them to roll their own equality functions.

Unless I misunderstood, Guido is basically saying the same thing (the
"exposing" part, that is).

+1 on exposing start, step and end
+1 on leaving it at that (unless it turns out to be a common case)

-eric

> Paul.
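To make the "roll their own" option concrete, here is a sketch of the
two helpers such exposed attributes would enable; the start/stop/step
attributes assumed below are hypothetical (they do not exist on range
objects in 3.2), everything else is standard 3.x:

    def ranges_equal_strict(a, b):
        # Same constructor arguments after defaults are filled in
        # (assumes hypothetical a.start/a.stop/a.step attributes).
        return (a.start, a.stop, a.step) == (b.start, b.stop, b.step)

    def ranges_equal_as_sequences(a, b):
        # Same generated values, without realising full lists; for
        # ranges, equal length plus equal endpoints pins down the step.
        if len(a) != len(b):
            return False
        return len(a) == 0 or (a[0] == b[0] and a[-1] == b[-1])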
From python at mrabarnett.plus.com  Thu Oct 13 04:05:59 2011
From: python at mrabarnett.plus.com (MRAB)
Date: Thu, 13 Oct 2011 03:05:59 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
Message-ID: <4E964787.40607@mrabarnett.plus.com>

On 13/10/2011 02:43, Nick Coghlan wrote:
> On Thu, Oct 13, 2011 at 11:06 AM, Guido van Rossum wrote:
>> Yeah, we're down to bikeshedding about whether range(0, 10, 2) ==
>> range(0, 11, 2).
>
> I'll weigh in on the "compare like a sequence" side, even if the
> specific range definitions are different. It's the way range
> comparisons work in Python 2 and I'd like range() objects to be as
> close to a computationally defined immutable list as we can get them.
> It may even make sense to make them hashable in those terms.
>
+1

From ncoghlan at gmail.com  Thu Oct 13 04:44:12 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 12:44:12 +1000
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

On Thu, Oct 13, 2011 at 11:45 AM, Eric Snow wrote:
> Unless I misunderstood, Guido is basically saying the same thing (the
> "exposing" part, that is).
>
> +1 on exposing start, step and end
> +1 on leaving it at that (unless it turns out to be a common case)

My reading is that Guido has reserved judgment on the second part for
now. Options are:

- do nothing for 3.3 (+0 from me)
- make sequence comparison the default (+1 from me)
- make start/stop/step comparison the default (-1 from me)

If we do either of the latter, range.__hash__ should be updated
accordingly (since 3.x range objects are currently hashable due to
their reliance on the default identity comparison)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Thu Oct 13 05:18:50 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 12 Oct 2011 20:18:50 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

On Wed, Oct 12, 2011 at 7:44 PM, Nick Coghlan wrote:
> On Thu, Oct 13, 2011 at 11:45 AM, Eric Snow wrote:
>> Unless I misunderstood, Guido is basically saying the same thing (the
>> "exposing" part, that is).
>>
>> +1 on exposing start, step and end

+1

>> +1 on leaving it at that (unless it turns out to be a common case)
>
> My reading is that Guido has reserved judgment on the second part for
> now. Options are:
>
> - do nothing for 3.3 (+0 from me)
> - make sequence comparison the default (+1 from me)
> - make start/stop/step comparison the default (-1 from me)

Actually when I wrote that I was +1 on start/stop/step comparison and
-1 on sequence (really: list) comparison.

But I'd like to take a step back; we should really look at the use
cases for comparing range objects. Since they don't return lists, you
can't compare them to lists (or rather, they're always unequal).
Because of this (and because it didn't work in 3.0, 3.1, 3.2) the
proposed requirement that it should work the same as it did in Python
2 doesn't sway me.

So what's the most useful comparison for range objects? When comparing
non-empty ranges with step 1, I think we all agree. So we're left
arguing about whether all empty ranges should be equal, and about
whether non-empty ranges with step > 1 should compare equal if they
have the same start, step and length (regardless of the exact value of
stop).

But why do we compare ranges? The first message in this thread
(according to GMail) mentions unittests and suggests that it would be
handy to check if two ranges are the same, but does not give a
concrete example of such a unittest. The code example given uses
list-wise comparison, but the use case is not elaborated further. Does
anyone have an actual example of a unittest where being able to
compare ranges would have been handy? Or of any other real-life
example? Where it matters what happens if the range is empty or step
is > 1?

So, let me say I'm undecided (except on the desirability of an == test
for ranges that's more useful than identity).

FWIW, I don't think the argument from numeric comparisons carries
directly. The reason numeric comparisons (across int, float and
Decimal) ignore certain "state" of the value (like precision or type)
is that that's how we want our numbers to work.

The open question so far is: How do we want our ranges to work? My
intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2)
and range(0, 10, 2) != range(0, 11, 2); all because the arguments
(after filling in the defaults) are different, and those arguments can
come out using the start, stop, step attributes (once we implement
them :-).

> If we do either of the latter, range.__hash__ should be updated
> accordingly (since 3.x range objects are currently hashable due to
> their reliance on the default identity comparison)

Sure. That's implied when __eq__ is updated (though a good reminder
for whoever will produce the patch).

(I'm also -1 on adding ordering comparisons; there's little
disagreement on that issue.)

PS. An (unrelated) oddity with range and Decimal:

>>> range(Decimal(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Decimal' object cannot be interpreted as an integer
>>> range(int(Decimal(10)))
range(0, 10)
>>>

So int() knows something that range() doesn't. :-)

--
--Guido van Rossum (python.org/~guido)

From raymond.hettinger at gmail.com  Thu Oct 13 06:27:21 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 12 Oct 2011 21:27:21 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012193650.GD6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
Message-ID: <6F015650-93C4-4B10-8ABB-E3657BC858D5@gmail.com>

On Oct 12, 2011, at 12:36 PM, Sven Marnach wrote:
> Steven D'Aprano schrieb am Do, 13. Okt 2011, um 04:33:49 +1100:
>>> When implementing '==' and '!=' for range objects, it would be natural
>>> to implement the other comparison operators, too (lexicographically,
>>> as for all other sequence types).
>>
>> I don't agree. Equality makes sense for ranges: two ranges are equal
>> if they have the same start, stop and step values.
>
> No, two ranges should be equal if they represent the same sequence,
> i.e.
> if they compare equal when converted to a list:
>
>     range(0) == range(4, 4, 4)
>     range(5, 10, 3) == range(5, 11, 3)
>     range(3, 6, 3) == range(3, 4)

Given that there are two reasonably valid interpretations of equality
(i.e. produces-equivalent-sequences or the stronger condition,
has-an-equal-start/stop/step-tuple), we should acknowledge the
ambiguity and refuse the temptation to guess.

I vote for not defining equality for range objects -- it's not really
an important service anyway (you really can live without it). Instead,
let people explicitly compare the raw components, or if desired,
compare components normalized by a slice:

    >>> range(2, 9, 2)[:]
    range(2, 10, 2)

Raymond

From raymond.hettinger at gmail.com  Thu Oct 13 06:43:18 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 12 Oct 2011 21:43:18 -0700
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On Oct 12, 2011, at 5:39 PM, Nick Coghlan wrote:

> PEP 335 would allow NumPy to fix that by overriding the logical 'and'
> operation that is implicit in chained comparisons to force evaluation
> of the RHS and return the rich result.

Have you considered that what-is-good-for-numpy isn't necessarily good
for Python as a whole?

Extended slicing and ellipsis tricks weren't so bad because they were
easily ignored by general users. In contrast, rich comparisons have
burdened everyone (we've paid a price in many ways).

The numeric world really needs more operators than Python provides (a
matrix multiplication operator for example), but I don't think Python
is better-off by letting those needs leak back into the core language
one-at-a-time.

Raymond

From cmjohnson.mailinglist at gmail.com  Thu Oct 13 06:45:26 2011
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Wed, 12 Oct 2011 18:45:26 -1000
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To:
References:
Message-ID:

I really like the proposal, although I would be interested to see if
anyone can bikeshed up keywords that might be better than : or @? :-)

I do have some questions about it. Does using @ instead of the name
defined below make the implementation easier? In other words, would
this cause a NameError on normalize because normalize isn't defined in
the local namespace?--

    :sorted_list = sorted(original, key=normalize)
    def normalize(item):
        ...

Or is the @ just for brevity? I assume the point is that it's not just
brevity, but you have to use the @ in order to make the implementation
straightforward.

-- Carl Johnson

From ncoghlan at gmail.com  Thu Oct 13 06:53:56 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 14:53:56 +1000
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us>
Message-ID:

On Thu, Oct 13, 2011 at 1:18 PM, Guido van Rossum wrote:
> FWIW, I don't think the argument from numeric comparisons carries
> directly. The reason numeric comparisons (across int, float and
> Decimal) ignore certain "state" of the value (like precision or type)
> is that that's how we want our numbers to work.
>
> The open question so far is: How do we want our ranges to work?
> My intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2)
> and range(0, 10, 2) != range(0, 11, 2); all because the arguments
> (after filling in the defaults) are different, and those arguments can
> come out using the start, stop, step attributes (once we implement
> them :-).

Between this and Raymond's point about slicing permitting easy and cheap normalisation of endpoints, I'm convinced that, if we add direct comparison of ranges at all, then start/stop/step comparison is the way to go.

> PS. An (unrelated) oddity with range and Decimal:
>
> >>> range(Decimal(10))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'Decimal' object cannot be interpreted as an integer
> >>> range(int(Decimal(10)))
> range(0, 10)
>
> So int() knows something that range() doesn't. :-)

Yeah, range() wants to keep floats far away, so it only checks __index__, not __int__. So Decimal gets handled the same way float does (i.e. not allowed directly, but permitted after explicit coercion to an integer).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Thu Oct 13 07:07:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 15:07:40 +1000
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To: References: Message-ID:

On Thu, Oct 13, 2011 at 2:43 PM, Raymond Hettinger wrote:
>
> On Oct 12, 2011, at 5:39 PM, Nick Coghlan wrote:
>
> PEP 335 would allow NumPy to fix that by overriding the logical 'and'
> operation that is implicit in chained comparisons to force evaluation
> of the RHS and return the rich result.
>
> Have you considered that what-is-good-for-numpy isn't necessarily good for
> Python as a whole?
> Extended slicing and ellipsis tricks weren't so bad because they were easily
> ignored by general users. In contrast, rich comparisons have burdened
> everyone (we've paid a price in many ways).
> The numeric world really needs more operators than Python provides (a matrix
> multiplication operator for example), but I don't think Python is better-off
> by letting those needs leak back into the core language one-at-a-time.

Yeah, I'm still almost entirely negative on PEP 335 (the discussion of it only started up again because I asked Guido if we could kill it off officially rather than leaving it lingering in Open status indefinitely).

I just thought the chained comparisons case was worth bringing up, since the PEP doesn't currently mention it and it's quite a subtle distinction that you can overload the binary operators to create a rich comparison operation but this overloading isn't effective in the chained comparison case due to the implicit 'and' underlying that syntax.

Overall, PEP 335 still seems to be trying to swat a gnat with a sledgehammer, and that's the perspective of someone that has a long history of trying to take out language gnats with sledgehammers of his own ;)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From aquavitae69 at gmail.com Thu Oct 13 07:45:55 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Thu, 13 Oct 2011 07:45:55 +0200
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: References: Message-ID:

How about using lambda instead of @? lambda on its own currently raises a syntax error, so this might be easier to implement than @.
Also, lambda is rather more descriptive than @ since it is already used in the context of unnamed functions.

A question: As I understand it, the function is never actually bound to its name, i.e. in your first example the name "report_destruction" doesn't exist after the statement. If this is the case, then there seems little point in assigning a name at all other than for providing a description. In fact, assigning a name implies that it is reusable and that the name means something. I'm not sure I like the idea of allowing defs without a name, but perhaps it's something to think about. So your first example could read

:x = weakref.ref(obj, lambda)
def (obj):
    print("{} is being destroyed".format(obj))

or even (reusing lambda again)

:x = weakref.ref(obj, lambda)
lambda (obj):
    print("{} is being destroyed".format(obj))

On Thu, Oct 13, 2011 at 6:45 AM, Carl M. Johnson <cmjohnson.mailinglist at gmail.com> wrote:
> I really like the proposal, although I would be interested to see if anyone
> can bikeshed up keywords that might be better than : or @? :-)
>
> I do have some questions about it. Does using @ instead of the name defined
> below make the implementation easier? In other words, would this cause a
> NameError on normalize because normalize isn't defined in the local
> namespace? --
>
> :sorted_list = sorted(original, key=normalize)
> def normalize(item):
>     ...
>
> Or is the @ just for brevity? I assume the point is that it's not just
> brevity, but you have to use the @ in order to make the implementation
> straightforward.
>
> -- Carl Johnson
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From ncoghlan at gmail.com Thu Oct 13 07:48:03 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 13 Oct 2011 15:48:03 +1000
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: References: Message-ID:

On Thu, Oct 13, 2011 at 2:45 PM, Carl M. Johnson wrote:
> I really like the proposal, although I would be interested to see if anyone can bikeshed up keywords that might be better than : or @? :-)
>
> I do have some questions about it. Does using @ instead of the name defined below make the implementation easier? In other words, would this cause a NameError on normalize because normalize isn't defined in the local namespace? --
>
> :sorted_list = sorted(original, key=normalize)
> def normalize(item):
>     ...
>
> Or is the @ just for brevity? I assume the point is that it's not just brevity, but you have to use the @ in order to make the implementation straightforward.

The brevity and highlighting the "forward reference" are actually the primary motivation; making the implementation easier (which it does do) is a nice bonus.

When I first started writing up the PEP, I was using a syntax more directly inspired by PEP 3150's given clause:

:sorted_list = sorted(original, key=@) given (item):
    ...

There were a few problems with this:
1. It pushes the signature of the callable all the way over to the RHS
2. The callable is actually anonymous, so @.__name__ would always be "<lambda>". That's annoying for the same reason as it's annoying in lambda expressions.
3. It doesn't provide any real clues that the body of the statement is actually a nested function

So then I thought, "well what if I use 'def' instead of 'given' as the keyword"?
At that point, it was only a short leap to realising that what I wanted was really close to "arbitrary simple statement as a decorator". So I rewrote things to the format in the posted PEP: :sorted_list = sorted(original, key=@) def normalise(item): ? Now that the nested function is being given a name, I realised I *could* just refer to it by that name in the block prefix. However, I left it alone for the reasons I mention above: 1. It highlights that this is not a normal name reference but something special (i.e. a forward reference to the object defined by the statement) 2. The fact that references are always a single character makes it easier to justify giving the function itself a nice name, which improves introspection support and error messages. In genuine throwaway cases, you can always use a dummy name like 'f', 'func', 'g', 'gen' or 'block' or 'attr' depending on what you're doing. 3. Multiple references don't make the block prefix unwieldy (although the use cases for those are limited) 4. It does make the implementation easier, since you don't have to worry about namespace clashes - the function remains effectively anonymous in the containing scope. It would be possible to extend the PEP to include the idea of allowing the name to be omitted in function and class definitions, but that's an awful lot of complexity when it's easy to use a throwaway name if you genuinely don't care. Just like PEP 3150, all of this is based on the premise that one of the key benefits of multi-line lambdas is that it lets you do things in the *right order* - operation first, callable second. Ordinary named functions force you to do things the other way around. Decorator abuse is also a problematic approach, since even though it gets the order right, you end up with something that looks like an ordinary function or class definition but is actually nothing of the sort. Oh, I'll also note that the class variant gives you the full power of PEP 3150 without any (especially) funky new namespace semantics: :x = property(@.get, @.set, @.delete) class scope: def get(self): return __class__.attr def set(self, val): __class__.attr = val def delete(self): del __class__.attr Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 13 07:50:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 15:50:58 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 3:45 PM, David Townshend wrote: > A question: As I understand it, the function is never actually bound to its > name, i.e. in your first example the name "report_destruction" doesn't exist > after the statement. If this is the case, then there seems little point > assigning a name at all other than for providing a description. In fact, > assigning a name implies that it is reusable and that the name means > something. In a language without exception tracebacks or other forms of introspection, this would be of greater concern. However, since Python has both, the lack of meaningful names is a *problem* with lambdas, not a feature. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From cmjohnson.mailinglist at gmail.com Thu Oct 13 07:51:50 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Wed, 12 Oct 2011 19:51:50 -1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' 
to PEP 403) In-Reply-To: References: Message-ID: On Oct 12, 2011, at 7:45 PM, David Townshend wrote: > A question: As I understand it, the function is never actually bound to its name, i.e. in your first example the name "report_destruction" doesn't exist after the statement. If this is the case, then there seems little point assigning a name at all other than for providing a description. In fact, assigning a name implies that it is reusable and that the name means something. > > I'm not sure I like the idea of allowing defs without a name, but perhaps its something to think about. -1 To me, the names are part of the documentation. The advantage of anonymous blocks is the block part, not the anonymous part. From ncoghlan at gmail.com Thu Oct 13 07:54:04 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 15:54:04 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 3:51 PM, Carl M. Johnson wrote: > > On Oct 12, 2011, at 7:45 PM, David Townshend wrote: > >> A question: As I understand it, the function is never actually bound to its name, i.e. in your first example the name "report_destruction" doesn't exist after the statement. If this is the case, then there seems little point assigning a name at all other than for providing a description. In fact, assigning a name implies that it is reusable and that the name means something. >> >> I'm not sure I like the idea of allowing defs without a name, but perhaps its something to think about. > > -1 To me, the names are part of the documentation. The advantage of anonymous blocks is the block part, not the anonymous part. The "no namespace clashes" part is another benefit. PEP 403 attacks that by omitting the name binding in the current scope rather than by omitting the name entirely. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From aquavitae69 at gmail.com Thu Oct 13 08:21:49 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Thu, 13 Oct 2011 08:21:49 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: After reading your earlier reply about the benefits of named functions, I'm fully with you! +1 to the PEP. On Thu, Oct 13, 2011 at 7:54 AM, Nick Coghlan wrote: > On Thu, Oct 13, 2011 at 3:51 PM, Carl M. Johnson > wrote: > > > > On Oct 12, 2011, at 7:45 PM, David Townshend wrote: > > > >> A question: As I understand it, the function is never actually bound to > its name, i.e. in your first example the name "report_destruction" doesn't > exist after the statement. If this is the case, then there seems little > point assigning a name at all other than for providing a description. In > fact, assigning a name implies that it is reusable and that the name means > something. > >> > >> I'm not sure I like the idea of allowing defs without a name, but > perhaps its something to think about. > > > > -1 To me, the names are part of the documentation. The advantage of > anonymous blocks is the block part, not the anonymous part. > > The "no namespace clashes" part is another benefit. PEP 403 attacks > that by omitting the name binding in the current scope rather than by > omitting the name entirely. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at zesty.ca Thu Oct 13 08:24:54 2011 From: python at zesty.ca (Ka-Ping Yee) Date: Wed, 12 Oct 2011 23:24:54 -0700 (PDT) Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: I don't see why the leading colon is necessary, actually. Isn't the presence of a bare @ enough to signal that a function or class definition must come next? It reads more cleanly and naturally without the colon, to me: return sorted(original, key=@) def normalize(item): return item.strip().lower() --Ping From cmjohnson.mailinglist at gmail.com Thu Oct 13 08:30:31 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Wed, 12 Oct 2011 20:30:31 -1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> On Oct 12, 2011, at 8:24 PM, Ka-Ping Yee wrote: > I don't see why the leading colon is necessary, actually. Isn't > the presence of a bare @ enough to signal that a function or class > definition must come next? Python's parser is purposefully pretty simple, so it's not clear if that could be added without ripping up the whole language definition. From ericsnowcurrently at gmail.com Thu Oct 13 08:33:22 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 13 Oct 2011 00:33:22 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 11:48 PM, Nick Coghlan wrote: > On Thu, Oct 13, 2011 at 2:45 PM, Carl M. Johnson > wrote: >> I really like the proposal, although I would be interested to see if anyone can bikeshed up keywords that might be better than : or @? :-) >> >> I do have some questions about it. Does using @ instead of the name defined below make the implementation easier? In other words, would this cause a NameError on normalize because normalize isn't defined in the local namespace?-- >> >> :sorted_list = sorted(original, key=normalize) >> def normalize(item): >> ? ? >> >> Or is the @ just for brevity? I assume the point is that it's not just brevity, but you have to use the @ in order to make the implementation straightforward. > > The brevity and highlighting the "forward reference" are actually the > primary motivation, making the implementation easier (which it does > do) is a nice bonus. > > When I first started writing up the PEP, I was using a syntax more > directly inspired by PEP 3150's given clause: > > ? ?:sorted_list = sorted(original, key=@) given (item): > ? ? ? ?? > > There were a few problems with this: > 1. It pushes the signature of the callable all the way over to the RHS > 2. The callable is actually anonymous, so @.__name__ would always be > "". That's annoying for the same reason as it's annoying in > lambda expressions. > 3. It doesn't provide any real clues that the body of the statement is > actually a nested function > > So then I thought, "well what if I use 'def' instead of 'given' as the > keyword"? 
At that point, it was only a short leap to realising that > what I wanted was really close to "arbitrary simple statement as a > decorator". So I rewrote things to the format in the posted PEP: > > ? ?:sorted_list = sorted(original, key=@) > ? ?def normalise(item): > ? ? ? ?? > > Now that the nested function is being given a name, I realised I > *could* just refer to it by that name in the block prefix. However, I > left it alone for the reasons I mention above: > > 1. It highlights that this is not a normal name reference but > something special (i.e. a forward reference to the object defined by > the statement) > 2. The fact that references are always a single character makes it > easier to justify giving the function itself a nice name, which > improves introspection support and error messages. In genuine > throwaway cases, you can always use a dummy name like 'f', 'func', > 'g', 'gen' or 'block' or 'attr' depending on what you're doing. > 3. Multiple references don't make the block prefix unwieldy (although > the use cases for those are limited) > 4. It does make the implementation easier, since you don't have to > worry about namespace clashes - the function remains effectively > anonymous in the containing scope. I like it a lot better as a symbol than as an identifier. It visually pops out. > > It would be possible to extend the PEP to include the idea of allowing > the name to be omitted in function and class definitions, but that's > an awful lot of complexity when it's easy to use a throwaway name if > you genuinely don't care. easy: :sorted_list = sorted(original, key=@) def _(item): ? -eric > > Just like PEP 3150, all of this is based on the premise that one of > the key benefits of multi-line lambdas is that it lets you do things > in the *right order* - operation first, callable second. Ordinary > named functions force you to do things the other way around. Decorator > abuse is also a problematic approach, since even though it gets the > order right, you end up with something that looks like an ordinary > function or class definition but is actually nothing of the sort. > > Oh, I'll also note that the class variant gives you the full power of > PEP 3150 without any (especially) funky new namespace semantics: > > ? ?:x = property(@.get, @.set, @.delete) > ? ?class scope: > ? ? ? ?def get(self): > ? ? ? ? ? ?return __class__.attr > ? ? ? ?def set(self, val): > ? ? ? ? ? ?__class__.attr = val > ? ? ? ?def delete(self): > ? ? ? ? ? ?del __class__.attr > > Cheers, > Nick. > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ericsnowcurrently at gmail.com Thu Oct 13 08:37:11 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 13 Oct 2011 00:37:11 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: On Thu, Oct 13, 2011 at 12:30 AM, Carl M. Johnson wrote: > > On Oct 12, 2011, at 8:24 PM, Ka-Ping Yee wrote: > >> I don't see why the leading colon is necessary, actually. ?Isn't >> the presence of a bare @ enough to signal that a function or class >> definition must come next? 
> > Python's parser is purposefully pretty simple, so it's not clear if that could be added without ripping up the whole language definition. and visually the colon _does_ make the syntax clearer. It works because it's so out of place at the beginning of a line. Scanning through code you won't miss it (and misinterpret what's going on). -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From urban.dani+py at gmail.com Thu Oct 13 09:20:09 2011 From: urban.dani+py at gmail.com (Daniel Urban) Date: Thu, 13 Oct 2011 09:20:09 +0200 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> Message-ID: On Thu, Oct 13, 2011 at 03:43, Nick Coghlan wrote: > But not exposing start/stop/step is a definite oversight - I actually > thought we *did* expose them, but I was thinking of slice objects. There is already a tracker issue with a patch adding the start/stop/step attributes: http://bugs.python.org/issue9896 Daniel From g.brandl at gmx.net Thu Oct 13 09:31:35 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 13 Oct 2011 09:31:35 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: Am 13.10.2011 07:48, schrieb Nick Coghlan: > Oh, I'll also note that the class variant gives you the full power of > PEP 3150 without any (especially) funky new namespace semantics: > > :x = property(@.get, @.set, @.delete) > class scope: > def get(self): > return __class__.attr > def set(self, val): > __class__.attr = val > def delete(self): > del __class__.attr Sorry, I don't think this looks like Python anymore. Defining a class just to get at a throwaway namespace? Using "@" as an identifier? Using ":" not as a suite marker? This doesn't have any way for a casual reader to understand what's going on. (Ordinary decorators are bad enough, but I think it is possible to grasp that the @foo stuff is some kind of "annotation".) Georg From ericsnowcurrently at gmail.com Thu Oct 13 09:32:56 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 13 Oct 2011 01:32:56 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Wed, Oct 12, 2011 at 6:22 PM, Nick Coghlan wrote: > After some interesting conversations at PyCodeConf, I'm killing PEP > 3150 (Statement Local Namespaces). It's too big, too unwieldy, too > confusing and too hard to implement to ever be a good idea. > > PEP 403 is a far simpler idea, that looks to decorators (and Ruby > blocks) for inspiration. It's still a far from perfect idea, but it > has a lot more going for it than PEP 3150 ever did. > > The basic question to ask yourself is this: What if we had a syntax > that allowed us to replace the final "name = obj" step that is > implicit in function and class definitions with an alternative > statement, and had a symbol that allowed us to refer to the function > or class in that statement without having to repeat the name? 
> > The new PEP is included below and is also available online: > http://www.python.org/dev/peps/pep-0403/ > > I would *love* for people to dig through their callback based code > (and any other examples of "single use" functions and classes) to see > if this idea would help them. > > Cheers, > Nick. > > PEP: 403 > Title: Statement local classes and functions > Version: $Revision$ > Last-Modified: $Date$ > Author: Nick Coghlan > Status: Deferred > Type: Standards Track > Content-Type: text/x-rst > Created: 2011-10-13 > Python-Version: 3.x > Post-History: 2011-10-13 > Resolution: TBD > > > Abstract > ======== > > This PEP proposes the addition of ':' as a new class and function prefix > syntax (analogous to decorators) that permits a statement local function or > class definition to be appended to any Python statement that currently does > not have an associated suite. > > In addition, the new syntax would allow the '@' symbol to be used to refer > to the statement local function or class without needing to repeat the name. > > When the ':' prefix syntax is used, the associated statement would be executed > *instead of* the normal local name binding currently implicit in function > and class definitions. > > This PEP is based heavily on many of the ideas in PEP 3150 (Statement Local > Namespaces) so some elements of the rationale will be familiar to readers of > that PEP. That PEP has now been withdrawn in favour of this one. > > > PEP Deferral > ============ > > Like PEP 3150, this PEP currently exists in a deferred state. Unlike PEP 3150, > this isn't because I suspect it might be a terrible idea or see nasty problems > lurking in the implementation (aside from one potential parsing issue). > > Instead, it's because I think fleshing out the concept, exploring syntax > variants, creating a reference implementation and generally championing > the idea is going to require more time than I can give it in the 3.3 time > frame. > > So, it's deferred. If anyone wants to step forward to drive the PEP for 3.3, > let me know and I can add you as co-author and move it to Draft status. > > > Basic Examples > ============== > > Before diving into the long history of this problem and the detailed > rationale for this specific proposed solution, here are a few simple > examples of the kind of code it is designed to simplify. > > As a trivial example, weakref callbacks could be defined as follows:: > > ? ?:x = weakref.ref(obj, @) > ? ?def report_destruction(obj): > ? ? ? ?print("{} is being destroyed".format(obj)) > > This contrasts with the current repetitive "out of order" syntax for this > operation:: > > ? ?def report_destruction(obj): > ? ? ? ?print("{} is being destroyed".format(obj)) > > ? ?x = weakref.ref(obj, report_destruction) > > That structure is OK when you're using the callable multiple times, but > it's irritating to be forced into it for one-off operations. > > Similarly, singleton classes could now be defined as:: > > ?:instance = @() > ?class OnlyOneInstance: > ? ?pass > > Rather than:: > > ?class OnlyOneInstance: > ? ?pass > > ?instance = OnlyOneInstance() > > And the infamous accumulator example could become:: > > ? ?def counter(): > ? ? ? ?x = 0 > ? ? ? ?:return @ > ? ? ? ?def increment(): > ? ? ? ? ? ?nonlocal x > ? ? ? ? ? ?x += 1 > ? ? ? ? ? ?return x > > Proposal > ======== > > This PEP proposes the addition of an optional block prefix clause to the > syntax for function and class definitions. 
> > This block prefix would be introduced by a leading ``:`` and would be > allowed to contain any simple statement (including those that don't > make any sense in that context - while such code would be legal, > there wouldn't be any point in writing it). > > The decorator symbol ``@`` would be repurposed inside the block prefix > to refer to the function or class being defined. > > When a block prefix is provided, it *replaces* the standard local > name binding otherwise implicit in a class or function definition. > > > Background > ========== > > The question of "multi-line lambdas" has been a vexing one for many > Python users for a very long time, and it took an exploration of Ruby's > block functionality for me to finally understand why this bugs people > so much: Python's demand that the function be named and introduced > before the operation that needs it breaks the developer's flow of thought. > They get to a point where they go "I need a one-shot operation that does > ", and instead of being able to just *say* that, they instead have to back > up, name a function to do , then call that function from the operation > they actually wanted to do in the first place. Lambda expressions can help > sometimes, but they're no substitute for being able to use a full suite. > > Ruby's block syntax also heavily inspired the style of the solution in this > PEP, by making it clear that even when limited to *one* anonymous function per > statement, anonymous functions could still be incredibly useful. Consider how > many constructs Python has where one expression is responsible for the bulk of > the heavy lifting: > > ?* comprehensions, generator expressions, map(), filter() > ?* key arguments to sorted(), min(), max() > ?* partial function application > ?* provision of callbacks (e.g. for weak references) > ?* array broadcast operations in NumPy > > However, adopting Ruby's block syntax directly won't work for Python, since > the effectiveness of Ruby's blocks relies heavily on various conventions in > the way functions are *defined* (specifically, Ruby's ``yield`` syntax to > call blocks directly and the ``&arg`` mechanism to accept a block as a > functions final argument. > > Since Python has relied on named functions for so long, the signatures of > APIs that accept callbacks are far more diverse, thus requiring a solution > that allows anonymous functions to be slotted in at the appropriate location. > > > Relation to PEP 3150 > ==================== > > PEP 3150 (Statement Local Namespaces) described its primary motivation > as being to elevate ordinary assignment statements to be on par with ``class`` > and ``def`` statements where the name of the item to be defined is presented > to the reader in advance of the details of how the value of that item is > calculated. This PEP achieves the same goal in a different way, by allowing > the simple name binding of a standard function definition to be replaced > with something else (like assigning the result of the function to a value). > > This PEP also achieves most of the other effects described in PEP 3150 > without introducing a new brainbending kind of scope. All of the complex > scoping rules in PEP 3150 are replaced in this PEP with the simple ``@`` > reference to the statement local function or class definition. > > > Symbol Choice > ============== > > The ':' symbol was chosen due to its existing presence in Python and its > association with 'functions in expressions' via ``lambda`` expressions. 
The > past Simple Implicit Lambda proposal (PEP ???) was also a factor. > > The proposal definitely requires *some* kind of prefix to avoid parsing > ambiguity and backwards compatibility problems and ':' at least has the > virtue of brevity. There's no obious alternative symbol that offers a > clear improvement. > > Introducing a new keyword is another possibility, but I haven't come up > with one that really has anything to offer over the leading colon. > > > Syntax Change > ============= > > Current:: > > ? ?atom: ('(' [yield_expr|testlist_comp] ')' | > ? ? ? ? ? '[' [testlist_comp] ']' | > ? ? ? ? ? '{' [dictorsetmaker] '}' | > ? ? ? ? ? NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False') > > Changed:: > > ? ?atom: ('(' [yield_expr|testlist_comp] ')' | > ? ? ? ? ? '[' [testlist_comp] ']' | > ? ? ? ? ? '{' [dictorsetmaker] '}' | > ? ? ? ? ? NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False' | '@') > > New:: > > ? ?blockprefix: ':' simple_stmt > ? ?block: blockprefix (decorated | classdef | funcdef) > > The above is the general idea, but I suspect that change to the 'atom' > definition would cause an ambiguity problem in the parser when it comes to > detecting decorator lines. So the actual implementation would be more complex > than that. > > Grammar: http://hg.python.org/cpython/file/default/Grammar/Grammar > > > Possible Implementation Strategy > ================================ > > This proposal has one titanic advantage over PEP 3150: implementation > should be relatively straightforward. > > Both the class and function definition statements emit code to perform > the local name binding for their defined name. Implementing this PEP > should just require intercepting that code generation and replacing > it with the code in the block prefix. > > The one potentially tricky part is working out how to allow the dual > use of '@' without rewriting half the grammar definition. > > More Examples > ============= > > Calculating attributes without polluting the local namespace (from os.py):: > > ?# Current Python (manual namespace cleanup) > ?def _createenviron(): > ? ? ?... # 27 line function > > ?environ = _createenviron() > ?del _createenviron > > ?# Becomes: > ?:environ = @() > ?def _createenviron(): > ? ? ?... # 27 line function > > Loop early binding:: > > ?# Current Python (default argument hack) > ?funcs = [(lambda x, i=i: x + i) for i in range(10)] > > ?# Becomes: > ?:funcs = [@(i) for i in range(10)] > ?def make_incrementor(i): > ? ?return lambda x: x + i > > ?# Or even: > ?:funcs = [@(i) for i in range(10)] > ?def make_incrementor(i): > ? ?:return @ > ? ?def incrementor(x): > ? ? ? ?return x + i > > > Reference Implementation > ======================== > > None as yet. > > > TO DO > ===== > > Sort out links and references to everything :) > > > Acknowledgements > ================ > > Huge thanks to Gary Bernhardt for being blunt in pointing out that I had no > idea what I was talking about in criticising Ruby's blocks, kicking off a > rather enlightening process of investigation. > > > References > ========== > > TBD > > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > ? Local Variables: > ? mode: indented-text > ? indent-tabs-mode: nil > ? sentence-end-double-space: t > ? fill-column: 70 > ? coding: utf-8 > ? End: > > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > This is really clever. I have some comments: 1. decorators just to clarify, this is legal: :assert @(1) == 1 @lambda f: return lambda x: return x def spam(): pass but decorators above the block prefix are not (since they wouldn't be decorating a def/class statement). 2. namespaces I hate using a class definition as a plain namespace, but the following is cool (if I'm reading this right): :some_func(@.x, @.y) class _: x = 4 y = 1 and given a purer namespace (like http://code.activestate.com/recipes/577887): :some_func(**@) @as_namespace class _: x = 4 y = 1 3. does it cover all the default arguments hack use cases? If so, is it too cumbersome to be a replacement? 4. how do you introspect the statement local function/class? 5. the relationship to statement local namespaces makes me think def-from and assignment decorators, but this seems like something else. Some of these are probably pretty obvious, but it's getting late and my brain's a little fuzzy. :) I'll think about these some more tomorrow. All in all, this is a pretty sweet idea! -eric From ncoghlan at gmail.com Thu Oct 13 10:14:22 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 18:14:22 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: On Thu, Oct 13, 2011 at 4:37 PM, Eric Snow wrote: > On Thu, Oct 13, 2011 at 12:30 AM, Carl M. Johnson > wrote: >> >> On Oct 12, 2011, at 8:24 PM, Ka-Ping Yee wrote: >> >>> I don't see why the leading colon is necessary, actually. ?Isn't >>> the presence of a bare @ enough to signal that a function or class >>> definition must come next? >> >> Python's parser is purposefully pretty simple, so it's not clear if that could be added without ripping up the whole language definition. > > and visually the colon _does_ make the syntax clearer. ?It works > because it's so out of place at the beginning of a line. ?Scanning > through code you won't miss it (and misinterpret what's going on). Yeah, the block prefix needs something to mark it as special, because it *is* special (even more so than decorators). Decorator expressions are at least evaluated in the order they're encountered in the source - it is only the resulting decorators that are saved and invoked later (after the function has been defined). These block prefix lines would be different, they'd only be evaluated *after* the function was already defined and any decorators applied. We could probably make the parsing unambiguous without a prefix syntax if we really wanted to (especially if we used a different symbol to reference the object being defined), but the out of order execution of the associated statement makes that a questionable idea. That's basically why I went with the leading ':' - it's jarring enough to get your attention, without being so ugly as to be intolerable. However, Georg's "this doesn't look like Python any more" criticism has serious merit - while style guidelines can mitigate that to some degree (just as they advise against gratuitous use of lambda expressions when a named function would be better), there's an inherent ugliness to the syntax in the first draft of the PEP that may make it irredeemable. 
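[To make the "out of order" execution described above concrete: a block prefix such as ":x = weakref.ref(obj, @)" would behave roughly like the following present-day Python. This is only a sketch of the proposed semantics; the class and names are invented for illustration.]

import weakref

class Target:
    pass

obj = Target()

# Step 1: the function definition itself runs first, as usual.
def report_destruction(dead_ref):
    print("{} is being destroyed".format(dead_ref))

# Step 2: the prefix statement runs afterwards, with '@' standing in
# for the function object that was just defined.
x = weakref.ref(obj, report_destruction)

# Step 3: the draft then suppresses the local name binding.
del report_destruction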
I think PEP 403 in its current form is most useful as a statement of intent about the kind of code we want to be able to factor cleanly. The decorator syntax handles code specifically of the form:

def|class NAME HEADER:
    BODY
NAME = decorator(NAME)

PEP 403 instead sets out to handle the case of:

def NAME HEADER:
    BODY
# operation involving NAME
del NAME

(unless the operation was an explicit assignment to NAME, in which case we leave it alone)

The key point is that we don't really *care* about the function being defined - we really only care about the operation we need it for (such as using it as a callback or as a key function or returning it or yielding it or even calling it and doing something with the result). It's essentially a one shot operation - it may be *invoked* more than once, but we're defining it for the purposes of getting someone else to run a chunk of our code, *not* for the purpose of explicitly invoking that operation in multiple places in the application's source code.

Hopefully at this point those that have wished for and argued in favour of multi-line lambdas over the years are nodding their heads in agreement - I think I finally get what they've been complaining about for so long, and PEP 403 is the result. If it doesn't address at least a significant fraction of their use cases, then we need to know.

I already think there are two simplifying assumptions that should be made at least for the first iteration of the idea:

1. Leave classes out of it, at least for now. We did that with decorators, and I think it's a reasonable approach to follow.

2. The initial version should be an alternative to decorator syntax, not an addition to it. That is, you wouldn't be able to mix the first incarnation of a block prefix with ordinary decorators.

If we can find an approach that works for the basic case (i.e. naked function definition) and doesn't make people recoil in horror (esp. Guido), then we can look at expanding back to cover these two cases.

I'll note that the only further syntax idea I've had is to replace the ':' with 'postdef' and the '@' with 'def', so some of the examples kicking around would become:

postdef x = weakref.ref(obj, def)
def report_destruction(obj):
    print("{} is being destroyed".format(obj))

postdef funcs = [def(i) for i in range(10)]
def make_incrementor(i):
    postdef return def
    def incrementor(x):
        return x + i

postdef sorted_list = sorted(original, key=def)
def normalise(item):
    ...

That actually looks quite readable to me, and is fairly explicit about what it does: here's a piece of code to run after the following function has been defined. I definitely like it better than what I have in the PEP.
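[For comparison, the "loop early binding" example above can already be spelled with an ordinary named factory function in today's Python; the assert is included purely to illustrate the behaviour.]

def make_incrementor(i):
    def incrementor(x):
        return x + i
    return incrementor

# Each call to the factory captures its own value of i.
funcs = [make_incrementor(i) for i in range(10)]
assert [f(5) for f in funcs] == [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]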
decorators > > just to clarify, this is legal: > > :assert @(1) == 1 > @lambda f: return lambda x: return x > def spam(): pass Nitpicker's note: lambda as a decorator isn't legal today (although I argued in the past maybe it should be), and I don't think PEP 403 would change it, so that won't work. Nevertheless, this would be legal if I understand correctly: _ = lambda arg: arg :assert @(1) == 1 @_(lambda f: return lambda x: return x) def spam(): pass From greg.ewing at canterbury.ac.nz Thu Oct 13 08:39:58 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Oct 2011 19:39:58 +1300 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: <4E9687BE.9040204@canterbury.ac.nz> Guido van Rossum wrote: > Traceback (most recent call last): > File "", line 1, in > TypeError: 'Decimal' object cannot be interpreted as an integer > >>>>range(int(Decimal(10))) > > range(0, 10) It refuses to work on floats, too, even if they happen to have integer values: >>> range(1.0, 10.0) Traceback (most recent call last): File "", line 1, in TypeError: 'float' object cannot be interpreted as an integer If we think that's a good idea, presumably the same thing ought to apply to Decimals. Or are Decimals supposed to be more "exact" than floats somehow? -- Greg From aaron.devore at gmail.com Thu Oct 13 11:02:45 2011 From: aaron.devore at gmail.com (Aaron DeVore) Date: Thu, 13 Oct 2011 02:02:45 -0700 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 12:32 AM, Eric Snow wrote: > 2. namespaces > > I hate using a class definition as a plain namespace, but the > following is cool (if I'm reading this right): > > ? :some_func(@.x, @.y) > ? class _: > ? ? ? x = 4 > ? ? ? y = 1 > > and given a purer namespace (like http://code.activestate.com/recipes/577887): > > ? :some_func(**@) > ? @as_namespace > ? class _: > ? ? ? x = 4 > ? ? ? y = 1 A clean way to have multiple function could be an issue. Perhaps add a syntax to refer to either the first function (@) or a named function(@x)? I can't think of a great syntax to group the function definitions, though. > 1. Leave classes out of it, at least for now. We did that with > decorators, and I think it's a reasonable approach to follow. -1. This sounds useful for classes. I'm not sure what, but it still sounds useful. > 2. The initial version should be an alternative to decorator syntax, > not an addition to it. That is, you wouldn't be able to mix the first > incarnation of a block prefix with ordinary decorators. That would kill off usage of some handy decorators like functools.wraps: :some_func(@) @wraps(other_func) def f(b): # function body -Aaron DeVore From p.f.moore at gmail.com Thu Oct 13 11:10:08 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 10:10:08 +0100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: On 13 October 2011 09:14, Nick Coghlan wrote: > That's basically why I went with the leading ':' - it's jarring enough > to get your attention, without being so ugly as to be intolerable. 
> However, Georg's "this doesn't look like Python any more" criticism > has serious merit - while style guidelines can mitigate that to some > degree (just as they advise against gratuitous use of lambda > expressions when a named function would be better), there's an > inherent ugliness to the syntax in the first draft of the PEP that may > make it irredeemable. Personally, I find the leading ":" too light (to my ageing eyes :-)) so that it gets lost. Also, I am now trained by decorators to see lines starting with @ as "attached" to the following definition, in a way that other syntax isn't. As an alternative bikeshed colour, would @: work rather than plain @? > I already think there are two simplifying assumptions that should be > made at least for the first iteration of the idea: > > 1. Leave classes out of it, at least for now. We did that with > decorators, and I think it's a reasonable approach to follow. While I don't disagree per se, I suspect that statement-local classes will be on the enhancement list from day 1 - the trick of using a class as a local namespace is just too compelling. So deferring that option may be a false economy. > 2. The initial version should be an alternative to decorator syntax, > not an addition to it. That is, you wouldn't be able to mix the first > incarnation of a block prefix with ordinary decorators. Agreed, and with this one I'm not sure it shouldn't stay a limitation forever. Mixing the two seems like a step too far. (And if you really need it, just call the decorator directly as part of the @: statement). > If we can find an approach that works for the basic case (i.e. naked > function definition) and doesn't make people recoil in horror (esp. > Guido), then we can look at expanding back to cover these two cases. > > I'll note that the only further syntax idea I've had is to replace the > ':' with 'postdef' and the '@' with 'def', so some of the examples > kicking around would become: > > ? ?postdef x = weakref.ref(obj, def) > ? ?def report_destruction(obj): > ? ? ? ?print("{} is being destroyed".format(obj)) > > ? ?postdef funcs = [def(i) for i in range(10)] > ? ?def make_incrementor(i): > ? ? ? ?postdef return def > ? ? ? ?def incrementor(x): > ? ? ? ? ? ?return x + i > > ? ?postdef sorted_list = sorted(original, key=def) > ? ?def normalise(item): > ? ? ? ?? > > That actually looks quite readable to me, and is fairly explicit about > what it does: here's a piece of code to run after the following > function has been defined. I definitely like it better than what I > have in the PEP. My instinct still prefers a form including a leading @, but I definitely like this better than the bare colon. And given that the "@ attaches to the next statement" instinct is learned behaviour, I'm sure I could learn to recognise something like this just as easily. I *don't* like using def to mark the placeholder, though - too easily lost. > With this variant, I would suggest that any postdef clause be executed > *in addition* to the normal name binding. Colliding on dummy names > like "func" would then be like colliding on loop variables like "i" - > typically harmless, because you don't use the names outside the > constructs that define them anyway. I'm not sure why you feel that using a keyword implies that the binding behaviour should change - but as you say typically it's not likely to matter. Some other minor comments: - I am +1 on this idea, but don't really have any particular use cases so that's on a purely theoretical basis. 
- The parallel with decorators makes this much easier to integrate mentally than PEP 3150. - Using _ as a throwaway name seems fine to me, and means there's little need to worry about allowing the name to be omitted (although I do think the unused name-as-documentation-only is a little jarring) - The @ as reference to the local function is mildly ugly, in a way that @ at the start of a line isn't. But I don't have any really good alternatives to offer (other than maybe a special name with double underscores, such as __this__, but I'm not convinced by that - hmm, what about a bare double underscore, __? Too "cute"?). - Georg's point about this not looking like Python any more is good. I don't completely agree, and in particular I think that the semantics are sufficiently Pythonic, it's just the syntax that is jarring, but it *is* something to be careful of. Decorators felt much the same when they were introduced, though... Paul. From solipsis at pitrou.net Thu Oct 13 13:45:30 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 13:45:30 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) References: Message-ID: <20111013134530.5b539ca0@pitrou.net> On Thu, 13 Oct 2011 09:31:35 +0200 Georg Brandl wrote: > Am 13.10.2011 07:48, schrieb Nick Coghlan: > > > Oh, I'll also note that the class variant gives you the full power of > > PEP 3150 without any (especially) funky new namespace semantics: > > > > :x = property(@.get, @.set, @.delete) > > class scope: > > def get(self): > > return __class__.attr > > def set(self, val): > > __class__.attr = val > > def delete(self): > > del __class__.attr > > Sorry, I don't think this looks like Python anymore. Defining a class > just to get at a throwaway namespace? Using "@" as an identifier? > Using ":" not as a suite marker? > > This doesn't have any way for a casual reader to understand what's > going on. Same here. This is very cryptic to me. (while e.g. Javascript anonymous functions are quite easy to read) cheers Antoine. From ncoghlan at gmail.com Thu Oct 13 13:50:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 21:50:13 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: On Thu, Oct 13, 2011 at 7:10 PM, Paul Moore wrote: > On 13 October 2011 09:14, Nick Coghlan wrote: >> That's basically why I went with the leading ':' - it's jarring enough >> to get your attention, without being so ugly as to be intolerable. >> However, Georg's "this doesn't look like Python any more" criticism >> has serious merit - while style guidelines can mitigate that to some >> degree (just as they advise against gratuitous use of lambda >> expressions when a named function would be better), there's an >> inherent ugliness to the syntax in the first draft of the PEP that may >> make it irredeemable. > > Personally, I find the leading ":" too light (to my ageing eyes :-)) > so that it gets lost. Also, I am now trained by decorators to see > lines starting with @ as "attached" to the following definition, in a > way that other syntax isn't. > > As an alternative bikeshed colour, would @: work rather than plain @? I'm much happier with the keyword - "postdef" provides a strong hint to what it does, and also ties it in with the following function definition. 
>> I already think there are two simplifying assumptions that should be >> made at least for the first iteration of the idea: >> >> 1. Leave classes out of it, at least for now. We did that with >> decorators, and I think it's a reasonable approach to follow. > > While I don't disagree per se, I suspect that statement-local classes > will be on the enhancement list from day 1 - the trick of using a > class as a local namespace is just too compelling. So deferring that > option may be a false economy. It doesn't make a huge difference, as 'postdef' also works sufficiently well as the keyword for classes as well (since the class statement if more commonly known as a class definition). Including classes may nudge the forward reference syntax back towards the semantically neutral '@', though. The alternative would be to use 'def' for functions and 'class' for classes, which would be a little ugly. >> 2. The initial version should be an alternative to decorator syntax, >> not an addition to it. That is, you wouldn't be able to mix the first >> incarnation of a block prefix with ordinary decorators. > > Agreed, and with this one I'm not sure it shouldn't stay a limitation > forever. Mixing the two seems like a step too far. (And if you really > need it, just call the decorator directly as part of the @: > statement). With an explicit keyword, mixing them may be OK, though. For decorator factories, the explicit calling syntax is a little clumsy. >> That actually looks quite readable to me, and is fairly explicit about >> what it does: here's a piece of code to run after the following >> function has been defined. I definitely like it better than what I >> have in the PEP. > > My instinct still prefers a form including a leading @, but I > definitely like this better than the bare colon. And given that the "@ > attaches to the next statement" instinct is learned behaviour, I'm > sure I could learn to recognise something like this just as easily. I > *don't* like using def to mark the placeholder, though - too easily > lost. It's easier to see with syntax highlighting, but yeah, I'm not as sold on the idea of using def for forward reference rather than '@' as I am on the switch from a bare colon to the postdef keyword. >> With this variant, I would suggest that any postdef clause be executed >> *in addition* to the normal name binding. Colliding on dummy names >> like "func" would then be like colliding on loop variables like "i" - >> typically harmless, because you don't use the names outside the >> constructs that define them anyway. > > I'm not sure why you feel that using a keyword implies that the > binding behaviour should change - but as you say typically it's not > likely to matter. It's actually more that I wasn't entirely comfortable with suppressing the name binding in the first place and changing to a keyword really emphasised the "do the function definition as normal, but then run this extra piece of code afterwards" aspect. Exposing the function names by default can also help with testability of code that overuses the new construct. > - The @ as reference to the local function is mildly ugly, in a way > that @ at the start of a line isn't. But I don't have any really good > alternatives to offer (other than maybe a special name with double > underscores, such as __this__, but I'm not convinced by that - hmm, > what about a bare double underscore, __? Too "cute"?). No, I think we want a symbol or a real keyword here. 
Getting too cute with names may actually make it harder to implement and understand rather than easier. > - Georg's point about this not looking like Python any more is good. I > don't completely agree, and in particular I think that the semantics > are sufficiently Pythonic, it's just the syntax that is jarring, but > it *is* something to be careful of. Decorators felt much the same when > they were introduced, though... The leading colon to introduce the new clause was definitely far too cryptic, so I'm a lot happier with the explicit 'postdef' keyword idea. For the rest, I think it falls into the same category as lambda abuse - overusing the post definition clause would be a code smell, to be fought by the forces of style guides, code reviews and an emphasis on writing testable code. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 13 13:55:23 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 21:55:23 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <20111013134530.5b539ca0@pitrou.net> References: <20111013134530.5b539ca0@pitrou.net> Message-ID: On Thu, Oct 13, 2011 at 9:45 PM, Antoine Pitrou wrote: > On Thu, 13 Oct 2011 09:31:35 +0200 > Georg Brandl wrote: >> Am 13.10.2011 07:48, schrieb Nick Coghlan: >> >> > Oh, I'll also note that the class variant gives you the full power of >> > PEP 3150 without any (especially) funky new namespace semantics: >> > >> > ? ? :x = property(@.get, @.set, @.delete) >> > ? ? class scope: >> > ? ? ? ? def get(self): >> > ? ? ? ? ? ? return __class__.attr >> > ? ? ? ? def set(self, val): >> > ? ? ? ? ? ? __class__.attr = val >> > ? ? ? ? def delete(self): >> > ? ? ? ? ? ? del __class__.attr >> >> Sorry, I don't think this looks like Python anymore. ?Defining a class >> just to get at a throwaway namespace? ?Using "@" as an identifier? >> Using ":" not as a suite marker? >> >> This doesn't have any way for a casual reader to understand what's >> going on. > > Same here. This is very cryptic to me. > (while e.g. Javascript anonymous functions are quite easy to read) The update to the PEP that I just pushed actually drops class statement support altogether (at least for now), but if it was still there, the above example would instead look more like: postdef x = property(class.get, class.set, class.delete) class scope: def get(self): return __class__.attr def set(self, val): __class__.attr = val def delete(self): del __class__.attr I think I was getting too cute and it's a bad example, though - there's a reason I've now dropped classes from the initial scope of the proposal (just like the original decorator PEP). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From cmjohnson.mailinglist at gmail.com Thu Oct 13 14:02:12 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Thu, 13 Oct 2011 02:02:12 -1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: <02AC180C-CAAC-4A60-86BE-B901F2618A67@gmail.com> On Oct 13, 2011, at 1:50 AM, Nick Coghlan wrote: > I'm much happier with the keyword - "postdef" provides a strong hint > to what it does, and also ties it in with the following function > definition. My 2 cents: postdef and @ seem to be a good pair, unless someone has a better name. 
I would also be surprised if "postdef" as a keyword broke very much code. From ncoghlan at gmail.com Thu Oct 13 14:06:00 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 22:06:00 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 7:02 PM, Aaron DeVore wrote: > A clean way to have multiple function could be an issue. Perhaps add a > syntax to refer to either the first function (@) or a named > function(@x)? I can't think of a great syntax to group the function > definitions, though. One of the key things I realised after reflecting on both Ruby's blocks and Python's own property descriptor is that it's *OK* to only support a single function in this construct. There's a wide range of common use cases where the ability to provide a full capability one shot function would be quite helpful, especially when it comes to callback based programming. Even in threaded programming, giving each thread it's own copy of a function may be a convenient alternative to using synchronisation locks. Actually, here's an interesting example based on quickly firing up a worker thread: postdef t = threading.Thread(target=def); t.start() def pointless(): """A pointless worker thread that does nothing except serve as an example""" print("Starting") time.sleep(3) print("Ending") >> 1. Leave classes out of it, at least for now. We did that with >> decorators, and I think it's a reasonable approach to follow. > > -1. This sounds useful for classes. I'm not sure what, but it still > sounds useful. We need more than that to justify keeping classes in the mix - if we can't think of clear and compelling use cases, then it's better to omit the feature for now and see if we really want to include it later. >> 2. The initial version should be an alternative to decorator syntax, >> not an addition to it. That is, you wouldn't be able to mix the first >> incarnation of a block prefix with ordinary decorators. > > That would kill off usage of some handy decorators like functools.wraps: > > ?:some_func(@) > ?@wraps(other_func) > ?def f(b): > ? ? ?# function body No it wouldn't, they'd just need to be called explicitly: postdef some_func(wraps(other_func)(def)) def f(b): # function body As with classes, I'm tempted to call "YAGNI" (You Ain't Gonna Need It) on this aspect of the initial specification. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 13 14:10:17 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 13 Oct 2011 22:10:17 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <02AC180C-CAAC-4A60-86BE-B901F2618A67@gmail.com> References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> <02AC180C-CAAC-4A60-86BE-B901F2618A67@gmail.com> Message-ID: On Thu, Oct 13, 2011 at 10:02 PM, Carl M. Johnson wrote: > > On Oct 13, 2011, at 1:50 AM, Nick Coghlan wrote: > >> I'm much happier with the keyword - "postdef" provides a strong hint >> to what it does, and also ties it in with the following function >> definition. > > My 2 cents: postdef and @ seem to be a good pair, unless someone has a better name. I also think that could work, but I do find 'def' an appealing alternative to '@' because it's a much better intuition pump that the thing it refers to is the object currently being defined. 
The '@', however, wouldn't be hard to remember once you knew about it, works nicely for both functions and classes, stands out visually even without syntax highlighting and is hard to beat for brevity. I suspect this is a case where a bigger set of example use cases and lining up the two alternatives for each one would be useful. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From sven at marnach.net Thu Oct 13 14:27:06 2011 From: sven at marnach.net (Sven Marnach) Date: Thu, 13 Oct 2011 13:27:06 +0100 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> Message-ID: <20111013122706.GH6393@pantoffel-wg.de> Nick Coghlan wrote on Thu, 13 Oct 2011, at 11:43:53 +1000: > I'll weigh in on the "compare like a sequence" side, even if the > specific range definitions are different. It's the way range > comparisons work in Python 2 and I'd like range() objects to be as > close to a computationally defined immutable list as we can get them. > It may even make sense to make them hashable in those terms. The current interface of ranges is that of a sequence and nothing more. The only place where the original parameters that were used to create the sequence pop up is in its representation. Giving those parameters more weight seems a bit like remembering the list comprehension that was used to create a list and taking it into account when comparing lists (of course I'm exaggerating here to make a point). > But not exposing start/stop/step is a definite oversight - I actually > thought we *did* expose them, but I was thinking of slice objects. > With those attributes exposed, anyone that wants a more restrictive > form of equality can easily implement it for themselves. The current implementation of range objects could be both made considerably shorter and sped up a bit by normalising the parameters right from the start. I'd argue that doing so would be a good idea even when start, stop and step are exposed. It would make clear what is happening and remove any ambiguity as to the semantics of range objects in general and comparison of range objects in particular. (A simple implementation is more often than not a good indication for having arrived at the right notions.) Note that using '[:]' does not completely normalise a range object. It only normalises the stop parameter. Cheers, Sven

From shibturn at gmail.com Thu Oct 13 15:18:48 2011 From: shibturn at gmail.com (shibturn) Date: Thu, 13 Oct 2011 14:18:48 +0100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On 13/10/2011 8:32am, Eric Snow wrote: > 2.
namespaces
>
> I hate using a class definition as a plain namespace, but the
> following is cool (if I'm reading this right):
>
>     :some_func(@.x, @.y)
>     class _:
>         x = 4
>         y = 1
>
> and given a purer namespace (like http://code.activestate.com/recipes/577887):
>
>     :some_func(**@)
>     @as_namespace
>     class _:
>         x = 4
>         y = 1

I would rather resurrect PEP 3150 but make

    STATEMENT_OR_EXPRESSION given:
        SUITE

equivalent to something like

    def _anon():
        SUITE
        return AtObject(locals())

    @ = _anon()
    STATEMENT_OR_EXPRESSION
    del @

where AtObject is perhaps defined as

    class AtObject(object):
        def __init__(self, d):
            self.__dict__.update(d)

Then you could do

    some_func(@.x, @.y) given:
        x = 4
        y = 1

and

    x = property(@.get, @.set) given:
        def get(self):
            ...
        def set(self, value):
            ...

Wouldn't this be cleaner/simpler to implement than the old PEP 3150? Certainly prettier to my eyes than the new proposal -- I want my subordinate definitions indented. Cheers, sbt

From p.f.moore at gmail.com Thu Oct 13 15:38:30 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Oct 2011 14:38:30 +0100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: On 13 October 2011 13:06, Nick Coghlan wrote: > Actually, here's an interesting example based on quickly firing up a > worker thread: >
>     postdef t = threading.Thread(target=def); t.start()
>     def pointless():
>         """A pointless worker thread that does nothing except serve as an example"""
>         print("Starting")
>         time.sleep(3)
>         print("Ending")
>
You used 2 statements on the postdef line. That's not in the PEP (the original, I haven't read your revisions yet) and is probably fairly difficult to implement as well. Paul.

From solipsis at pitrou.net Thu Oct 13 15:41:45 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 15:41:45 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) References: Message-ID: <20111013154145.7ac1a302@pitrou.net> On Thu, 13 Oct 2011 22:06:00 +1000 Nick Coghlan wrote: > Actually, here's an interesting example based on quickly firing up a > worker thread: >
>     postdef t = threading.Thread(target=def); t.start()
>     def pointless():
>         """A pointless worker thread that does nothing except serve as an example"""
>         print("Starting")
>         time.sleep(3)
>         print("Ending")
>
I think the problem is still that the syntax isn't nice or obvious. Until the syntax is nice and obvious, I don't think there's any point adding it. (by contrast, decorators *are* nice and obvious: writing "@classmethod" before a method was tons better than the previous redundant idiom) Regards Antoine.

From solipsis at pitrou.net Thu Oct 13 15:51:10 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 15:51:10 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!'
to PEP 403) References: <20111013154145.7ac1a302@pitrou.net> Message-ID: <20111013155110.69858670@pitrou.net> On Thu, 13 Oct 2011 15:41:45 +0200 Antoine Pitrou wrote: > On Thu, 13 Oct 2011 22:06:00 +1000 > Nick Coghlan wrote: > > Actually, here's an interesting example based on quickly firing up a > > worker thread: > >
> >     postdef t = threading.Thread(target=def); t.start()
> >     def pointless():
> >         """A pointless worker thread that does nothing except serve as an example"""
> >         print("Starting")
> >         time.sleep(3)
> >         print("Ending")
>
> I think the problem is still that the syntax isn't nice or obvious. > Until the syntax is nice and obvious, I don't think there's any point > adding it. > (by contrast, decorators *are* nice and obvious: writing "@classmethod" > before a method was tons better than the previous redundant idiom) So how about re-using the "lambda" keyword followed by the anonymous function's parameters (if any)?

    x = weakref.ref(target, lambda obj):
        print("{} is being destroyed".format(obj))

    t = threading.Thread(target=lambda):
        """A pointless worker thread that does nothing
        except serve as an example"""
        print("Starting")
        time.sleep(3)
        print("Ending")
    t.start()

From jimjjewett at gmail.com Thu Oct 13 16:06:32 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 13 Oct 2011 10:06:32 -0400 Subject: [Python-ideas] If I had the time machine for comparisons Message-ID: On Wed, Oct 12, 2011 at 11:18 PM, Guido van Rossum wrote: > (I'm also -1 on adding ordering comparisons; there's little > disagreement on that issue.) If I had a time machine, I would allow comparisons to return "unordered" as well. Right now, objects are comparable or not based strictly on the type, even though comparison is inherently about the values. I think that

    range(3) < range(10)

is obviously true, even though it isn't clear whether or not

    range(3, 15, 2) < range(7, -8, -1)

is true. Here we err by not allowing the first comparison; other objects (like dicts) we err by forcing an arbitrary ordering. -jJ

From jimjjewett at gmail.com Thu Oct 13 16:12:08 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 13 Oct 2011 10:12:08 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: On Wed, Oct 12, 2011 at 11:18 PM, Guido van Rossum wrote: > The open question so far is: How do we want our ranges to work? My > intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2) > and range(0, 10, 2) != range(0, 11, 2); all because the arguments > (after filling in the defaults) are different, and those arguments can > come out using the start, stop, step attributes (once we implement > them :-). If range were a normal function, we would compare the output of the function call without worrying about what the inputs were. If range were a tuple, then the exact inputs would matter, but it isn't. The few times I wanted to compare ranges, I cared what sequence they produced, and *wanted* it to normalize out the arguments for me. That said, it was years ago, and I can't even remember whether or not I was working on a "real-world" problem at the time.
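For illustration, the normalising comparison I wanted amounts to no more than this throwaway helper (a sketch, not a proposal):

    def ranges_equal(a, b):
        # equal exactly when the two ranges produce the same values
        return len(a) == len(b) and all(x == y for x, y in zip(a, b))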
-jJ

From guido at python.org Thu Oct 13 16:27:40 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2011 07:27:40 -0700 Subject: [Python-ideas] If I had the time machine for comparisons In-Reply-To: References: Message-ID: On Thu, Oct 13, 2011 at 7:06 AM, Jim Jewett wrote: > On Wed, Oct 12, 2011 at 11:18 PM, Guido van Rossum wrote: > >> (I'm also -1 on adding ordering comparisons; there's little >> disagreement on that issue.) > > If I had a time machine, I would allow comparisons to return > "unordered" as well. Right now, objects are comparable or not based > strictly on the type, even though comparison is inherently about the > values. > > I think that >
>     range(3) < range(10)
>
> is obviously true, even though it isn't clear whether or not >
>     range(3, 15, 2) < range(7, -8, -1)
>
> is true. > > Here we err by not allowing the first comparison; other objects (like > dicts) we err by forcing an arbitrary ordering. Have you used Python3 lately? It doesn't allow dict ordering. In general Python expresses unordered by raising an exception (often TypeError). Even though range(1) is "obviously" < range(2), there are so many unobvious cases that supporting this one special case isn't worth it. -- --Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net Thu Oct 13 16:58:16 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Oct 2011 16:58:16 +0200 Subject: [Python-ideas] Implement comparison operators for range objects References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: <20111013165816.14d0606a@pitrou.net> On Wed, 12 Oct 2011 20:18:50 -0700 Guido van Rossum wrote: > > The open question so far is: How do we want our ranges to work? My > intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2) > and range(0, 10, 2) != range(0, 11, 2); all because the arguments > (after filling in the defaults) are different, and those arguments can > come out using the start, stop, step attributes (once we implement > them :-). My intuition is contrary, but I think it comes down to: what is the use case for comparing ranges? > PS. An (unrelated) oddity with range and Decimal: > > >>> range(Decimal(10)) > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: 'Decimal' object cannot be interpreted as an integer > >>> range(int(Decimal(10))) > range(0, 10) > >>> > > So int() knows something that range() doesn't. :-) Same as floats:

    >>> int(1.0)
    1
    >>> range(1.0, 2, 1)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'float' object cannot be interpreted as an integer

I thought it was by design? Regards Antoine.

From matt at whoosh.ca Thu Oct 13 17:02:45 2011 From: matt at whoosh.ca (Matt Chaput) Date: Thu, 13 Oct 2011 11:02:45 -0400 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: <4E96FD95.50301@whoosh.ca> On 12/10/2011 8:22 PM, Nick Coghlan wrote: > PEP 403 is a far simpler idea, that looks to decorators (and Ruby > blocks) for inspiration. It's still a far from perfect idea, but it > has a lot more going for it than PEP 3150 ever did. -1 for syntactic ridiculousness... I personally find it (whether postdef or : or whatever) unreadable and probably unfathomable to beginners ("So, WHY do I have to give this function a name if it's just going to be ignored?" "Because nobody could think of a good syntax."). IMHO anonymous blocks (Amnesiac blocks?
They had a name but they forgot it) are not worth complicating the language definition (that is, the mental model, not necessarily the implementation) to this degree. > The new PEP is included below and is also available online: > http://www.python.org/dev/peps/pep-0403/ The text in your email is different from the text on python.org. In particular, the first sentence in the version on python.org is unfinished: This PEP proposes the addition of postdef as a new function prefix syntax (analogous to decorators) that permits the execution of a single simple statement (potentially including substatements separated by semi-colons) after Matt From guido at python.org Thu Oct 13 19:30:19 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2011 10:30:19 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: On Wed, Oct 12, 2011 at 9:53 PM, Nick Coghlan wrote: > On Thu, Oct 13, 2011 at 1:18 PM, Guido van Rossum wrote: >> FWIW, I don't think the argument from numeric comparisons carries >> directly. The reason numeric comparisons (across int, float and >> Decimal) ignore certain "state" of the value (like precision or type) >> is that that's how we want our numbers to work. >> >> The open question so far is: How do we want our ranges to work? My >> intuition is weak, but says: range(0) != range(1, 1) != range(1, 1, 2) >> and range(0, 10, 2) != range(0, 11, 2); all because the arguments >> (after filling in the defaults) are different, and those arguments can >> come out using the start, stop, step attributes (once we implement >> them :-). > > Between this and Raymond's point about slicing permitting easy and > cheap normalisation of endpoints, I'm convinced that, if we add direct > comparison of ranges at all, then start/stop/step comparison is the > way to go. Thanks. Maybe I can nudge you a little more in the direction of my proposal by speaking about equivalence classes. A proper == function partitions the space of all objects into equivalence classes, which are non-overlapping sets such that all objects within one equivalence class are equal to each other, while no two objects in different classes are equal. (Let's leave NaN out of it for now; it does not have a "proper" == function.) There's a nice picture on this Wikipedia page: http://en.wikipedia.org/wiki/Equivalence_relation A trivial collection of equivalence classes is one where each object is in its own equivalence class. That's comparison-by-identity. It isn't very useful because we already have another operator that does the same partitioning. A more useful partitioning is the one which puts all range objects with the same start/stop/step triple into the same equivalence class. This is the one I (still) like best. Interestingly, the one that got the most votes so far is a proper "extension" of this one, in that equivalence according to equal start/stop/step triples implies equivalence according to this weaker definition. That's nice, because it means that there will probably be many use cases where either definition suffices (such as all use cases that only care about non-empty ranges with step==1). (Note: __hash__ needs to create equivalence classes that are proper extensions of those created by __eq__. In terms of the Wikipedia picture, an extension is allowed to merge some equivalence classes but not to split them.) 
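Spelled out in Python, the stricter definition would amount to something like this (a sketch only, and it assumes we do expose the start/stop/step attributes, which range objects don't have yet):

    def range_eq(a, b):
        # the stricter, parameter-based equivalence classes
        return (a.start, a.stop, a.step) == (b.start, b.stop, b.step)

    def range_hash(r):
        # hashing the same triple keeps __hash__ consistent with __eq__
        return hash((r.start, r.stop, r.step))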
BTW, I like Raymond's observation, and I agree that we should add slicing to range(), given that it already supports indexing; and slicing is a nice way to normalize the range. I just don't think that the status quo is better than either of the two proposed definitions for __eq__. Finally. Still waiting for actual use cases. >> PS. An (unrelated) oddity with range and Decimal: >>
>> >>> range(Decimal(10))
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: 'Decimal' object cannot be interpreted as an integer
>> >>> range(int(Decimal(10)))
>> range(0, 10)
>> >>>
>>
>> So int() knows something that range() doesn't. :-) > > Yeah, range() wants to keep floats far away, so it only checks > __index__, not __int__. So Decimal gets handled the same way float > does (i.e. not allowed directly, but permitted after explicit coercion > to an integer). Sorry, it all makes sense now. Please move on. Nothing to see here. :-) -- --Guido van Rossum (python.org/~guido)

From python at zesty.ca Thu Oct 13 19:51:30 2011 From: python at zesty.ca (Ka-Ping Yee) Date: Thu, 13 Oct 2011 10:51:30 -0700 (PDT) Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E96FD95.50301@whoosh.ca> References: <4E96FD95.50301@whoosh.ca> Message-ID: Hi, I dislike "postdef"; it and the leading colon are both mysterious. I guess that's because I find both of them visually misleading. In the case of ":", the colon visually binds to the first token, so

    :x = whatever(@...)

looks like an assignment to ":x" and my brain immediately goes "What's :x?" I have to really work to force myself to make the tiny little ":" bind looser than everything else on the line. In the case of "postdef", the Python syntactic tradition is that a keyword at the beginning always means "this is a statement" and always describes the type of statement. "postdef" breaks both of these rules. This:

    postdef return whatever(@...)

is NOT a postdef statement, it's a return statement. And this:

    postdef button.add_handler(@...)

is not even a statement! It's an expression. ==== Finally, there's a grammar problem that hasn't been addressed yet. What about multi-line statements? Can you write:

    :if do_something(sorted(items, key=@)):
        def sort_key(item):    # ??? where to indent this?
            ....               # sort_key body
        ...                    # if body

I sure hope not! ==== All of the above is making me really like the syntactic variations that put the ":" at the end of a line. This is visually natural to me: a colon introduces a block, and that's a good approximation of what is happening here. It also clearly demands one level of indentation, which seems like a good idea. And it has this nice property that it signals when you are allowed to use it: on any line that doesn't *already* end with a colon. If we made that the rule, then post-definitions would be allowed in these kinds of statements, which seems reasonable: assert, assignment, del, exec, expression, global, nonlocal, print, raise, return, yield. --Ping

From ubershmekel at gmail.com Thu Oct 13 20:08:36 2011 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Thu, 13 Oct 2011 14:08:36 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: <4E9687BE.9040204@canterbury.ac.nz> References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> <4E9687BE.9040204@canterbury.ac.nz> Message-ID: +1 for refusing the temptation to guess.
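For instance, the two definitions part ways on inputs as simple as (illustrative):

    range(2, 2) == range(3, 3)    # True by behaviour (both empty), False by parameters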
Both equality definitions don't seem obvious or handy enough to be favorited by python. That is until some prevalent use cases are presented. --Yuval

From shibturn at gmail.com Thu Oct 13 20:16:47 2011 From: shibturn at gmail.com (shibturn) Date: Thu, 13 Oct 2011 19:16:47 +0100 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: On 13/10/2011 6:30pm, Guido van Rossum wrote: > (Note: __hash__ needs to create equivalence classes that are proper > extensions of those created by __eq__. In terms of the Wikipedia > picture, an extension is allowed to merge some equivalence classes but > not to split them.) Actually cpython's dict lookup does not check equivalence of keys using __eq__ directly. Instead it uses something similar to

    def eq(a, b):
        return a.__hash__() == b.__hash__() and a.__eq__(b)

This ensures compatibility with the equivalence classes for __hash__. (It is also an optimisation.) Cheers, sbt

From guido at python.org Thu Oct 13 20:18:10 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2011 11:18:10 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> <4E9687BE.9040204@canterbury.ac.nz> Message-ID: On Thu, Oct 13, 2011 at 11:08 AM, Yuval Greenfield wrote: > +1 for refusing the temptation to guess. > > Both equality definitions don't seem obvious or handy enough to be favorited > by python. That is until some prevalent use cases are presented. Ah, but the stricter equality definition (by start/stop/step) also refuses to guess! It doesn't consider range(0, 0) and range(1, 1) as equivalent because, indeed, it would have to guess. But it will consider range(1) == range(1) since everybody considers those equivalent so there's no guess-work involved. The identity-based __eq__ does nobody any good. -- --Guido van Rossum (python.org/~guido)

From sven at marnach.net Thu Oct 13 22:29:06 2011 From: sven at marnach.net (Sven Marnach) Date: Thu, 13 Oct 2011 21:29:06 +0100 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> Message-ID: <20111013202906.GJ6393@pantoffel-wg.de> Guido van Rossum wrote: > Thanks. Maybe I can nudge you a little more in the direction of my > proposal by speaking about equivalence classes. A proper == function > partitions the space of all objects into equivalence classes, which > are non-overlapping sets such that all objects within one equivalence > class are equal to each other, while no two objects in different > classes are equal. (Let's leave NaN out of it for now; it does not > have a "proper" == function.) There's a nice picture on this Wikipedia > page: http://en.wikipedia.org/wiki/Equivalence_relation Both proposals define proper equivalence classes -- there is no difference in this regard. In one proposal, equivalence is defined by identical behaviour, in the other equivalence is defined by identical parameters at creation time. I still strongly lean towards the definition based on identical behaviour.
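Concretely, the two definitions only disagree about ranges that were created from different parameters but describe identical sequences, e.g. (illustrative):

    range(0, 10, 2) == range(0, 9, 2)                  # the disputed case: both produce 0, 2, 4, 6, 8
    tuple(range(0, 10, 2)) == tuple(range(0, 9, 2))    # True today, whatever we decide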
If it wasn't for this particular choice of representation of ranges, there wouldn't be any way to distinguish the objects

    range(3, 8, 3)

and

    range(3, 9, 3)

They would be the same in every respect. It feels entirely artificial to me to consider them as not equal just because we used different parameters to create them (and someone chose to include these parameters in the representation). > BTW, I like Raymond's observation, and I agree that we should add > slicing to range(), given that it already supports indexing; and > slicing is a nice way to normalize the range. I just don't think that > the status quo is better than either of the two proposed definitions > for __eq__. range() already supports slicing, and it already does this:

    >>> r0 = range(3, 8, 3)
    >>> r1 = r0[:]
    >>> r1
    range(3, 9, 3)

If we adopted equality based on start/stop/step, this would lead to the somewhat paradoxical situation that r0 != r0[:], in contrast to the behaviour of all other sequences in Python. Cheers, Sven

From greg.ewing at canterbury.ac.nz Thu Oct 13 23:01:35 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 10:01:35 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: Message-ID: <4E9751AF.4070705@canterbury.ac.nz> Georg Brandl wrote: > Sorry, I don't think this looks like Python anymore. Defining a class > just to get at a throwaway namespace? Using "@" as an identifier? > Using ":" not as a suite marker? > > This doesn't have any way for a casual reader to understand what's > going on. I have to agree. I think this proposal is a huge step backwards from the very elegant and self-explanatory syntax of PEP 3150. Withdrawing PEP 3150 altogether seems like an over- reaction to me. A lot of its problems would go away if the idea of trying to make the names local to the suite were dropped. That part doesn't seem particularly important to me -- we manage to live without the for-loop putting its variable into a local scope, even though it would be tidier if it did. -- Greg

From bruce at leapyear.org Thu Oct 13 23:13:50 2011 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 13 Oct 2011 14:13:50 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: <20111013122706.GH6393@pantoffel-wg.de> References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Thu, Oct 13, 2011 at 5:27 AM, Sven Marnach wrote: > > The *current interface* of ranges is that of a sequence and nothing > more. The only place where the original parameters that were used to > create the sequence pop up is in its representation. Giving those > parameters more weight seems a bit like remembering the list > comprehension that was used to create a list and taking it into > account when comparing lists (of course I'm exaggerating here to make > a point). *[emphasis added]* > The current interface doesn't include the changes you want to make either. Limiting the consideration of extending the interface of ranges to exactly your proposal is unnecessarily limiting. Suppose we modified ranges to allow changing the step value.

    >>> x = range(0,10,3)
    >>> x
    range(0, 10, 3)
    >>> x.set_step(2)
    range(0, 10, 2)

Now I'm not saying we *should* allow you to change the step value but we *could*. And if we did that then, the original stop value definitely matters.
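To make that concrete (set_step is hypothetical, of course):

    x = range(0, 10, 3)    # produces 0, 3, 6, 9
    y = range(0, 12, 3)    # also produces 0, 3, 6, 9 - equal "as a sequence"
    # x.set_step(2) would produce 0, 2, 4, 6, 8
    # y.set_step(2) would produce 0, 2, 4, 6, 8, 10 - suddenly unequal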
Making the decision now to ignore that value when comparing for equality means that we preclude other features in the future. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From guido at python.org Thu Oct 13 23:14:14 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 13 Oct 2011 14:14:14 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: <20111013202906.GJ6393@pantoffel-wg.de> References: <20111012163144.GB6393@pantoffel-wg.de> <20111012203326.GE6393@pantoffel-wg.de> <4E9602BC.40206@stoneleaf.us> <20111013202906.GJ6393@pantoffel-wg.de> Message-ID: On Thu, Oct 13, 2011 at 1:29 PM, Sven Marnach wrote: > Both proposals define proper equivalence classes -- there is no > difference in this regard. I know. My point was that the equivalence classes "match up" in a way -- each equivalence class in the "as sequence" proposal comprises exactly one or more of the equivalence classes in the "as start/stop/step" proposal. (This is also why I had an aside about __hash__: each equivalence class for __hash__ comprises exactly one or more of the equivalence classes for __eq__.) -- --Guido van Rossum (python.org/~guido)

From greg.ewing at canterbury.ac.nz Thu Oct 13 23:46:36 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 10:46:36 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: <4E975C3C.1090707@canterbury.ac.nz> Nick Coghlan wrote:

> postdef x = weakref.ref(obj, def)
> def report_destruction(obj):
>     print("{} is being destroyed".format(obj))
>
> postdef funcs = [def(i) for i in range(10)]
> def make_incrementor(i):
>     postdef return def
>     def incrementor(x):
>         return x + i
>
> postdef sorted_list = sorted(original, key=def)
> def normalise(item):
>     ...
>
> That actually looks quite readable to me, and is fairly explicit about
> what it does:

Sorry, but I think it only looks that way to you because you invented it. To me, all of these look like two completely separate statements: some weird thing starting with "postdef", and then a function definition. It might be slightly better if the subsequent def were indented and made into a suite:

    postdef x = weakref.ref(obj, def):
        def report_destruction(obj):
            print("{} is being destroyed".format(obj))

Now it's very clear that the def is a subordinate clause. > With this variant, I would suggest that any postdef clause be executed > *in addition* to the normal name binding. Colliding on dummy names > like "func" would then be like colliding on loop variables like "i" - > typically harmless, because you don't use the names outside the > constructs that define them anyway. That sounds reasonable. However, if we're binding the name anyway, why not use it in the main clause instead of introducing something weird like a lone 'def'?

    postdef x = weakref.ref(obj, report_destruction):
        def report_destruction(obj):
            print("{} is being destroyed".format(obj))

This makes it even more obvious what's going on. Furthermore, there's no longer any need to restrict ourselves to a single 'def' in the body, or even any need for the defining statement to be a 'def' -- anything that binds a name would do. We've now arrived at something very like PEP 3150, but without the local-namespace idea that was causing all the difficulties.
-- Greg

From greg.ewing at canterbury.ac.nz Thu Oct 13 23:57:12 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 10:57:12 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> Message-ID: <4E975EB8.6050009@canterbury.ac.nz> Paul Moore wrote: > I think that the semantics > are sufficiently Pythonic, it's just the syntax that is jarring, but > it *is* something to be careful of. Decorators felt much the same when > they were introduced, though... They *still* feel that way to me, even now. This is unusual. Every other addition to the language I've eventually come to like, even if I was unsure about it at the time. But to me decorators still look like something awkwardly grafted on from another universe. I would hate to see any *more* things like that added to the language. -- Greg

From ben+python at benfinney.id.au Fri Oct 14 00:02:06 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 14 Oct 2011 09:02:06 +1100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> <4E975C3C.1090707@canterbury.ac.nz> Message-ID: <87ty7cv9cx.fsf@benfinney.id.au> Greg Ewing writes: > Nick Coghlan wrote:
>
> > postdef x = weakref.ref(obj, def)
> > def report_destruction(obj):
> >     print("{} is being destroyed".format(obj))
> >
> > postdef funcs = [def(i) for i in range(10)]
> > def make_incrementor(i):
> >     postdef return def
> >     def incrementor(x):
> >         return x + i
> >
> > postdef sorted_list = sorted(original, key=def)
> > def normalise(item):
> >     ...
> >
> > That actually looks quite readable to me, and is fairly explicit about
> > what it does:
>
> Sorry, but I think it only looks that way to you because you invented > it. To me, all of these look like two completely separate statements: > some weird thing starting with "postdef", and then a function > definition. I have to agree. The decorator syntax was hotly debated for (in part) the very same reason: when looking at the definition of a function, a statement *preceding* the definition is not obviously connected. -- \ "Everyone is entitled to their own opinions, but they are not | `\ entitled to their own facts." -US Senator Pat Moynihan | _o__) | Ben Finney

From ncoghlan at gmail.com Fri Oct 14 02:01:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Oct 2011 10:01:58 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9751AF.4070705@canterbury.ac.nz> References: <4E9751AF.4070705@canterbury.ac.nz> Message-ID: On Fri, Oct 14, 2011 at 7:01 AM, Greg Ewing wrote: > Withdrawing PEP 3150 altogether seems like an over- > reaction to me. A lot of its problems would go away > if the idea of trying to make the names local to the > suite were dropped. That part doesn't seem particularly > important to me -- we manage to live without the > for-loop putting its variable into a local scope, > even though it would be tidier if it did. So, keep the PEP 3150 syntax, but don't make the inner suite special aside from the out of order execution?
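(Concretely, that simplified semantics would mean a statement like:

    x = weakref.ref(obj, report) given:
        def report(obj):
            print("{} is being destroyed".format(obj))

simply runs the indented suite first, binding 'report' as usual, and then executes the header line - no special inner namespace involved. A sketch of the reading, not settled semantics.)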
While that would work, it still feels overly heavy for what I consider the primary use case of the construct:

    sorted_list = sorted(original, key=key_func) given:
        def key_func(item):
            return item.attr1, item.attr2

The heart of the problem is that the name 'key_func' is repeated twice, encouraging short, cryptic throwaway names. Maybe I'm worrying too much about that, though - it really is the out of order execution that is needed in order to let the flow of the Python code match the way the developer is thinking about their problem. I'll note that the evolution from PEP 3150 (as shown above) to PEP 403 went as follows:

1. Make the inner suite a true anonymous function with the signature on the header line after the 'given' clause. Reference the function via '@' since it is otherwise inaccessible.

    sorted_list = sorted(original, key=@) given (item):
        return item.attr1, item.attr2

2. Huh, that 'given' keyword doesn't scream 'anonymous function'. How about 'def' instead?

    sorted_list = sorted(original, key=@) def (item):
        return item.attr1, item.attr2

3. Huh, that looks almost exactly like decorator prefix syntax. And the callable signature is way over on the RHS. What if we move it to the next line?

    sorted_list = sorted(original, key=@)
    def (item):
        return item.attr1, item.attr2

4. We may as well let people add a name for debugging purposes, and it's less work to just make it compulsory to match existing syntax. By keeping the shorthand symbolic reference, we get the best of both worlds: a descriptive name for debugging purposes, a short local name for ease of use.

    sorted_list = sorted(original, key=@)
    def key_func(item):
        return item.attr1, item.attr2

5. Well, the parser won't like that and it's backwards incompatible anyway. We need something to flag the prefix line as special. ':' will do.

    :sorted_list = sorted(original, key=@)
    def key_func(item):
        return item.attr1, item.attr2

6. Keywords are better than symbols, so let's try that instead

    postdef sorted_list = sorted(original, key=def)
    def key_func(item):
        return item.attr1, item.attr2

PEP 403 really is just an extension of the principles behind decorators at its heart, so I think it makes sense for those semantics to have a decorator-style syntax. If we want to revert to using an indented suite, then I think it makes more sense to go all the way back to PEP 3150 and discuss the relative importance of "out of order execution" and "private scope to avoid namespace pollution". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ron3200 at gmail.com Fri Oct 14 03:10:00 2011 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 13 Oct 2011 20:10:00 -0500 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> Message-ID: <1318554600.460.83.camel@Gutsy> On Fri, 2011-10-14 at 10:01 +1000, Nick Coghlan wrote:
> On Fri, Oct 14, 2011 at 7:01 AM, Greg Ewing wrote:
> > Withdrawing PEP 3150 altogether seems like an over-
> > reaction to me. A lot of its problems would go away
> > if the idea of trying to make the names local to the
> > suite were dropped. That part doesn't seem particularly
> > important to me -- we manage to live without the
> > for-loop putting its variable into a local scope,
> > even though it would be tidier if it did.
>
> So, keep the PEP 3150 syntax, but don't make the inner suite special
> aside from the out of order execution?
>
> While that would work, it still feels overly heavy for what I consider
> the primary use case of the construct:
>
>     sorted_list = sorted(original, key=key_func) given:
>         def key_func(item):
>             return item.attr1, item.attr2
>
> The heart of the problem is that the name 'key_func' is repeated
> twice, encouraging short, cryptic throwaway names. Maybe I'm worrying
> too much about that, though - it really is the out of order execution
> that is needed in order to let the flow of the Python code match the
> way the developer is thinking about their problem.

Yes, you are worrying too much about this. :-) I like things that can be taken apart and put together again in a program and by a program. And even put together in new ways by a program. The special syntax and requirement that they be localized together doesn't fit that. Which means each time you want something similar to it, but maybe with a small variation, you must rewrite the whole thing by hand. (Yes, we saved a few keystrokes each time, but...) To me that is going backwards and is a counterpoint to the purpose of programming. Hey, let's keep those programmers working! JK ;-)

> I'll note that the evolution from PEP 3150 (as shown above) to PEP 403
> went as follows:
>
> 1. Make the inner suite a true anonymous function with the signature
> on the header line after the 'given' clause. Reference the function
> via '@' since it is otherwise inaccessible.
>
>     sorted_list = sorted(original, key=@) given (item):
>         return item.attr1, item.attr2
>
> 2. Huh, that 'given' keyword doesn't scream 'anonymous function'. How
> about 'def' instead?
>
>     sorted_list = sorted(original, key=@) def (item):
>         return item.attr1, item.attr2
>
> 3. Huh, that looks almost exactly like decorator prefix syntax. And
> the callable signature is way over on the RHS. What if we move it to
> the next line?
>
>     sorted_list = sorted(original, key=@)
>     def (item):
>         return item.attr1, item.attr2
>
> 4. We may as well let people add a name for debugging purposes, and
> it's less work to just make it compulsory to match existing syntax. By
> keeping the shorthand symbolic reference, we get the best of both
> worlds: a descriptive name for debugging purposes, a short local name
> for ease of use.
>
>     sorted_list = sorted(original, key=@)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> 5. Well, the parser won't like that and it's backwards incompatible
> anyway. We need something to flag the prefix line as special. ':' will
> do.
>
>     :sorted_list = sorted(original, key=@)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> 6. Keywords are better than symbols, so let's try that instead
>
>     postdef sorted_list = sorted(original, key=def)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> PEP 403 really is just an extension of the principles behind
> decorators at its heart, so I think it makes sense for those semantics
> to have a decorator-style syntax. If we want to revert to using
> an indented suite, then I think it makes more sense to go all the way
> back to PEP 3150 and discuss the relative importance of "out of order
> execution" and "private scope to avoid namespace pollution".

Decorators can be moved out of the way and reused multiple times. I don't think this syntax can do that. It locks the two pieces together so they can't be used separately. Using your sort example, let's say we rewrite that concept so it's more general using a decorator that can be reused. In this case, the key isn't the reusable part.
It's the part that changes when we do sorts, so it's the part we put the decorator on to create a new sorted variation.

    def sorted_with(key):
        def _(seq):
            return sorted(seq, key=key)
        return _

    @sorted_with
    def key_sorted(item):
        return item.attr1, item.attr2

    new_list = key_sorted(original_list)

We can reuse the sorted_with decorator with as many key functions as we want. That reuse is an important feature of decorators. Cheers, Ron

From ericsnowcurrently at gmail.com Fri Oct 14 03:51:28 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 13 Oct 2011 19:51:28 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> Message-ID: On Thu, Oct 13, 2011 at 6:01 PM, Nick Coghlan wrote:
> On Fri, Oct 14, 2011 at 7:01 AM, Greg Ewing wrote:
>> Withdrawing PEP 3150 altogether seems like an over-
>> reaction to me. A lot of its problems would go away
>> if the idea of trying to make the names local to the
>> suite were dropped. That part doesn't seem particularly
>> important to me -- we manage to live without the
>> for-loop putting its variable into a local scope,
>> even though it would be tidier if it did.
>
> So, keep the PEP 3150 syntax, but don't make the inner suite special
> aside from the out of order execution?
>
> While that would work, it still feels overly heavy for what I consider
> the primary use case of the construct:
>
>     sorted_list = sorted(original, key=key_func) given:
>         def key_func(item):
>             return item.attr1, item.attr2
>
> The heart of the problem is that the name 'key_func' is repeated
> twice, encouraging short, cryptic throwaway names. Maybe I'm worrying
> too much about that, though - it really is the out of order execution
> that is needed in order to let the flow of the Python code match the
> way the developer is thinking about their problem.
>
> I'll note that the evolution from PEP 3150 (as shown above) to PEP 403
> went as follows:
>
> 1. Make the inner suite a true anonymous function with the signature
> on the header line after the 'given' clause. Reference the function
> via '@' since it is otherwise inaccessible.
>
>     sorted_list = sorted(original, key=@) given (item):
>         return item.attr1, item.attr2
>
> 2. Huh, that 'given' keyword doesn't scream 'anonymous function'. How
> about 'def' instead?
>
>     sorted_list = sorted(original, key=@) def (item):
>         return item.attr1, item.attr2
>
> 3. Huh, that looks almost exactly like decorator prefix syntax. And
> the callable signature is way over on the RHS. What if we move it to
> the next line?
>
>     sorted_list = sorted(original, key=@)
>     def (item):
>         return item.attr1, item.attr2
>
> 4. We may as well let people add a name for debugging purposes, and
> it's less work to just make it compulsory to match existing syntax. By
> keeping the shorthand symbolic reference, we get the best of both
> worlds: a descriptive name for debugging purposes, a short local name
> for ease of use.
>
>     sorted_list = sorted(original, key=@)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> 5. Well, the parser won't like that and it's backwards incompatible
> anyway. We need something to flag the prefix line as special. ':' will
> do.
>
>     :sorted_list = sorted(original, key=@)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> 6. Keywords are better than symbols, so let's try that instead
>
>     postdef sorted_list = sorted(original, key=def)
>     def key_func(item):
>         return item.attr1, item.attr2
>
> PEP 403 really is just an extension of the principles behind
> decorators at its heart, so I think it makes sense for those semantics
> to have a decorator-style syntax.

Yeah, I was explaining the idea to someone today and the decorator connection clicked. It seems like this new syntax is giving you temporary access to the anonymous object on the frame stack, almost like a limited access to a special "frame scope". The decorator syntax provides this by passing that anonymous object in as the argument to the decorator. Is that an accurate perspective or did I misunderstand? -eric

> If we want to revert to using
> an indented suite, then I think it makes more sense to go all the way
> back to PEP 3150 and discuss the relative importance of "out of order
> execution" and "private scope to avoid namespace pollution".
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From cmjohnson.mailinglist at gmail.com Fri Oct 14 05:07:43 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Thu, 13 Oct 2011 17:07:43 -1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> Message-ID: <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> On Oct 13, 2011, at 2:01 PM, Nick Coghlan wrote:
> On Fri, Oct 14, 2011 at 7:01 AM, Greg Ewing wrote:
>> Withdrawing PEP 3150 altogether seems like an over-
>> reaction to me. A lot of its problems would go away
>> if the idea of trying to make the names local to the
>> suite were dropped. That part doesn't seem particularly
>> important to me -- we manage to live without the
>> for-loop putting its variable into a local scope,
>> even though it would be tidier if it did.
>
> So, keep the PEP 3150 syntax, but don't make the inner suite special
> aside from the out of order execution?

To me the limitations of 403 are its strength. I don't want to see people doing crazy inside-out code. Let's say we had a keyword OoO for "out of order":

    OoO result = flange(thingie):
        OoO thing = doohickie(majiger):
            part_a = source / 2
            part_b = source ** 2
            majiger = blender(part_a, part_b)

This violates the spirit of One Obvious Way from my perspective. If you have an OoO keyword, for every sequence of code you end up asking yourself, "Hmm, would this make more sense if I did it backwards or forwards?" That will lead to a bunch of style guide wars and in many cases result in worse readability. By comparison, PEP 403 allows you to do one thing (or two if classes are allowed). You can place a callback underneath the receiver of the callback. That's more or less it. The answer to "when should I use PEP 403?" is very clear: "When you're never going to pass a callback function to anything other than the one receiver." In the same way, "When do I use an @decorator?" has a clear answer: "When you're never going to want to use the undecorated form." I can imagine a little bit of tweaking around the edges for this (see below), but otherwise, I think the core theory of PEP 403 is just right. Slight tweak:

    postdef items = sorted(items, key=@keyfunc(item)):
        item = item.lower()
        item = item.replace(" ", "")
        ...
return item In this case "keyfunc" is the name of the function (not that anyone will ever see it) and (item) is the argument list. I'm not convinced this tweak is better than having a full def on the line below, but it's worth considering. From ncoghlan at gmail.com Fri Oct 14 05:20:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Oct 2011 13:20:13 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <1318554600.460.83.camel@Gutsy> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> Message-ID: On Fri, Oct 14, 2011 at 11:10 AM, Ron Adam wrote: > Using your sort example. ?Lets say we rewrite that concept so it's more > general using a decorator that can be reused. ?In this case, the key > isn't the reusable part. ?It's the part that changes when we do sorts, > so it's the part we put the decorator on to create a new sorted > variation. > > > def sorted_with(key): > ? ?def _(seq): > ? ? ? ?return sorted(seq, key=key) > ? ?return _ > > > @sorted_with > def key_sorted(item): > ? return item.attr1, item.attr2 > > new_list = key_sorted(original_list) Seriously? You're suggesting that mess of symbols and indentation as a good answer to the programming task "I want to sort the items in this sequence according to the values of attr1 and attr2"? The closest Python currently comes to being able to express that concept cleanly is either: sorted_list = sorted(original, key=(lambda v: v.attr1, v.attr2)) or: sorted_list = sorted(original, key=operator.attrgetter('attr1', 'attr2')) Both of those work, but neither of them reaches the bar of "executable pseudocode" the language prides itself on. PEP 403 also fails pretty abysmally (alas) on that front: postdef sorted_list = sorted(original, key=sort_key) def sort_key(item): return item.attr1, item.attr2 The named function version fails because it gets things out of order: def sort_key(item): return item.attr1, item.attr2 sorted_list = sorted(original, key=sort_key) That's more like pseudo code for "First, define a function that returns an object's attr1 and attr2 values. Than use that function to sort our list", a far cry from the originally requested operation. PEP 3150, on the other hand, actually gets close to achieving the pseudocode standard: sorted_list = sorted(original, key=sort_key) given: def sort_key(item): return item.attr1, item.attr2 "Sort the items in this sequence according to the supplied key function. The key function returns the values of attr1 and attr2 for each item." > We can reuse the sorted_with decorator with as may key functions as we > want. ?That reuse is an important feature of decorators. No, no, no - this focus on reusability is *exactly* the problem. It's why callback programming in Python sucks - we force people to treat one-shot functions as if they were reusable ones, *even when those functions aren't going to be reused in any other statement*. That's the key realisation that I finally came to in understanding the appeal of multi-line lambdas (via Ruby's block syntax): functions actually have two purposes in life. The first is the way they're traditionally presented: as a way to structure algorithms into reusable chunks, so you don't have to repeat yourself. However, the second is to simply hand over a section of an algorithm to be executed by someone else. 
You don't *care* about reusability in those cases - you care about handing the code you're currently writing over to be executed by some other piece of code. Python only offers robust syntax for the first use case, which is always going to cause mental friction when you're only interested in the latter aspect. Interestingly, the main thing I'm getting out of this discussion is more of an idea of why PEP 3150 has fascinated me for so long. I expect the outcome is going to be that 403 gets withdrawn and 3150 resuscitated :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Oct 14 05:38:32 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Oct 2011 13:38:32 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> References: <4E9751AF.4070705@canterbury.ac.nz> <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> Message-ID: On Fri, Oct 14, 2011 at 1:07 PM, Carl M. Johnson wrote: > To me the limitations of 403 are its strength. I don't want to see people doing crazy inside-out code. This has long been my fear with PEP 3150, but I'm beginning to wonder if that fear is overblown. > Let's say we had a keyword OoO for "out of order": > > OoO result = flange(thingie): > ? ? ? ?OoO thing = doohickie(majiger): > ? ? ? ? ? ? ? ?part_a = source / 2 > ? ? ? ? ? ? ? ?part_b = source ** 2 > ? ? ? ? ? ? ? ?majiger = blender(part_a, part_b) But now consider this: part_a = source / 2 part_b = source ** 2 majiger = blender(part_a, part_b) thingie = doohickie(majiger) result = flange(thingie) And now with a couple of operations factored out into functions: def make_majiger(source): part_a = source / 2 part_b = source ** 2 return blender(part_a, part_b) def make_thingie(source): return doohickie(make_majigger(source)) result = flange(make_thingie(source)) All PEP 3150 is allowing you to do is indent stuff that could potentially be factored out into a function at some point, without forcing you to factor it out *right now*: result = flange(thingie) given: thingie = doohickie(majiger) given: part_a = source / 2 part_b = source ** 2 majiger = blender(part_a, part_b) So is the "inline vs given statement" question really any more scary than the "should I factor this out into a function" question? I expect the style guidelines will boil down to: - if you're writing a sequential process "first we do A, then we do B, then we do C, etc", use normal inline code - if you're factoring out subexpressions from a single step in an algorithm, use a given statement - as a general rule, given statements should be used for code that could reasonably be factored out into its own function Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From scott+python-ideas at scottdial.com Fri Oct 14 06:19:34 2011 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Fri, 14 Oct 2011 00:19:34 -0400 Subject: [Python-ideas] If I had the time machine for comparisons In-Reply-To: References: Message-ID: <4E97B856.8040609@scottdial.com> On 10/13/2011 10:06 AM, Jim Jewett wrote: > I think that > > range(3) < range(10) > > is obviously true Based on what property? 
Is that because:

a) len(range(3)) < len(range(10))
b) max(range(3)) < min(range(10))
c) (min(range(3)) < min(range(10))) and (max(range(3)) < max(range(10)))

I guess your argument is that in this degenerate case, all of these
properties are true so it is "obviously true".

Personally, if I can't pinpoint the exact reason that "x < y", then
it's not obvious even if for every definition of "<" I can come up
with it is a true statement, because there is not one obvious
definition that is true. In other words, I don't know what "x < y"
means for ranges in general so I can't reason about it in general,
therefore this special case is not useful or obvious. In the absence
of explicit arguments to the "<" operator, I would be baffled as to
what to expect as the result.

However, I can understand the desire to test for range equality, and
the definition for that is significantly more obvious.

-- 
Scott Dial
scott at scottdial.com

From ron3200 at gmail.com  Fri Oct 14 07:55:50 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Fri, 14 Oct 2011 00:55:50 -0500
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: 
References: <4E9751AF.4070705@canterbury.ac.nz>
	<1318554600.460.83.camel@Gutsy>
Message-ID: <1318571750.1494.114.camel@Gutsy>

On Fri, 2011-10-14 at 13:20 +1000, Nick Coghlan wrote:
> On Fri, Oct 14, 2011 at 11:10 AM, Ron Adam wrote:
> > Using your sort example.  Let's say we rewrite that concept so it's more
> > general using a decorator that can be reused.  In this case, the key
> > isn't the reusable part.  It's the part that changes when we do sorts,
> > so it's the part we put the decorator on to create a new sorted
> > variation.
> >
> >
> > def sorted_with(key):
> >     def _(seq):
> >         return sorted(seq, key=key)
> >     return _
> >
> >
> > @sorted_with
> > def key_sorted(item):
> >     return item.attr1, item.attr2
> >
> > new_list = key_sorted(original_list)
>
> Seriously? You're suggesting that mess of symbols and indentation as a
> good answer to the programming task "I want to sort the items in this
> sequence according to the values of attr1 and attr2"?

It's just a simple example of how you could do this. I'd probably just
use a lambda expression myself.

I think the concept you are looking for is, how can you express a
dependency in a natural order that also reads well. In this case, the
end result cannot be built until all the parts are present.

Classes are constructed top down. You create the framework and then
fill in the parts. Functions (including decorators) are constructed
from the inside out. That isn't always the easiest way to think about
a problem.

So a second separate goal is to have very concise one-time expressions
or statements.

> Both of those work, but neither of them reaches the bar of "executable
> pseudocode" the language prides itself on.
>
> PEP 403 also fails pretty abysmally (alas) on that front:
>
>     postdef sorted_list = sorted(original, key=sort_key)
>     def sort_key(item):
>         return item.attr1, item.attr2
>
> The named function version fails because it gets things out of order:

Right, and also, the 'postdef' keyword looks like it should result in
a defined function, but it's actually a call here. That's probably
because I'm so used to seeing the 'def' in that way.

>     def sort_key(item):
>         return item.attr1, item.attr2
>
>     sorted_list = sorted(original, key=sort_key)
>
> That's more like pseudocode for "First, define a function that
> returns an object's attr1 and attr2 values. Then use that function to
> sort our list", a far cry from the originally requested operation.

I think "far cry" is overstating it a bit. I think this sort of issue
is only a problem for very new programmers. Once they understand
functions and how to use them together to make more complex things,
they get used to this.

> PEP 3150, on the other hand, actually gets close to achieving the
> pseudocode standard:
>
>     sorted_list = sorted(original, key=sort_key) given:
>         def sort_key(item):
>             return item.attr1, item.attr2
>
> "Sort the items in this sequence according to the supplied key
> function. The key function returns the values of attr1 and attr2 for
> each item."

Right, we are constructing the framework first, and then the sub
parts. But more specifically, we are suspending a piece of code until
it is safe to unsuspend it. I suppose you could use the '$' in some
way to indicate a suspended bit of code. An 'S' with a line through it
does map well to 'suspended'.

> > We can reuse the sorted_with decorator with as many key functions as we
> > want.  That reuse is an important feature of decorators.
>
> No, no, no - this focus on reusability is *exactly* the problem. It's
> why callback programming in Python sucks - we force people to treat
> one-shot functions as if they were reusable ones, *even when those
> functions aren't going to be reused in any other statement*.

We have to use functions in callbacks because an expression executes
immediately rather than when it's needed. Also a callback usually
needs some sort of input at the time it's used which isn't available
beforehand. Because Python's functions are objects, it makes it much
easier to do, so it's not really that bad once you figure it out.
Non-object-oriented languages are more difficult in this respect.

> You don't *care* about reusability in those cases -
> you care about handing the code you're currently writing over to be
> executed by some other piece of code

I think that is 'reusable'. And most likely it will be reused over and
over if we are referring to a GUI. Some callback examples might be
good.

Cheers,
Ron

From greg.ewing at canterbury.ac.nz  Fri Oct 14 08:56:09 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2011 19:56:09 +1300
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: 
References: <4E9751AF.4070705@canterbury.ac.nz>
	<4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com>
Message-ID: <4E97DD09.3000204@canterbury.ac.nz>

Nick Coghlan wrote:

> So is the "inline vs given statement" question really any more scary
> than the "should I factor this out into a function" question?

Especially since the factored-out functions could be written either
before or after the place where they're used, maybe not even on the
same page, and maybe not even in the same source file...

> I expect the style guidelines will boil down to:
> ...
> - as a general rule, given statements should be used for code that
> could reasonably be factored out into its own function

Also perhaps:

- If you're assigning some things to local names and then using them
just once in a subsequent expression, consider using a 'given'
statement.

I would also be inclined to add that if you're using the 'given' style
in a particular function, you should use it *consistently*.
For instance, consider the earlier example

> result = flange(thingie) given:
>     thingie = doohickie(majiger) given:
>         part_a = source / 2
>         part_b = source ** 2
>         majiger = blender(part_a, part_b)

Why have we used the 'given' style for 'thingie' and 'majiger', but
fallen back on the sequential style for 'part_a' and 'part_b'? It
would be more consistent to write it as

result = flange(thingie) given:
    thingie = doohickie(majiger) given:
        majiger = blender(part_a, part_b) given:
            part_a = source / 2
            part_b = source ** 2

The flow of data from one place to another within the function is now
very clear:

source -----> part_a -----> blender --> doohickie --> flange
   |                           |
   \--------> part_b ----------/

-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Oct 14 09:03:55 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2011 20:03:55 +1300
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: <1318571750.1494.114.camel@Gutsy>
References: <4E9751AF.4070705@canterbury.ac.nz>
	<1318554600.460.83.camel@Gutsy> <1318571750.1494.114.camel@Gutsy>
Message-ID: <4E97DEDB.3000003@canterbury.ac.nz>

Ron Adam wrote:

> Classes are constructed top down. You create the framework and then
> fill in the parts.

Actually, they're not -- the class object is created *after* executing
the class body! So this is a (small) precedent for writing things out
of order when it helps.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Oct 14 09:08:50 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2011 20:08:50 +1300
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: 
References: <4E9751AF.4070705@canterbury.ac.nz>
	<1318554600.460.83.camel@Gutsy>
Message-ID: <4E97E002.2060605@canterbury.ac.nz>

Nick Coghlan wrote:
> On Fri, Oct 14, 2011 at 11:10 AM, Ron Adam wrote:
>
>> def sorted_with(key):
>>     def _(seq):
>>         return sorted(seq, key=key)
>>     return _
>>
>> @sorted_with
>> def key_sorted(item):
>>     return item.attr1, item.attr2
>
> Seriously? You're suggesting that mess of symbols and indentation as a
> good answer to the programming task "I want to sort the items in this
> sequence according to the values of attr1 and attr2"?

I don't think that example is as well-constructed as it could be. The
key_sorted function would be better named something like
'sort_by_attr1_and_attr2'.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Fri Oct 14 09:15:43 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Oct 2011 20:15:43 +1300
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: 
References: <4E9751AF.4070705@canterbury.ac.nz>
Message-ID: <4E97E19F.7010502@canterbury.ac.nz>

Nick Coghlan wrote:

> The heart of the problem is that the name 'key_func' is repeated
> twice, encouraging short, cryptic throwaway names.

Seems to me you're encouraging that even more by requiring a name that
doesn't even get used -- leading to suggestions such as using '_' as
the name.

> Maybe I'm worrying too much about that, though

Yes, I think you are. If an exception occurs in the callback function,
the traceback is still going to show you the source line where it
occurred, so there won't be any trouble tracking it down.

Also there are already plenty of situations where the name of the
function by itself doesn't tell you very much. It's common to have
many methods with the same name defined in different classes.
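(That claim is easy to check with a small sketch -- the function name
and exception are arbitrary placeholders:

    import traceback

    def f():   # one of many functions that might share this name
        raise ValueError("boom")

    try:
        f()
    except ValueError:
        traceback.print_exc()   # names the file and source line of the raise
)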
-- Greg From cmjohnson.mailinglist at gmail.com Fri Oct 14 09:18:28 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Thu, 13 Oct 2011 21:18:28 -1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E97DD09.3000204@canterbury.ac.nz> References: <4E9751AF.4070705@canterbury.ac.nz> <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> <4E97DD09.3000204@canterbury.ac.nz> Message-ID: <0E3AC970-A45B-44FF-A2B6-E4BFC5BA8040@gmail.com> On Oct 13, 2011, at 8:56 PM, Greg Ewing wrote: > I would also be inclined to add that if you're using the > 'given' style in a particular function, you should use > it *consistently*. For instance, consider the earlier example > >> result = flange(thingie) given: >> thingie = doohickie(majiger) given: >> part_a = source / 2 >> part_b = source ** 2 >> majiger = blender(part_a, part_b) > > Why have we used the 'given' style for 'thingie' and > 'majiger', but fallen back on the sequential style for > 'part_a' and 'part_b'? That's more-or-less my point. The ability to write arbitrary code with given will create a lot of arguments about style. One person will use it in one place but not another, then someone will say, hey, let's make it more consistent, but then that's going too far? and it becomes an attractive nuisance for pointless rewrites. From greg.ewing at canterbury.ac.nz Fri Oct 14 09:34:17 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 20:34:17 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <0E3AC970-A45B-44FF-A2B6-E4BFC5BA8040@gmail.com> References: <4E9751AF.4070705@canterbury.ac.nz> <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> <4E97DD09.3000204@canterbury.ac.nz> <0E3AC970-A45B-44FF-A2B6-E4BFC5BA8040@gmail.com> Message-ID: <4E97E5F9.4090104@canterbury.ac.nz> Carl M. Johnson wrote: > That's more-or-less my point. The ability to write arbitrary code with given > will create a lot of arguments about style. One person will use it in one place > but not another, then someone will say, hey, let's make it more consistent, but > then that's going too far? and it becomes an attractive nuisance for pointless > rewrites. But again, is this really any different from what we already have? There are countless ways in which any given piece of code could be rewritten according to someone's personal tastes, yet somehow we manage to avoid the massive code churn that this would seem to imply. -- Greg From greg.ewing at canterbury.ac.nz Fri Oct 14 09:47:57 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Oct 2011 20:47:57 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> Message-ID: <4E97E92D.7010608@canterbury.ac.nz> Nick Coghlan wrote: > So, keep the PEP 3150 syntax, but don't make the inner suite special > aside from the out of order execution? That's right. If we're willing to accept the idea of the def in a postdef statement binding a name in the surrounding scope, then we've already decided that we don't care about polluting the main scope -- and it would make PEP 3150 a heck of a lot easier to both specify and implement. Having said that, I think there might be a way of implementing PEP 3150 with scoping and all that isn't too bad. The main difficulties seem to concern class scopes. 
Currently they're kept in a dict while they're being built, like
function local scopes used to be before they were optimised.

We could change that so that they're compiled in the same way as a
normal function scope, and then use the equivalent of locals() at the
end to build the class dict. The body of a 'given' statement could
then be compiled as a separate function with access to the class
scope. Nested 'def' and 'class' statements, on the other hand, would
be compiled with the surrounding scope deliberately excluded.

-- 
Greg

From tjreedy at udel.edu  Fri Oct 14 17:31:06 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 14 Oct 2011 11:31:06 -0400
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: 
References: 
Message-ID: 

On 10/12/2011 8:22 PM, Nick Coghlan wrote:

> Basic Examples
> ==============
>
> Before diving into the long history of this problem and the detailed
> rationale for this specific proposed solution, here are a few simple
> examples of the kind of code it is designed to simplify.
>
> As a trivial example, weakref callbacks could be defined as follows::
>
>     :x = weakref.ref(obj, @)
>     def report_destruction(obj):
>         print("{} is being destroyed".format(obj))

You have already revised the wretchedness of the above syntax, but...

> This contrasts with the current repetitive "out of order" syntax for this
> operation::
>
>     def report_destruction(obj):
>         print("{} is being destroyed".format(obj))
>
>     x = weakref.ref(obj, report_destruction)

To me, 'define it, use it' is *in the proper order*. That you feel you
have to masquerade an opinion or stylistic preference as a fact does
not say much for the proposal.

For this trivial example, the following works:

x = weakref.ref(obj, lambda x: print("{} is being destroyed".format(x)))

Let call_wrapper(g) be a decorator that wraps f with a call to g. In
other words, call_wrapper(g)(f) == g(f). Perhaps just

def call_wrapper(g):
    def _(f):
        return g(f)
    return _

Then I believe the following, or something close, would work, in the
sense that 'x' would end up being bound to the same thing as above,
with no need for the throwaway one-use name:

@call_wrapper(functools.partial(weakref.ref, obj))
def x(obj):
    print("{} is being destroyed".format(obj))

If one prefers, functools.partial could be avoided with

def call_arg_wrap(g, arg):
    def _(f):
        return g(arg, f)
    return _

@call_arg_wrap(weakref.ref, obj)
def x(obj):
    print("{} is being destroyed".format(obj))

-- 
Terry Jan Reedy

From ron3200 at gmail.com  Fri Oct 14 18:30:38 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Fri, 14 Oct 2011 11:30:38 -0500
Subject: [Python-ideas] Statement local functions and classes (aka PEP
	3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To: <4E97DEDB.3000003@canterbury.ac.nz>
References: <4E9751AF.4070705@canterbury.ac.nz>
	<1318554600.460.83.camel@Gutsy> <1318571750.1494.114.camel@Gutsy>
	<4E97DEDB.3000003@canterbury.ac.nz>
Message-ID: <1318609838.2782.18.camel@Gutsy>

On Fri, 2011-10-14 at 20:03 +1300, Greg Ewing wrote:
> Ron Adam wrote:
>
> > Classes are constructed top down. You create the framework and then
> > fill in the parts.
>
> Actually, they're not -- the class object is created
> *after* executing the class body! So this is a (small)
> precedent for writing things out of order when it
> helps.

I fairly often will write classes by writing the class header, then
write the method headers, and then go back and fill in the method
bodies.
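(Terry's call_arg_wrap decorator above can be exercised end to end; a
self-contained sketch, where the Thing class and the message are
placeholders, and the callback parameter actually receives the weak
reference rather than obj itself:

    import weakref

    def call_arg_wrap(g, arg):
        def _(f):
            return g(arg, f)
        return _

    class Thing:
        pass

    obj = Thing()

    @call_arg_wrap(weakref.ref, obj)
    def x(ref):
        print("{} is being destroyed".format(ref))

    del obj   # in CPython this drops the last reference and fires the callback
)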
The mental model for solving a problem doesn't have to match the exact computational order. Cheers, Ron From ethan at stoneleaf.us Fri Oct 14 19:03:21 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 14 Oct 2011 10:03:21 -0700 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E975C3C.1090707@canterbury.ac.nz> References: <2CEE38E9-AFC5-440F-B6EB-87D0403114AC@gmail.com> <4E975C3C.1090707@canterbury.ac.nz> Message-ID: <4E986B59.4030607@stoneleaf.us> Greg Ewing wrote: > Nick Coghlan wrote: >> With this variant, I would suggest that any postdef clause be executed >> *in addition* to the normal name binding. Colliding on dummy names >> like "func" would then be like colliding on loop variables like "i" - >> typically harmless, because you don't use the names outside the >> constructs that define them anyway. > > That sounds reasonable. However, if we're binding the name > anyway, why not use it in the main clause instead of introducing > something weird like a lone 'def'? > > postdef x = weakref.ref(obj, report_destruction): > def report_destruction(obj): > print("{} is being destroyed".format(obj)) +1 ~Ethan~ From ethan at stoneleaf.us Fri Oct 14 19:09:26 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 14 Oct 2011 10:09:26 -0700 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <20111013134530.5b539ca0@pitrou.net> Message-ID: <4E986CC6.70804@stoneleaf.us> Nick Coghlan wrote: > The update to the PEP that I just pushed actually drops class > statement support altogether (at least for now), but if it was still > there, the above example would instead look more like: > > postdef x = property(class.get, class.set, class.delete) > class scope: > def get(self): > return __class__.attr > def set(self, val): > __class__.attr = val > def delete(self): > del __class__.attr > I like Greg's proposal to use the throw-away name: postdef x = property(scope.get, scope.set, scope.delete) class scope: def get(self): return __class__.attr def set(self, val): __class__.attr = val def delete(self): del __class__.attr ~Ethan~ From ethan at stoneleaf.us Fri Oct 14 19:13:07 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 14 Oct 2011 10:13:07 -0700 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E986CC6.70804@stoneleaf.us> References: <20111013134530.5b539ca0@pitrou.net> <4E986CC6.70804@stoneleaf.us> Message-ID: <4E986DA3.4080401@stoneleaf.us> Ethan Furman wrote: > Nick Coghlan wrote: >> The update to the PEP that I just pushed actually drops class >> statement support altogether (at least for now), but if it was still >> there, the above example would instead look more like: >> >> postdef x = property(class.get, class.set, class.delete) >> class scope: >> def get(self): >> return __class__.attr >> def set(self, val): >> __class__.attr = val >> def delete(self): >> del __class__.attr >> > > > I like Greg's proposal to use the throw-away name: > > postdef x = property(scope.get, scope.set, scope.delete) > class scope: > def get(self): > return __class__.attr > def set(self, val): > __class__.attr = val > def delete(self): > del __class__.attr Oh, DRY violation... drat. 
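(For comparison, a sketch of how the same property wiring has to be
spelled today, with the temporary accessor names cleaned up by hand --
the class and attribute names here are invented for illustration:

    class Widget:
        def _get(self):
            return self._attr
        def _set(self, val):
            self._attr = val
        def _delete(self):
            del self._attr
        x = property(_get, _set, _delete)
        del _get, _set, _delete   # the manual cleanup the proposals would remove
)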
~Ethan~

From guido at python.org  Fri Oct 14 19:23:10 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 14 Oct 2011 10:23:10 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: 
References: <20111012163144.GB6393@pantoffel-wg.de>
	<4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
	<20111013122706.GH6393@pantoffel-wg.de>
Message-ID: 

We've been bikeshedding long enough. I propose to do the following to
range() in Python 3.3:

- add read-only attributes .start, .step, .stop
- add slicing such that it normalizes .stop to .start + the right
multiple of .step
- add __eq__ and __hash__ which compare by .start, .step, .stop

And no more.

-- 
--Guido van Rossum (python.org/~guido)

From anikom15 at gmail.com  Fri Oct 14 20:15:38 2011
From: anikom15 at gmail.com (Westley Martínez)
Date: Fri, 14 Oct 2011 11:15:38 -0700
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: <20111012163144.GB6393@pantoffel-wg.de>
References: <20111012163144.GB6393@pantoffel-wg.de>
Message-ID: <20111014181538.GA9255@Smoke>

On Wed, Oct 12, 2011 at 05:31:44PM +0100, Sven Marnach wrote:
> There are circumstances, for example in unit testing, when it might be
> useful to check if two range objects describe the same range.
> Currently, this can't be done using the '==' operator:
>
>     >>> range(5) == range(5)
>     False
>
> To get a useful comparison, you would either need to realise both
> range objects as lists or use a function like
>
>     def ranges_equal(r0, r1):
>         if not r0:
>             return not r1
>         return len(r0) == len(r1) and r0[0] == r1[0] and r0[-1] == r1[-1]
>
> All other built-in sequence types (that is bytearray, bytes, list,
> str, and tuple) define equality by "all items of the sequence are
> equal". I think it would be both more consistent and more useful if
> range objects would pick up the same semantics.
>
> When implementing '==' and '!=' for range objects, it would be natural
> to implement the other comparison operators, too (lexicographically,
> as for all other sequence types). This change would be backwards
> incompatible, but I very much doubt there is much code out there
> relying on the current behaviour of considering two ranges as unequal
> just because they aren't the same object (and this code could be
> easily fixed by using 'is' instead of '==').
>
> Opinions?

I'm -32767 on this whole idea. It's not obvious what comparing a range
actually means, because a range is very abstract. If a developer needs
to compare ranges they should write their own functions to do it in
whatever way that fits their case.

From sven at marnach.net  Fri Oct 14 20:16:09 2011
From: sven at marnach.net (Sven Marnach)
Date: Fri, 14 Oct 2011 19:16:09 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To: 
References: <20111012163144.GB6393@pantoffel-wg.de>
	<4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de>
	<20111013122706.GH6393@pantoffel-wg.de>
Message-ID: <20111014181608.GK6393@pantoffel-wg.de>

Guido van Rossum wrote on Fri, 14 Oct 2011, at 10:23:10 -0700:
> - add slicing such that it normalizes .stop to .start + the right
> multiple of .step

That's what slicing already does now. This kind of normalisation
still isn't enough to get an implementation of the sequence-based
definition of equality, though. We would also need to set the step
value to 1 in case the range has length 0 or 1.
(Not that I'd propose to do the latter -- I mention it just to make clear that the steps you suggest don't allow for comparison of ranges as sequences in any easier way than currently possible.) Cheers, Sven From mikegraham at gmail.com Fri Oct 14 20:28:01 2011 From: mikegraham at gmail.com (Mike Graham) Date: Fri, 14 Oct 2011 14:28:01 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> Message-ID: On Wed, Oct 12, 2011 at 1:58 PM, Mark Dickinson wrote: > On Wed, Oct 12, 2011 at 6:33 PM, Steven D'Aprano > wrote: > > I don't agree. Equality makes sense for ranges: two ranges are equal if > they > > have the same start, stop and step values. > > Hmm. I'm not sure that it's that clear cut. The other possible > definition is that two ranges are equal if they're equal as lists. > For equality and comparison, this should be the standard. range objects are sequences, and they should compare just like other sequences. If implemented at all, equality should be that they have the same items in the same order. If implemented at all, comparison should be lexicographic. It seems to me you'd need a really good reason to have behavior different from every other sequence. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Oct 14 20:28:46 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 14 Oct 2011 12:28:46 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E97E92D.7010608@canterbury.ac.nz> References: <4E9751AF.4070705@canterbury.ac.nz> <4E97E92D.7010608@canterbury.ac.nz> Message-ID: On Fri, Oct 14, 2011 at 1:47 AM, Greg Ewing wrote: > Nick Coghlan wrote: > >> So, keep the PEP 3150 syntax, but don't make the inner suite special >> aside from the out of order execution? > > That's right. If we're willing to accept the idea of the > def in a postdef statement binding a name in the surrounding > scope, then we've already decided that we don't care about > polluting the main scope -- and it would make PEP 3150 a > heck of a lot easier to both specify and implement. > > Having said that, I think there might be a way of > implementing PEP 3150 with scoping and all that isn't too > bad. > > The main difficulties seem to concern class scopes. > Currently they're kept in a dict while they're being > built, like function local scopes used to be before > they were optimised. > > We could change that so that they're compiled in the > same way as a normal function scope, and then use the > equivalent of locals() at the end to build the class > dict. The body of a 'given' statement could then be > compiled as a separate function with access to the > class scope. Nested 'def' and 'class' statements, on > the other hand, would be compiled with the surrounding > scope deliberately excluded. > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > Perhaps we should avenge 3150's mortal wounding then. I'd vote for this in-order usage: given: a = 1 b = 2 def f(a=given.a, b=given.b): ... Notice that the contents of the "given" block are exposed on a special "given" name; the name would only be available in the statement attached to the given clause. 
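(A rough emulation of those semantics with today's syntax, as a
sketch: a class body plays the role of the one-shot namespace, with
the explicit name and the del standing in for what the proposal would
make implicit:

    class given_:          # hypothetical stand-in for the anonymous block
        a = 1
        b = 2

    def f(a=given_.a, b=given_.b):
        return a + b

    del given_             # the namespace is gone; f keeps its defaults
    print(f())             # -> 3
)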
The idea of a one-shot anonymous block is a part of the PEP 403 idea
that really struck me. (Perhaps it was already implicit in the 3150
concept, but this thread sold it for me.)

It seems like the CPython implementation wouldn't be that tricky but
I'm _sure_ I'm missing something. Make the given block compile like a
class body, but stop there and locally bind the temporary "given" to
that resulting dictionary. Any use of "given." would be treated as a
key lookup on the dictionary. In the case of function definitions with
a given clause, don't close on "given". Instead, make a cell for each
"given." used in the function body and tie each use to that cell. This
will help to avoid superfluous lookups on "given".

I like the in-order variant because it's exactly how you would do it
now, without the cleanup after:

a = 1
b = 2
def f(a=a, b=b):
    ...
del a, b

This reminds me of how the with statement pulled the "after" portion
out, and of decorators. Not only that, but when you write a module,
don't you write the most dependent statements (definitions even) at
the bottom? First you'll write the part you care about. Then you'll
write the dependent code _above_, often because of execution (or
definition) order dependencies. It's like Greg said (below), though my
understanding is that the common convention is to put the factored-out
code above, not below. If I'm wrong then I would gladly hear about it.

Greg Ewing said:
> Nick Coghlan wrote:
>> So is the "inline vs given statement" question really any more scary
>> than the "should I factor this out into a function" question?
>
> Especially since the factored-out functions could be
> written either before or after the place where they're
> used, maybe not even on the same page, and maybe not
> even in the same source file...

Also, the in-order given statement is easy to follow when reading
code, while the post-order one is less so. Here are some examples
refactored from other posts in this thread (using in-order):

given:
    def report_destruction(obj):
        print("{} is being destroyed".format(obj))
x = weakref.ref(obj, given.report_destruction)

given:
    def pointless():
        """A pointless worker thread that does nothing
        except serve as an example"""
        print("Starting")
        time.sleep(3)
        print("Ending")
t = threading.Thread(target=given.pointless); t.start()

given:
    len = len
    def double_len(x):
        print(x)
        return 2 * given.len(x)

given:
    part_a = source / 2
    part_b = source ** 2
    majiger = blender(part_a, part_b)
    thingie = doohickie(majiger)
result = flange(given.thingie)

# or

given:
    given:
        given:
            part_a = source / 2
            part_b = source ** 2
        majiger = blender(given.part_a, given.part_b)
    thingie = doohickie(given.majiger)
result = flange(given.thingie)

When writing these, you would normally just write the statement, and
then fill in the blanks above it. Using "given", if you want to
anonymize any statement on which the original depends, you stick it in
the given clause. It stays in the same spot that you had it before you
put a given clause. It's where I would expect to find it: before it
gets used.

Here are some unknowns that I see in the idea, which have been brought
up before:

1. How would decorators mix with given clauses on function/class
definitions? (maybe disallow?)
2. How could you introspect the code inside the given clause? (same as
code in a function body?)
3. Would it make sense to somehow inspect the actual anonymous
namespace that results from the given clause?
4. For functions, would there be an ambiguity to resolve?
For that last one, take a look at this example: given: x = 1 def f(): given: x = 2 return given.x My intuition is that the local given clause would supersede the closed one (i.e. the outer one would be rendered unused code and no closure would have been compiled for it). However, I could also see this as resulting in a SyntaxError. Nick Coghlan wrote: > If we want to revert back to using > an indented suite, than I think it makes more sense to go all the way > back to PEP 3150 and discuss the relative importance of "out of order > execution" and "private scope to avoid namespace pollution". I'm just not seeing that relative importance. Nick, you've mentioned it on several occasions. Is it the following, which you mentioned in the PEP? Python's demand that the function be named and introduced before the operation that needs it breaks the developer's flow of thought. I know for a fact that Nick knows a lot more than me (and has been at this a lot longer), so I assume that I'm missing something here. The big advantage of the post-order given statement, that I see, is that you can do a one-liner: x = [given.len(i) for i in somebiglist] given: len = len vs. given: len = len x = [given.len(i) for i in somebiglist] Nick said: > Interestingly, the main thing I'm getting out of this discussion is > more of an idea of why PEP 3150 has fascinated me for so long. I > expect the outcome is going to be that 403 gets withdrawn and 3150 > resuscitated :) So far this thread has done the same for me. I like where 403 has taken us. The relationship between decorators and the PEP 403 syntax is also a really clever connection! Very insightful too (a frame's stack as a one-off pseudo-scope). And I agree with the sentiments of Nick's expectation. :) My preference is for PEP 3150, a case for which I hope I've made above. And, after thinking about it, I like the simplicity of PEP 3150 better. It has more of that "executable pseudo-code" feel. While explaining PEP 403 to a co-worker (without much Python under his belt), I had to use the 3150 syntax to explain to him how the 403 syntax worked and what it meant. He found the 3150 syntax much easier to read and understand. "Why don't you just use _that_?", he asked. Incidentally, he said the in-order variant is easier to read. ;) -eric From guido at python.org Fri Oct 14 20:28:47 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 14 Oct 2011 11:28:47 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: <20111014181608.GK6393@pantoffel-wg.de> References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> <20111014181608.GK6393@pantoffel-wg.de> Message-ID: On Fri, Oct 14, 2011 at 11:16 AM, Sven Marnach wrote: > Guido van Rossum schrieb am Fr, 14. Okt 2011, um 10:23:10 -0700: >> - add slicing such that it normalizes .stop to .start + the right >> multiple of .step > > That's what slicing already does now. I guess I tested with an old version of Python 3... > This kind of normalisation > still isn't enough to get an implementation of the sequence-based > definition of equality, though. ?We would also need to set the step > value to 1 in case the range has length 0 or 1. ?(Not that I'd propose > to do the latter -- I mention it just to make clear that the steps you > suggest don't allow for comparison of ranges as sequences in any > easier way than currently possible.) Ok, so we don't have anything to do for slices. 
It doesn't change my opinion on the rest. -- --Guido van Rossum (python.org/~guido) From jimjjewett at gmail.com Fri Oct 14 21:01:03 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 14 Oct 2011 15:01:03 -0400 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <4E97E92D.7010608@canterbury.ac.nz> Message-ID: On Fri, Oct 14, 2011 at 2:28 PM, Eric Snow wrote: > On Fri, Oct 14, 2011 at 1:47 AM, Greg Ewing wrote: >> Nick Coghlan wrote: >>> So, keep the PEP 3150 syntax, but don't make the inner suite special >>> aside from the out of order execution? > ? given: > ? ? ? a = 1 > ? ? ? b = 2 > ? def f(a=given.a, b=given.b): > ? ? ? ... A few years ago, I would have liked this (except for the awkward repetition of given.*), because it would have felt like the function was cleaning up after itself. Even now, I like it, because the extra indentation make it clear that a and b group with f. Today, I do (or fail to do) that with vertical whitespace. But I'm not sure it answers the original concerns. > ... ? First you'll write that part you care about. ?Then > you'll write the dependent code _above_, It is precisely that going-back-up that we want to avoid, for readers as well as for writers. > Also, the in-order given statement is easy to following when reading > code, while the post-order one is less so. I like a Table of Contents. Sometimes I like an outline. I can follow a bottom-up program, but it is usually better if I can first see "Here's the main point", *followed*by* "Here's how to do it." > 1. How would decorators mix with given clauses on function/class > definitions? ?(maybe disallow?) I would assume that they work just like they do now -- put them on the line right before the definition, at the same indent level. That would be after the given clause semantically, and also literally if the given clause happens before the def. Having decorators come outside the given clause would be a problem, because the given statements don't always return anything, let alone the right function. Banning decorators would seem arbitrary enough that I would want to say "hmm, this isn't ready yet." > 2. How could you introspect the code inside the given clause? (same as > code in a function body?) Either add a .given node to the function object, or treat it like any other module-level (or class-level) code outside a function. > ? ?Python's demand that the function be named and introduced > ? ?before the operation that needs it breaks the developer's flow > ? ?of thought. I think of this a bit like C function prototypes and header files. You get used to the hassle of having to repeat the prototypes, but it is still a bad thing. Having the prototypes as a quick Table Of Contents or Index, on the other hand, is a good thing. > big advantage of the post-order given statement, that I see, is that > you can do a one-liner: Nah; if it really is a one-liner, then moving the definition up a line isn't that expensive. The big advantage is when something is a level of detail that you don't want to focus on yet, but it is also (textually) big enough to cause a mental interruption. 
-jJ From alexander.belopolsky at gmail.com Fri Oct 14 21:52:18 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 14 Oct 2011 15:52:18 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum wrote: .. > - add read-only attributes .start, .step, .stop > - add slicing such that it normalizes .stop to .start + the right > multiple of .step > - add __eq__ and __hash__ which compare by .start, .step, .stop -1 I did not see a clear statement of a use-case for any of these features. I could imagine convenience of __eq__ for those used to range() returning a list, but comparing by .start, .step, .stop would destroy this convenience. If you need an object with .start, .step, .stop, we already have the slice object. NumPy has some functionality to create a regular sequence from a slice object. I don't see why someone would need a __hash__. If you want to key some values by ranges, just use 3-tuples instead. From ethan at stoneleaf.us Fri Oct 14 22:10:59 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 14 Oct 2011 13:10:59 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: <4E989753.80509@stoneleaf.us> Guido van Rossum wrote: > We've been bikeshedding long enough. I propose to do the following to > range() in Python 3.3: > > - add read-only attributes .start, .step, .stop +1 > - add slicing such that it normalizes .stop to .start + the right > multiple of .step Already in place. > - add __eq__ and __hash__ which compare by .start, .step, .stop -1 --> lst1 = [x for x in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] if x % 3 == 0] --> lst2 = [x for x in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] if x % 3 == 0] --> lst3 = [x for x in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] if x % 3 == 0] --> lst1 [0, 3, 6, 9] --> lst2 [0, 3, 6, 9] --> lst3 [0, 3, 6, 9] --> lst1 == lst2 == lst3 True A range is a sequence -- why should identical sequences not compare equal? If you have a case where the start, stop, and step do make a difference then that should be the special case where you write your own custom code. Mike Graham wrote: > For equality and comparison, this should be the standard. range > objects are sequences, and they should compare just like other > sequences. If implemented at all, equality should be that they have > the same items in the same order. +1 ~Ethan~ From guido at python.org Fri Oct 14 22:43:23 2011 From: guido at python.org (Guido van Rossum) Date: Fri, 14 Oct 2011 13:43:23 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: <4E989753.80509@stoneleaf.us> References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> <4E989753.80509@stoneleaf.us> Message-ID: On Fri, Oct 14, 2011 at 1:10 PM, Ethan Furman wrote: > A range is a sequence -- why should identical sequences not compare equal? There's no such convention anywhere in Python. (1, 2) != [1, 2]. collections/abc.py does not define __eq__ for sequences. Have you personally written code that compares ranges? 
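(For concreteness, a sketch of the two candidate semantics being
argued over; eq_by_definition assumes the read-only .start/.stop/.step
attributes Guido proposes, which current ranges do not yet expose:

    def eq_as_sequence(r1, r2):      # "same items in the same order"
        return tuple(r1) == tuple(r2)

    def eq_by_definition(r1, r2):    # "same start/stop/step description"
        return (r1.start, r1.stop, r1.step) == (r2.start, r2.stop, r2.step)

    a, b = range(0), range(2, 2, 3)  # two empty ranges
    print(eq_as_sequence(a, b))      # True
    print(eq_by_definition(a, b))    # False
)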
-- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Fri Oct 14 22:58:44 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 14 Oct 2011 13:58:44 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> <4E989753.80509@stoneleaf.us> Message-ID: <4E98A284.90507@stoneleaf.us> Guido van Rossum wrote: > On Fri, Oct 14, 2011 at 1:10 PM, Ethan Furman wrote: >> A range is a sequence -- why should identical sequences not compare equal? > > There's no such convention anywhere in Python. (1, 2) != [1, 2]. > collections/abc.py does not define __eq__ for sequences. Okay, add in 'of the same type'. > Have you personally written code that compares ranges? I have not. Nevertheless, I expect a sequence object of type SomeType that returns the identical items, in the same order, as another sequence object of type SomeType to compare equal to that other sequence object no matter how the two objects happened to be created. ~Ethan~ From ericsnowcurrently at gmail.com Fri Oct 14 23:23:36 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 14 Oct 2011 15:23:36 -0600 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <4E97E92D.7010608@canterbury.ac.nz> Message-ID: On Fri, Oct 14, 2011 at 1:01 PM, Jim Jewett wrote: > On Fri, Oct 14, 2011 at 2:28 PM, Eric Snow wrote: >> On Fri, Oct 14, 2011 at 1:47 AM, Greg Ewing wrote: >>> Nick Coghlan wrote: > >>>> So, keep the PEP 3150 syntax, but don't make the inner suite special >>>> aside from the out of order execution? > >> ? given: >> ? ? ? a = 1 >> ? ? ? b = 2 >> ? def f(a=given.a, b=given.b): >> ? ? ? ... > > A few years ago, I would have liked this (except for the awkward > repetition of given.*), because it would have felt like the function > was cleaning up after itself. > > Even now, I like it, because the extra indentation make it clear that > a and b group with f. ?Today, I do (or fail to do) that with vertical > whitespace. > > But I'm not sure it answers the original concerns. > >> ... ? First you'll write that part you care about. ?Then >> you'll write the dependent code _above_, > > It is precisely that going-back-up that we want to avoid, for readers > as well as for writers. I see what you mean. And when you are interacting with an object from an outer scope then it doesn't matter where the object was defined, as long as it happened execution-wise before you go to use it. However, in the same block, the object you are going to use must be defined _above_, since execution flows down. For example, a module will have "if __name__ == "__main__" at the bottom, if it has one. I guess I'm just used to code generally following that same pattern, even when not semantically necessary. In-order just _seems_ so natural and easy to understand. But on the other hand, I was able to figure out other out-of-order syntax in Python... Perhaps I simply need to adjust to out-of-order syntax. Out-of-order doesn't feel nearly as natural to me, but that may only be the consequence of my experience and not reflect its value. Presently I can only defer to others with more experience here to make a case for or against it. In the meantime, the in-order variant seems to me to be the better choice. 
> >> Also, the in-order given statement is easy to following when reading >> code, while the post-order one is less so. > > I like a Table of Contents. ?Sometimes I like an outline. ?I can > follow a bottom-up program, but it is usually better if I can first > see "Here's the main point", *followed*by* "Here's how to do it." > >> 1. How would decorators mix with given clauses on function/class >> definitions? ?(maybe disallow?) > > I would assume that they work just like they do now -- put them on the > line right before the definition, at the same indent level. ?That > would be after the given clause semantically, and also literally if > the given clause happens before the def. > > Having decorators come outside the given clause would be a problem, > because the given statements don't always return anything, let alone > the right function. > > Banning decorators would seem arbitrary enough that I would want to > say "hmm, this isn't ready yet." > >> 2. How could you introspect the code inside the given clause? (same as >> code in a function body?) > > Either add a .given node to the function object, or treat it like any > other module-level (or class-level) code outside a function. > >> ? ?Python's demand that the function be named and introduced >> ? ?before the operation that needs it breaks the developer's flow >> ? ?of thought. > > I think of this a bit like C function prototypes and header files. > You get used to the hassle of having to repeat the prototypes, but it > is still a bad thing. ?Having the prototypes as a quick Table Of > Contents or Index, on the other hand, is a good thing. > Code as a table of contents...hadn't considered it. It's an interesting idea. For large modules I've seen it mostly done in the module docstring, rather than in code. When I need a table of contents for some code I'll usually use dir(), help(), dedicated documentation, or docstrings directly (in that order). When I code I usually keep the number of objects in any given block pretty small, thus it's easy to scan through. But with other (equally/more valid) coding styles than mine (and other languages) I can see where code as a table of contents at the beginning would be helpful. >> big advantage of the post-order given statement, that I see, is that >> you can do a one-liner: > > Nah; if it really is a one-liner, then moving the definition up a line > isn't that expensive. ?The big advantage is when something is a level > of detail that you don't want to focus on yet, but it is also > (textually) big enough to cause a mental interruption. That mental interruption is definitely to be avoided. In those cases I typically factor that code out into a function. Then the high-level execution is obvious in one place. I'm not seeing how a given statement (of either variant) would be helpful there; and I don't see how it would be an interruption in places I would use it. Thanks for the insights. I definitely want to get on the same page (either way) with those promoting the post-order given statement. I'm glad that Nick wrote PEP 403 because it's gotten me thinking about this. Regardless of the variant, I think the idea of one-off anonymous namespaces is worth figuring out. -eric > > -jJ > From greg.ewing at canterbury.ac.nz Sat Oct 15 02:02:19 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Oct 2011 13:02:19 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' 
to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <4E97E92D.7010608@canterbury.ac.nz> Message-ID: <4E98CD8B.1000706@canterbury.ac.nz> Eric Snow wrote: > I'd > vote for this in-order usage: > > given: > a = 1 > b = 2 > def f(a=given.a, b=given.b): > ... You seem to be taking the view that hiding the names in an inner scope is the most important part of PEP 3150, but I've never seen it that way. > I like the in-order variant because it's exactly how you would do it > now, without the cleanup after: > > a = 1 > b = 2 > def f(a=a, b=b): > ... > del a, b But the assumption that "in-order is better" is the very thing under dispute. The entire reason for the existence of PEP 3150 and/or 403 is that some people believe it's not always true. > Also, the in-order given statement is easy to following when reading > code, while the post-order one is less so. I suspect we may be misleading ourselves by looking at artificially meaningless examples. If meaningful names are chosen for the intermediate values, I think it can help readability to put the top level first and relegate the details to an indented block. gross_value = net_value + gst_amount given: gst_amount = net_value * (gst_rate / 100.0) I can't see how it hurts to put the subsidiary details after the big picture, any more than it hurts to write your main() function at the top and other things it calls after it. -- Greg From ron3200 at gmail.com Sat Oct 15 02:46:07 2011 From: ron3200 at gmail.com (Ron Adam) Date: Fri, 14 Oct 2011 19:46:07 -0500 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <4FB2E81A-D407-4D97-BC87-6B141404F093@gmail.com> Message-ID: <1318639567.2215.11.camel@Gutsy> On Fri, 2011-10-14 at 13:38 +1000, Nick Coghlan wrote: > All PEP 3150 is allowing you to do is indent stuff that could > potentially be factored out into a function at some point, without > forcing you to factor it out *right now*: > > result = flange(thingie) given: > thingie = doohickie(majiger) given: > part_a = source / 2 > part_b = source ** 2 > majiger = blender(part_a, part_b) I don't see it as much advantage for just a few lines of code where you can see the whole thing at once both visually and mentally. # determine source ... # Get flange result part_a = source / 2 part_b = source ** 2 majiger = blender(part_a, part_b) thingie = doohickie(majiger) result = flange(thingie) # do something with flange result ... With comments, I find that it's just as readable, mostly because it's a small enough chunk that the order isn't all that important. If the block of code was one used over again outside that context it would be factored out and you would have this. You could factor it out anyway if it helps make the surrounding code more efficient. # Determine source. ... # Get flange result = get_flange(source) # Do something with flange result ... For very large blocks of code, you might end up with many levels of extra indentation depending on how much nesting there is. To be clear. It's not a bad concept. It's a very nice way to express some problems. I'm just not sure it's needed in Python. On the other hand... The waiting for a block to complete to continue a statement reminds me of futures. http://en.wikipedia.org/wiki/Future_%28programming% 29#Implicit_vs_explicit It would be nice if something like this was also compatible with micro threading. I bet it would be a sure win-win if we could get that from this. 
(or at least a possibility of it.) Cheers, Ron From tjreedy at udel.edu Sat Oct 15 04:07:16 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 14 Oct 2011 22:07:16 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On 10/14/2011 1:23 PM, Guido van Rossum wrote: > We've been bikeshedding long enough. I propose to do the following to > range() in Python 3.3: > > - add read-only attributes .start, .step, .stop > - add slicing such that it normalizes .stop to .start + the right > multiple of .step > - add __eq__ and __hash__ which compare by .start, .step, .stop I have sometimes thought that we should unify slice and range objects by either adding .__iter__ to slice objects or adding the necessary attributes to range objects. The proposal above comes close to doing the latter. I presume that slice.__eq__ does what you propose for range. All that would be missing from range is the slice.indices method. Both range and slice objects represent a virtual subseqeunce of ints with the same three attributes. We use one to explicitly iterate. We use the other to select subsequences with an internal iteration. If range.stop were allowed to be None, as is slice.stop, we also would not need itertools.count, which is the third way we represent a virtual stepped subsequence of ints. -- Terry Jan Reedy From anacrolix at gmail.com Sat Oct 15 07:19:50 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Sat, 15 Oct 2011 16:19:50 +1100 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: +0 =) On Sat, Oct 15, 2011 at 1:07 PM, Terry Reedy wrote: > On 10/14/2011 1:23 PM, Guido van Rossum wrote: >> >> We've been bikeshedding long enough. I propose to do the following to >> range() in Python 3.3: >> >> - add read-only attributes .start, .step, .stop >> - add slicing such that it normalizes .stop to .start + the right >> multiple of .step >> - add __eq__ and __hash__ which compare by .start, .step, .stop > > I have sometimes thought that we should unify slice and range objects by > either adding .__iter__ to slice objects or adding the necessary attributes > to range objects. The proposal above comes close to doing the latter. I > presume that slice.__eq__ does what you propose for range. All that would be > missing from range is the slice.indices method. > > Both range and slice objects represent a virtual subseqeunce of ints with > the same three attributes. We use one to explicitly iterate. We use the > other to select subsequences with an internal iteration. > > If range.stop were allowed to be None, as is slice.stop, we also would not > need itertools.count, which is the third way we represent a virtual stepped > subsequence of ints. 
> > -- > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ncoghlan at gmail.com Sat Oct 15 09:04:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 15 Oct 2011 17:04:54 +1000 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky wrote: > On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum wrote: > .. >> - add read-only attributes .start, .step, .stop >> - add slicing such that it normalizes .stop to .start + the right >> multiple of .step >> - add __eq__ and __hash__ which compare by .start, .step, .stop > > -1 > > I did not see a clear statement of a use-case for any of these > features. ?I could imagine convenience of __eq__ for those used to > range() returning a list, but comparing by .start, .step, .stop would > destroy this convenience. ?If you need an object with .start, .step, > .stop, we already have the slice object. ?NumPy has some functionality > to create a regular sequence from a slice object. ?I don't see why > someone would need a __hash__. ?If you want to key some values by > ranges, just use 3-tuples instead. The key point here is that you can *already* invoke '==' and 'hash()' on 3.x ranges - they just have useless identity based semantics. The proposal is merely to make the semantics less pointless for something you can already do. It's also a potential step in the ongoing evolution of ranges towards being more like an optimised tuple of integers (but see my final comment to Guido below). The question is how to define the equivalence classes. There are 3 possible sets of equivalence classes available. In order of increasing size, they are: 1. Identity based (status quo): each range object is equal only to itself 2. Definition based: range objects are equal if their start, stop and step values are equal 3. Behaviour based: range objects are equal if they produce the same sequence of values when iterated over Definitions 2 and 3 produce identical equivalence classes for all non-empty sequences with a step value of 1 (or -1). They only diverge when the sequence is empty or the magnitude of the step value exceeds 1. Under definition 3, all empty ranges form an equivalence class, so "range(1, 1) == range(2, 2)", just like "(0, 1, 2)[1:1] == (0, 1, 2)[2:2]". Under definition 2, the start/stop/step values matter. Under definition 3, all ranges that produces the same output (e.g. just their start value) form an equivalence class, so "range(1, 2, 2) == range(1, 0, -2)" just like "(0, 1, 2)[1:2:2] == (0, 1, 2)[1:0:-2]". As with empty ranges, under definition 2, the start/stop/step values matter. 
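(Both divergence cases are easy to verify by realising the ranges; a
quick sketch:

    print(tuple(range(1, 1)) == tuple(range(2, 2)))         # True: both ()
    print(tuple(range(1, 2, 2)) == tuple(range(1, 0, -2)))  # True: both (1,)
)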
I'll note that under definition 3 (but with start/stop/step exposed), it is easy and intuitive to implement definition 2 semantics: "(lhs.start, lhs.stop, lhs.step) == (rhs.start, rhs.stop, rhs.step)"

By contrast, under definition 2, implementing definition 3 requires the same contortions as it does now: "len(lhs) == len(rhs) and lhs[0:1] == rhs[0:1] and lhs[-1:] == rhs[-1:]"

Guido, I know you wanted to kill this discussion by declaring that definition 2 was the way to go, but I *like* the fact that we've been moving towards a "memory efficient tuple of regularly spaced integers" interaction model for 3.x range objects, and comparison semantics based on exact start/stop/step values would be a definitive break from that model.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Oct 15 09:52:17 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 15 Oct 2011 17:52:17 +1000 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID:

On Sat, Oct 15, 2011 at 12:07 PM, Terry Reedy wrote:
> If range.stop were allowed to be None, as is slice.stop, we also would not
> need itertools.count, which is the third way we represent a virtual stepped
> subsequence of ints.

No, you don't want to do that - being finite is an important property of range objects. The thing about slice objects is that they're deliberately incomplete - you need to supply a sequence length in order to "realise" them. This is done via slice.indices(container_len)

Now, *there's* a powerful use case in favour of making 3.x range behave just like a tuple of integers: we could update slice.indices() to return one of those instead of wastefully creating the full tuple of indices in memory the way it does now.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ubershmekel at gmail.com Sat Oct 15 12:58:37 2011 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Sat, 15 Oct 2011 06:58:37 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID:

On Oct 15, 2011 3:08 AM, "Nick Coghlan" wrote:
>
> On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky
> wrote:
> > On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum wrote:
> > ..
> >> - add read-only attributes .start, .step, .stop
> >> - add slicing such that it normalizes .stop to .start + the right
> >> multiple of .step
> >> - add __eq__ and __hash__ which compare by .start, .step, .stop
> >
> > -1
> >
> > I did not see a clear statement of a use-case for any of these
> > features. I could imagine convenience of __eq__ for those used to
> > range() returning a list, but comparing by .start, .step, .stop would
> > destroy this convenience. If you need an object with .start, .step,
> > .stop, we already have the slice object. NumPy has some functionality
> > to create a regular sequence from a slice object. I don't see why
> > someone would need a __hash__. If you want to key some values by
> > ranges, just use 3-tuples instead.
>
> The key point here is that you can *already* invoke '==' and 'hash()'
> on 3.x ranges - they just have useless identity based semantics.
The > proposal is merely to make the semantics less pointless for something > you can already do. > > It's also a potential step in the ongoing evolution of ranges towards > being more like an optimised tuple of integers (but see my final > comment to Guido below). > > The question is how to define the equivalence classes. There are 3 > possible sets of equivalence classes available. In order of increasing > size, they are: > > 1. Identity based (status quo): each range object is equal only to itself > 2. Definition based: range objects are equal if their start, stop and > step values are equal > 3. Behaviour based: range objects are equal if they produce the same > sequence of values when iterated over > Option 4 would be to kill the __eq__ method for range objects. I think 3 or 4 are the best alternatives. Since we still haven't seen any use cases, option #4 seems like the bikeshed stopper. --Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Oct 15 16:28:05 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2011 00:28:05 +1000 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: I have a use case now: switching slice.indices() to return a range object instead of a tuple. That heavily favours the 'behave like a sequence' approach. -- Nick Coghlan (via Gmail on Android, so likely to be more terse than usual) On Oct 15, 2011 8:58 PM, "Yuval Greenfield" wrote: > > On Oct 15, 2011 3:08 AM, "Nick Coghlan" wrote: > > > > On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky > > wrote: > > > On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum > wrote: > > > .. > > >> - add read-only attributes .start, .step, .stop > > >> - add slicing such that it normalizes .stop to .start + the right > > >> multiple of .step > > >> - add __eq__ and __hash__ which compare by .start, .step, .stop > > > > > > -1 > > > > > > I did not see a clear statement of a use-case for any of these > > > features. I could imagine convenience of __eq__ for those used to > > > range() returning a list, but comparing by .start, .step, .stop would > > > destroy this convenience. If you need an object with .start, .step, > > > .stop, we already have the slice object. NumPy has some functionality > > > to create a regular sequence from a slice object. I don't see why > > > someone would need a __hash__. If you want to key some values by > > > ranges, just use 3-tuples instead. > > > > The key point here is that you can *already* invoke '==' and 'hash()' > > on 3.x ranges - they just have useless identity based semantics. The > > proposal is merely to make the semantics less pointless for something > > you can already do. > > > > It's also a potential step in the ongoing evolution of ranges towards > > being more like an optimised tuple of integers (but see my final > > comment to Guido below). > > > > The question is how to define the equivalence classes. There are 3 > > possible sets of equivalence classes available. In order of increasing > > size, they are: > > > > 1. Identity based (status quo): each range object is equal only to itself > > 2. Definition based: range objects are equal if their start, stop and > > step values are equal > > 3. 
Behaviour based: range objects are equal if they produce the same > > sequence of values when iterated over > > > > Option 4 would be to kill the __eq__ method for range objects. > > I think 3 or 4 are the best alternatives. Since we still haven't seen any > use cases, option #4 seems like the bikeshed stopper. > > --Yuval > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Oct 15 18:47:00 2011 From: guido at python.org (Guido van Rossum) Date: Sat, 15 Oct 2011 09:47:00 -0700 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Sat, Oct 15, 2011 at 12:04 AM, Nick Coghlan wrote: > On Sat, Oct 15, 2011 at 5:52 AM, Alexander Belopolsky > wrote: >> On Fri, Oct 14, 2011 at 1:23 PM, Guido van Rossum wrote: >> .. >>> - add read-only attributes .start, .step, .stop >>> - add slicing such that it normalizes .stop to .start + the right >>> multiple of .step >>> - add __eq__ and __hash__ which compare by .start, .step, .stop >> >> -1 >> >> I did not see a clear statement of a use-case for any of these >> features. ?I could imagine convenience of __eq__ for those used to >> range() returning a list, but comparing by .start, .step, .stop would >> destroy this convenience. ?If you need an object with .start, .step, >> .stop, we already have the slice object. ?NumPy has some functionality >> to create a regular sequence from a slice object. ?I don't see why >> someone would need a __hash__. ?If you want to key some values by >> ranges, just use 3-tuples instead. > > The key point here is that you can *already* invoke '==' and 'hash()' > on 3.x ranges - they just have useless identity based semantics. The > proposal is merely to make the semantics less pointless for something > you can already do. > > It's also a potential step in the ongoing evolution of ranges towards > being more like an optimised tuple of integers (but see my final > comment to Guido below). > > The question is how to define the equivalence classes. There are 3 > possible sets of equivalence classes available. In order of increasing > size, they are: > > 1. Identity based (status quo): each range object is equal only to itself > 2. Definition based: range objects are equal if their start, stop and > step values are equal > 3. Behaviour based: range objects are equal if they produce the same > sequence of values when iterated over > > Definitions 2 and 3 produce identical equivalence classes for all > non-empty sequences with a step value of 1 (or -1). They only diverge > when the sequence is empty or the magnitude of the step value exceeds > 1. > > Under definition 3, all empty ranges form an equivalence class, so > "range(1, 1) == range(2, 2)", just like "(0, 1, 2)[1:1] == (0, 1, > 2)[2:2]". Under definition 2, the start/stop/step values matter. > > Under definition 3, all ranges that produces the same output (e.g. > just their start value) form an equivalence class, so "range(1, 2, 2) > == range(1, 0, -2)" just like "(0, 1, 2)[1:2:2] == (0, 1, 2)[1:0:-2]". > As with empty ranges, under definition 2, the start/stop/step values > matter. 
> > I'll note that under definition 3 (but with start/stop/step exposed), > it is easy and intuitive to implement definition 2 semantics: > "lhs.start, lhs,stop, lhs.step == rhs.start, rhs.stop, rhs.step" > > By contrast, under definition 2, implementing definition 3 requires > the same contortions as it does now: "len(lhs) == len(rhs) and > lhs[0:1] == rhs[0:1] and lhs[-1:] == rhs[-1:]" > > Guido, I know you wanted to kill this discussion by declaring that > definition 2 was the way to go, but I *like* the fact that we've been > moving towards a "memory efficient tuple of regularly spaced integers" > interaction model for 3.x range objects, and comparison semantics > based on exact start/stop/step values would be a definitive break from > that model. Ok, you've convinced me on range() equality. If I want to compare the start/stop/step triple I can just extract those values and compare those. I remember in the past thinking about unifying slice() and range() and I couldn't do it. I still can't. I think they should remain separate. -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Sat Oct 15 19:04:50 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Oct 2011 13:04:50 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Sat, Oct 15, 2011 at 12:47 PM, Guido van Rossum wrote: .. > I remember in the past thinking about unifying slice() and range() and > I couldn't do it. I still can't. I think they should remain separate. One of the issues with slices is that they are deliberately made unhashable to prevent slicing of dictionaries. I am not sure range() objects need to be hashable. To me they are more like *lists* of equally spaced integers rather than *tuples*. (My reasons are not strong, but FWIW they are: (1) tuples are usually containers of heterogeneous objects and regular sequences are lists; and (2) 2.x range() (not xrange()) returns a list rather than a tuple.) On the other hand, making range() objects hashable will put an end to requests for writable .start, .stop, .step. From alexander.belopolsky at gmail.com Sat Oct 15 19:21:26 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Oct 2011 13:21:26 -0400 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: On Sat, Oct 15, 2011 at 10:28 AM, Nick Coghlan wrote: > I have a use case now: switching slice.indices() to return a range object > instead of a tuple. That heavily favours the 'behave like a sequence' > approach. I like the idea, but why is this a use-case for range.__eq__ or range.__hash__? I like the idea not because it will lead to any optimisation. Slice.indices() does not return a tuple containing all indices in a slice. The result is always a 3-tuple containing normalized start, stop, and step. A range object cannot be more efficient than a 3-tuple. I still like the idea because it would make indices() return what the name suggests - the sequence of indices selectable by the slice. 
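Both of Alexander's observations are quick to verify interactively. A short sketch (note that slice objects only became hashable much later, in Python 3.12):

    >>> slice(0, 10, 2).indices(6)    # always a normalised 3-tuple
    (0, 6, 2)
    >>> hash(slice(1, 10, 2))
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'slice'
    >>> d = {}
    >>> d[1:10:2] = 'x'        # dict "slicing" fails for the same reason
    Traceback (most recent call last):
      ...
    TypeError: unhashable type: 'slice'
    >>> d[1, 10, 2] = 'x'      # keying by a 3-tuple works fine
    >>> d
    {(1, 10, 2): 'x'}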
From greg.ewing at canterbury.ac.nz Sun Oct 16 01:39:34 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 16 Oct 2011 12:39:34 +1300 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: <4E9A19B6.1030300@canterbury.ac.nz> Alexander Belopolsky wrote: > (1) tuples are usually containers of > heterogeneous objects and regular sequences are lists; and (2) 2.x > range() (not xrange()) returns a list rather than a tuple.) But, conceptually, hashability has nothing to do with the homogeneous/heterogeneous distinction. The fact that tuples conflate them is a historical oddity. -- Greg From ericsnowcurrently at gmail.com Sun Oct 16 07:20:12 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 15 Oct 2011 23:20:12 -0600 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) Message-ID: On Fri, Oct 14, 2011 at 12:28 PM, Eric Snow wrote: > I know for a fact that Nick knows a lot more than me (and has been at > this a lot longer), so I assume that I'm missing something here. ?The > big advantage of the post-order given statement, that I see, is that > you can do a one-liner: > > ? x = [given.len(i) for i in somebiglist] given: len = len > > vs. > > ? given: len = len > ? x = [given.len(i) for i in somebiglist] After Nick's update to PEP 3150 I saw the post-order light (sort of). If you restrict the given clause to just simple statements, as the PEP does, the post-order variant actually makes more sense. The given clause for simple statements is like giving a suite to all the statements that don't have one[1]. The original statement is then the header for the subsequent block. I like that. If the new syntax were exclusive to simple statements then that's a good fit. I still prefer the in-order variant for compound statements though (they already have their own suite). If PEP 3150 were to march ahead with post-order, we probably couldn't add in-order given clauses for compound statements later, could we? Does it matter? -eric [1] http://mail.python.org/pipermail/python-ideas/2011-April/009891.html From ncoghlan at gmail.com Sun Oct 16 09:16:40 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2011 17:16:40 +1000 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) In-Reply-To: References: Message-ID: Since I didn't get around to posting my own announcement email, I'll just note this here: As suggested by Eric's change to the subject line, I've now withdrawn the short-lived PEP 403, revived PEP 3150 (statement local namespaces) and updated it based on the feedback received in relation to PEP 403. The current draft of PEP 3150 is available on python.org: http://www.python.org/dev/peps/pep-3150/ On Sun, Oct 16, 2011 at 3:20 PM, Eric Snow wrote: > If the new syntax were exclusive to simple statements then that's a > good fit. ?I still prefer the in-order variant for compound statements > though (they already have their own suite). ?If PEP 3150 were to march > ahead with post-order, we probably couldn't add in-order given clauses > for compound statements later, could we? ?Does it matter? 
Not really, because you can embed arbitrary compound statements inside a PEP 3150 style "given" clause if you really want to.

If we ever added "given" clauses to compound statements, I'd actually suggest we do it selectively in the respective header lines, assigning scoping semantics that are appropriate for the affected statements. For example:

    # Embedded assignments in if statements
    if match is not None given match=re.search(pattern, text):
        # process match
    else:
        # handle case where match is None

    # Embedded assignments in while loops
    while match is not None given match=re.search(pattern, text):
        # process match
    else:
        # handle case where match is None

    # Shared state for functions
    def accumulator(x) given tally=0:
        tally += x
        return tally

We may also decide to eliminate the "new scope" implications for 'given' statements entirely and focus solely on the "out of order execution" aspect. That would not only simplify the implementation, but also make for a cleaner extension to the compound statement headers.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Sun Oct 16 13:01:56 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 16 Oct 2011 22:01:56 +1100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> Message-ID: <4E9AB9A4.9090905@pearwood.info>

Nick Coghlan wrote:
> The named function version fails because it gets things out of order:
>
>     def sort_key(item):
>         return item.attr1, item.attr2
>
>     sorted_list = sorted(original, key=sort_key)
>
> That's more like pseudo code for "First, define a function that
> returns an object's attr1 and attr2 values. Then use that function to
> sort our list", a far cry from the originally requested operation.

I disagree strongly that the above "fails" in any sense, or that it is out of order at all. In English, one might say: Given such-and-such a definition of sort_key, sorted_list is calculated from sorted(original, key=sort_key). In my opinion, it is *much* more natural to state your premises (the givens) first, rather than after the conclusion. First you catch the rabbit, then you make it into stew, not the other way around.

Furthermore, by encouraging (or forcing) the developer to define sort_key ahead of time, it encourages the developer to treat it seriously, as a real function, and document it and test it. If it's not tested, how do you know it does what you need?

[...]
> No, no, no - this focus on reusability is *exactly* the problem. It's
> why callback programming in Python sucks - we force people to treat
> one-shot functions as if they were reusable ones, *even when those
> functions aren't going to be reused in any other statement*.

That's not a bug, that's a feature. If you're not testing your callback, how do you know it does what you expect? Because it's so trivial that you can just "see" that it's correct? In that case, we have lambda, we don't need any new syntax.

> That's the key realisation that I finally came to in understanding the
> appeal of multi-line lambdas (via Ruby's block syntax): functions
> actually have two purposes in life. The first is the way they're
> traditionally presented: as a way to structure algorithms into
> reusable chunks, so you don't have to repeat yourself. However, the
> second is to simply hand over a section of an algorithm to be executed
> by someone else.
> You don't *care* about reusability in those cases -
> you care about handing the code you're currently writing over to be
> executed by some other piece of code. Python only offers robust syntax
> for the first use case, which is always going to cause mental friction
> when you're only interested in the latter aspect.

You have missed at least two more critical purposes for functions:

- Documentation: both adding documentation to functions, and self-documenting via the function name. There's little mystery about what function sort_key is *supposed* to do (although the name is a bit light on the details), while:

    :sorted_list = sorted(original, key=@)
        return item.attr1, item.attr2

is a magic incantation that is indecipherable unless you know the secret. (I realise that the above may not be your preferred or final suggested syntax.)

- And testing. If code isn't tested, you should assume it is buggy. In an ideal world, there should never be any such thing as code that's used once: it should always be used at least twice, once in the application and once in the test suite. I realise that in practice we often fall short of that ideal, but we don't need more syntax that *encourages* developers to fail to test non-trivial code blocks.

-- Steven

From ncoghlan at gmail.com Sun Oct 16 14:18:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 16 Oct 2011 22:18:35 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9AB9A4.9090905@pearwood.info> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> Message-ID:

On Sun, Oct 16, 2011 at 9:01 PM, Steven D'Aprano wrote:
> If it's not tested, how do you know it does what you need?

Because my script does what I want. Not every Python script is a production application. We lose sight of that point at our peril.

Systems administrators, lawyers, financial analysts, mathematicians, scientists... they're not writing continuously integrated production code, they're not writing reusable modules, they're writing scripts to *get things done*. Python's insistence that every symbol be defined before it can be referenced gets in the way of that.

Imagine how annoying Python would be if we got rid of while loops and insisted that anyone wanting a while loop write a custom iterator to use with a for loop instead. It's taken me a long time to realise it, but that's exactly how the lack of multi-line lambda support can feel. PEP 403 was an attempt to address *just* that problem with an approach inspired by Ruby's blocks. That discussion convinced me that such a terse, unintuitive idiom could never be made "Pythonic" - but there's still hope for the more verbose, yet also more expressive, statement local namespace concept.

Will there be times where someone writes a piece of code using a given statement, realises they need access to the internals in order to test them in isolation, and refactors the code accordingly? Absolutely, just as people take "if __name__ == '__main__'" blocks and move them into main functions to improve testability, break up large functions into smaller ones, add helper methods to classes, or take repeated list comprehensions and make a named function out of them.
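For reference, a minimal sketch of the '__main__' refactor Nick mentions - the same few lines made importable and therefore testable (load and summarise are hypothetical stand-ins):

    import sys

    def load(path):
        return [1, 2, 3]        # stand-in for real I/O

    def summarise(data):
        return sum(data)

    # Before: logic trapped inside the guard, invisible to tests.
    #     if __name__ == '__main__':
    #         print(summarise(load(sys.argv[1])))

    # After: a named function a test suite can call directly.
    def main(argv=None):
        argv = sys.argv[1:] if argv is None else list(argv)
        path = argv[0] if argv else 'data.txt'   # hypothetical default
        print(summarise(load(path)))

    if __name__ == '__main__':
        main()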
The problem with the status quo is that there are plenty of people (especially mathematicians and scientists) that *do* think declaratively, just as there are people that think in terms of functional programming rather than more typical imperative styles. Python includes constructs that are technically redundant with each other in order to allow people to express concepts in a way that makes sense to *them* (e.g. keeping functools.reduce around). Unless, that is, you think declaratively. Then Python isn't currently the language for you - go use Ruby or some other language with a construct that lets you supply additional details after the operation that needs them. To attack the question from a different perspective, how do I know that any loops in my code work? After all, my test code can't see my loops - it can only see the data structures they create and the functions that contain them. So the answer is because I tested data structures and I tested those functions. How could I test that a weakref callback did the right thing without direct access to the callback function? The same way I'd have to test it *anyway*, even if I *did* have direct access: by allocating an object, dropping all references to it, forcing a GC cycle. If we're not in the middle of deleting the object, testing the callback actually tells me nothing useful, since the application state is completely wrong. All that said... is your objection *only* to the "statement local" part of the proposal? That part is actually less interesting to me than the out of order execution part. However, our past experience with list comprehensions suggests that exposing the names bound inside the suite in the containing scope would be confusing in its own way. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mwm at mired.org Mon Oct 17 01:11:54 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 16 Oct 2011 16:11:54 -0700 Subject: [Python-ideas] Statement local namespaces - a different perspective Message-ID: <20111016161154.15b437fb@bhuda.mired.org> Last night, I realized something interesting about statement local namespaces: Pretty much all the arguments, both for and against, apply to allowing suites in compound statements. I take it as given that any suite in a statement can be refactored into a function definition and an appropriate invocation of that function, even though it may require returning a tuple into a tuple assignment. This is one of the key features of programming languages in general. So we could, in theory, restrict all compound statements suites to a single statement. If you needed more than one statement in them, you'd have to wrap the suite in a named function and invoke it appropriately. The benefits from this restriction would be the same as they are for disallowing statement-local namespaces: it makes the suite available for independent testing and reuse, forces the programmer to come up with a (hopefully) descriptive name, and forces parts to be defined before they are used. The downsides are the same problems that the statement-local namespace proposals are addressing: having to come up with a good name, polluting the namespace, and giving the parts more prominent placement than the whole. Python doesn't do this for compound statements because it's impractical. Not every arbitrary collection of statements that needs to be grouped in a suite has a descriptive name or needs to be reused. 
Putting them in a named function just means you should test the function as well as the statement. In such cases, it's better to just embed the suite into the statement, and Python allows that. I believe the same thing applies to values in an expression. Not every group of sub-expressions has a good name or needs to be reused. Putting them in a named function just means you should test the function as well as the statement. In such cases, the code would be better if you could just embed the suite into the statement. I think it's time Python allowed that. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From steve at pearwood.info Mon Oct 17 02:30:55 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 17 Oct 2011 11:30:55 +1100 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> Message-ID: <4E9B773F.3010501@pearwood.info> Nick Coghlan wrote: > On Sun, Oct 16, 2011 at 9:01 PM, Steven D'Aprano wrote: >> If it's not tested, how do you know it does what you need? > > Because my script does what I want. Not every Python script is a > production application. We lose sight of that point at our peril. > > Systems administrators, lawyers, financial analysts, mathematicians, > scientists... they're not writing continuously integrated production > code, they're not writing reusable modules, they're writing scripts to > *get things done*. "And it doesn't matter whether it's done correctly, so long as SOMETHING gets done!!!" I'm not convinced that requiring coders to write: given a, b c do f(a, b, c) instead of do f(a, b, c) given a, b, c gets in the way of getting things done. I argue that the second form encourages bad practices and we shouldn't add more features to Python that encourage such bad practices. I certainly don't think that people should be forced to test their code if they don't want to, and even if I did, it wouldn't be practical. [...] > All that said... is your objection *only* to the "statement local" > part of the proposal? That part is actually less interesting to me > than the out of order execution part. However, our past experience > with list comprehensions suggests that exposing the names bound inside > the suite in the containing scope would be confusing in its own way. No. Introducing a new scope is fine. Namespaces are cool, and we should have more of them. It's the out of order part that I dislike. Consider this: a = 1 b = 2 c = 3 result = some_function(a, b, c) given: d = "this isn't actually used" a = c b = 42 I see the call to some_function, I *think* I know what arguments it takes, because a b c are already defined, right? But then I see a "given" statement and the start of a new scope, and I have that moment of existential dread where I'm no longer sure of *anything*, any of a b c AND some_function could be replaced, and I won't know which until after reading the entire given block. At which point I can mentally back-track and finally understand the "result =" line. If a, b aren't previously defined, then I have a mental page fault earlier: "what the hell are a and b, oh wait, here comes a given statement, *perhaps* they're defined inside it...". Either way, out of order execution hurts readability. 
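Steven's scenario can be emulated today with an ordinary helper function playing the role of the given block. A sketch (his names kept); the point is that the block's bindings, not the visible a, b and c, decide what actually gets passed:

    a, b, c = 1, 2, 3

    def some_function(x, y, z):
        return (x, y, z)

    def _given_block():
        # Body of the hypothetical 'given:' suite.
        d = "this isn't actually used"
        a = c       # shadows the outer a within this scope
        b = 42      # shadows the outer b
        return some_function(a, b, c)

    result = _given_block()
    print(result)   # (3, 42, 3) - not the (1, 2, 3) the call site suggests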
There's a reason that even mathematicians usually define terms before using them. But for what it's worth, if we end up with this feature, I agree that it should introduce a new scope, and the "given" syntax is the nicest I've yet seen. -- Steven From jimjjewett at gmail.com Mon Oct 17 02:50:58 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 16 Oct 2011 20:50:58 -0400 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9B773F.3010501@pearwood.info> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID: On Sun, Oct 16, 2011 at 8:30 PM, Steven D'Aprano wrote: > Nick Coghlan wrote: >> On Sun, Oct 16, 2011 at 9:01 PM, Steven D'Aprano >> wrote: >>> If it's not tested, how do you know it does what you need? >> Because my script does what I want. ... >> they're writing scripts to *get things done*. > "And it doesn't matter whether it's done correctly, so long as > SOMETHING gets done!!!" So they have only system tests, and not unit tests. (Or at least not to the level you would recommend.) But if the system tests pass, that really is enough. (And yes, there could be bugs exposed later by data that didn't show up in the system tests -- but that is still better than most of the other software in that environment.) > I'm not convinced that requiring coders to write: > given a, b c > do f(a, b, c) > instead of > do f(a, b, c) > given a, b, c > gets in the way of getting things done. Nah; people can interrupt their train of thought to scroll up. Doing so doesn't improve their code, but it isn't a deal-breaker. Requiring that order does get in the way of reuse, because burying the lede [ do f(a,b,c) ] makes it a bit harder to find, so there is a stronger temptation to just fork another copy. -jJ From ncoghlan at gmail.com Mon Oct 17 03:23:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2011 11:23:54 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9B773F.3010501@pearwood.info> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID: On Mon, Oct 17, 2011 at 10:30 AM, Steven D'Aprano wrote: > Either way, out of order execution hurts readability. There's a reason that > even mathematicians usually define terms before using them. I think out of order execution hurts readability *most* of the time, but that not having it is an annoyance most frequently encountered in the form "Python should have multi-line lambdas". When people hit that mental break, they're thinking of a problem in terms of splitting a single operation down into subcomponents rather than building that statement up via a sequence of steps. I was thinking argparse might be a potential use case for the new syntax, and was reminded that it actually has its own somewhat novel approach to provide a 'declarative' interface: it offers methods that return 'incomplete' objects, which you then fill in after the fact. 
So, for example, instead of constructing a subparser [1] and then adding the whole thing to the parent parser, you instead write code like the following:

    # create the top-level parser
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--foo', action='store_true', help='foo help')

    # Declare that we're going to be adding subparsers
    subparsers = parser.add_subparsers(help='sub-command help')

    # One of those subparsers will be for the "a" command
    parser_a = subparsers.add_parser('a', help='a help')
    parser_a.add_argument('bar', type=int, help='bar help')

    # And the other will be for the "b" command
    parser_b = subparsers.add_parser('b', help='b help')
    parser_b.add_argument('--baz', choices='XYZ', help='baz help')

[1] http://docs.python.org/library/argparse#argparse.ArgumentParser.add_subparsers

Note, however, that in order to get a declarative API, argparse has been forced to couple the subparsers to the parent parser - you can't create subparsers as independent objects and only later attach them to the parent parser. If argparse offered such an API today, it wouldn't be declarative any more, since you'd have to completely define a subparser before you could attach it to the parent parser. PEP 3150 would let the API designer have the best of both worlds, allowing subparsers to be accepted as fully defined objects without giving up the ability to have a declarative API:

    # create the top-level parser
    parser = argparse.ArgumentParser(prog='PROG')
    parser.add_argument('--foo', action='store_true', help='foo help')

    # Add a subparser for the "a" command
    parser.add_subparser(parse_a) given:
        parse_a = argparse.ArgumentParser(prog='a')
        parse_a.add_argument('bar', type=int, help='bar help')

    # Add a subparser for the "b" command
    parser.add_subparser(parse_b) given:
        parse_b = argparse.ArgumentParser(prog='b')
        parse_b.add_argument('--baz', choices='XYZ', help='baz help')

(FWIW, I'm glossing over some complications relating to the way argparse populates 'prog' attributes on subparsers, but hopefully this gives the general idea of what I mean by a declarative API)

> But for what it's worth, if we end up with this feature, I agree that it
> should introduce a new scope, and the "given" syntax is the nicest I've yet
> seen.

I'm significantly happier with the ideas in PEP 3150 now that I've reframed them in my own head as: "You know that magic fairy dust we already use inside the interpreter to support out of order execution for decorators, comprehensions and generator expressions? Let's give that a syntax and let people create their own declarative APIs"

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From wuwei23 at gmail.com Mon Oct 17 05:51:04 2011 From: wuwei23 at gmail.com (alex23) Date: Sun, 16 Oct 2011 20:51:04 -0700 (PDT) Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID:

On Oct 16, 3:04 am, Alexander Belopolsky wrote:
> On Sat, Oct 15, 2011 at 12:47 PM, Guido van Rossum wrote:
> ..
>
> > I remember in the past thinking about unifying slice() and range() and
> > I couldn't do it. I still can't. I think they should remain separate.
>
> One of the issues with slices is that they are deliberately made
> unhashable to prevent slicing of dictionaries.

Could someone point me to an explanation as to why this is the case? Was it purely to avoid confusion?
I could easily see myself trying to use slices as keys in a dictionary dispatch.

From ncoghlan at gmail.com Mon Oct 17 06:25:25 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2011 14:25:25 +1000 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID:

On Sun, Oct 16, 2011 at 3:21 AM, Alexander Belopolsky wrote:
> On Sat, Oct 15, 2011 at 10:28 AM, Nick Coghlan wrote:
>> I have a use case now: switching slice.indices() to return a range object
>> instead of a tuple. That heavily favours the 'behave like a sequence'
>> approach.
>
> I like the idea, but why is this a use-case for range.__eq__ or range.__hash__?
>
> I like the idea not because it will lead to any optimisation.
> Slice.indices() does not return a tuple containing all indices in a
> slice. The result is always a 3-tuple containing normalized start,
> stop, and step. A range object cannot be more efficient than a
> 3-tuple.
>
> I still like the idea because it would make indices() return what the
> name suggests - the sequence of indices selectable by the slice.
>

Ah, you're quite correct - it only seemed like a use case for equality because I was thinking slice.indices() returned an actual tuple of indices, in which case range() would need to behave like a tuple of integers for compatibility reasons. I forgot that the expression to get the actual indices (rather than the start/stop/step values) is "range(*slice_obj.indices(len(container)))".

Now that we have full memory efficient range objects, perhaps slice objects should grow a more direct API, along the lines of "slice_obj.make_range(len(container))"

Still, "act roughly like a memory efficient tuple of integers" remains a useful design guideline for 3.x range behaviour.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From g.brandl at gmx.net Mon Oct 17 06:57:58 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 17 Oct 2011 06:57:58 +0200 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9B773F.3010501@pearwood.info> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID:

Am 17.10.2011 02:30, schrieb Steven D'Aprano:
> Nick Coghlan wrote:
>> On Sun, Oct 16, 2011 at 9:01 PM, Steven D'Aprano wrote:
>>> If it's not tested, how do you know it does what you need?
>>
>> Because my script does what I want. Not every Python script is a
>> production application. We lose sight of that point at our peril.
>>
>> Systems administrators, lawyers, financial analysts, mathematicians,
>> scientists... they're not writing continuously integrated production
>> code, they're not writing reusable modules, they're writing scripts to
>> *get things done*.
>
> "And it doesn't matter whether it's done correctly, so long as SOMETHING
> gets done!!!"

Sorry, but even with the wink this is a dangerous statement. When I write a 40-line script to read, fit and plot a dataset, I don't add a unit test for that. On the contrary, I want to write that script as conveniently as possible. Python is as popular as it is in science *because* it allows that. People hate writing boilerplate just to do a small job.
> I'm not convinced that requiring coders to write: > > given a, b c > do f(a, b, c) > > instead of > > do f(a, b, c) > given a, b, c > > gets in the way of getting things done. I guess this means -0 from my side given: That said: I'm not sure this specific proposal would help. I agree with you and can't see that defining the function before using it is the wrong order. What does appeal to me is the potential for cleaner namespaces; while this doesn't matter so much in the function-locals (there it's mostly the OCD speaking), it would definitely helpful for module and class namespaces (where we routinely see "del" statements used to remove temporary names). I can believe that others see the "given" semantics as the natural order of things. This is probably why Haskell has both "let x = y in z" and "z where x = y" constructs to locally bind names, but there it fits well with the syntax, while in the case of Python, the "given: suite" feels somewhat out of place; we really have to think hard before adding new syntax -- one more thing that the "casual" users Nick mentioned have to grok. cheers, Georg From ncoghlan at gmail.com Mon Oct 17 07:46:40 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2011 15:46:40 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID: On Mon, Oct 17, 2011 at 2:57 PM, Georg Brandl wrote: > ? I can believe that others see the "given" semantics as the natural order > ? of things. ?This is probably why Haskell has both "let x = y in z" and > ? "z where x = y" constructs to locally bind names, but there it fits well > ? with the syntax, while in the case of Python, the "given: suite" feels > ? somewhat out of place; we really have to think hard before adding new > ? syntax -- one more thing that the "casual" users Nick mentioned have to > ? grok. Yeah, that's a large part of why I now think the given clause needs to be built on the same semantics that we already use internally for implicit out of order evaluation (i.e. decorators, comprehensions and generator expressions), such that it merely exposes the unifying mechanic underlying existing constructs rather than creating a completely new way of doing things. I'm also interested in any examples people have of APIs that engage in "decorator abuse" to get the kind of declarative API I'm talking about. it would be nice to be able to explain classes in terms of the same underlying construct as well, but that gets rather messy due to the fact that classes don't participate in lexical scoping and you have the vagaries of metaclasses to deal with. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From greg.ewing at canterbury.ac.nz Mon Oct 17 09:21:05 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Oct 2011 20:21:05 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9B773F.3010501@pearwood.info> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID: <4E9BD761.6050909@canterbury.ac.nz> I don't understand all this angst about testing at all. We don't currently expect to be able to reach inside a function and test parts of it in isolation. 
PEP 3150 does nothing to change that. -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 17 09:40:24 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Oct 2011 20:40:24 +1300 Subject: [Python-ideas] Implement comparison operators for range objects In-Reply-To: References: <20111012163144.GB6393@pantoffel-wg.de> <4E95CF7D.6000000@pearwood.info> <20111012193650.GD6393@pantoffel-wg.de> <20111013122706.GH6393@pantoffel-wg.de> Message-ID: <4E9BDBE8.6000605@canterbury.ac.nz> alex23 wrote: > I could easily see myself trying to use slices as keys in a dictionary > dispatch. I think the idea is that if someone writes x = some_dict[3:7] it's more likely that they're trying to extract part of a dict (which doesn't work) rather than look up an item whose key is slice(3, 7). -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 17 09:52:28 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 17 Oct 2011 20:52:28 +1300 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> Message-ID: <4E9BDEBC.2070401@canterbury.ac.nz> Nick Coghlan wrote: > Yeah, that's a large part of why I now think the given clause needs to > be built on the same semantics that we already use internally for > implicit out of order evaluation (i.e. decorators, comprehensions and > generator expressions), such that it merely exposes the unifying > mechanic underlying existing constructs rather than creating a > completely new way of doing things. I'm not sure what you mean by that. If you're talking about the implementation, all three of those use rather different underlying mechanics. What exactly do you see about these that unifies them? -- Greg From raymond.hettinger at gmail.com Mon Oct 17 09:53:39 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 17 Oct 2011 00:53:39 -0700 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) In-Reply-To: References: Message-ID: <67E28FAA-561D-4729-99EF-A32B62F01699@gmail.com> On Oct 16, 2011, at 12:16 AM, Nick Coghlan wrote: > The current draft of PEP 3150 is available on python.org: > http://www.python.org/dev/peps/pep-3150/ FWIW, I think the word "declarative" is being misused. In the context of programming languages, "declarative" is usually contrasted to "imperative" -- describing what you want done versus specifying how to do it. http://en.wikipedia.org/wiki/Declarative_programming I think what you probably meant to describe was something akin to top-down programming http://en.wikipedia.org/wiki/Top%E2%80%93down_and_bottom%E2%80%93up_design#Top.E2.80.93down_approach using forward declarations: http://en.wikipedia.org/wiki/Forward_declaration . Looking at the substance of the proposal, I'm concerned that style gets in the way of fluid code development. Using the PEPs example as a starting point: sorted_data = sorted(data, key=sort_key) given: def sort_key(item): return item.attr1, item.attr2 What if I then wanted to use itertools.groupby with the same key function? I would first have to undo the given-clause. AFAICT, anything in the given statement block becomes hard to re-use or to apply to more than one statement. My guess is that code written using "given" would frequently have to be undone to allow code re-use. 
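Raymond's re-use point, spelled out with today's syntax (the Item class and sample data are illustrative only): defining the key function up front lets both sorted() and groupby() share it, which is exactly what tucking it inside a given clause would prevent:

    from collections import namedtuple
    from itertools import groupby

    Item = namedtuple('Item', 'attr1 attr2')
    data = [Item(2, 'b'), Item(1, 'a'), Item(2, 'a')]

    def sort_key(item):
        # Defined once, ahead of time, so both call sites can use it.
        return item.attr1, item.attr2

    sorted_data = sorted(data, key=sort_key)
    grouped = [(key, list(group))
               for key, group in groupby(sorted_data, key=sort_key)]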
Also, it looks like there is a typo in the attrgetter example (the "v." is wrong). It should read:

    sorted_list = sorted(original, key=attrgetter('attr1', 'attr2'))

When used with real field names, that is perfectly readable:

    sorted(employees, key=attrgetter('lastname', 'firstname'))

That isn't much harder on the eyes than:

    SELECT * FROM employees ORDER BY lastname, firstname;

Raymond

From p.f.moore at gmail.com Mon Oct 17 13:56:36 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 17 Oct 2011 12:56:36 +0100 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) In-Reply-To: References: Message-ID:

On 16 October 2011 08:16, Nick Coghlan wrote:
> We may also decide to eliminate the "new scope" implications for
> 'given' statements entirely and focus solely on the "out of order
> execution" aspect. That would not only simplify the implementation,
> but also make for a cleaner extension to the compound statement
> headers.

FWIW, I did a quick review of some of my own code where I expected the given clause to be helpful (relatively mathematical code computing continued fractions - lots of complex expressions with subexpressions that can be factored out using throwaway names, and very few of the subexpressions are meaningful enough by themselves to make coming up with helpful names really viable).

I was surprised to find that the in-line code, using a lot of x's and y's, was readable enough that I could find little use for the given clause. If the "new scope" semantics is included, I might have got some small benefits from being able to reuse x and y all over the place safe in the assurance that a typo wouldn't cause me to silently pick up the wrong value. But even that is stretching for benefits.

I still like the idea in principle, but I'm no longer sure it's as useful in practice as I'd expected. A really good use case would help a lot (Nick's argparse example is a start, but I'm nervous about the idea of writing an API that relies on out of order evaluation to be readable).

Paul.

From ncoghlan at gmail.com Mon Oct 17 14:05:36 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2011 22:05:36 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: <4E9BDEBC.2070401@canterbury.ac.nz> References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz> Message-ID:

On Mon, Oct 17, 2011 at 5:52 PM, Greg Ewing wrote:
> Nick Coghlan wrote:
>>
>> Yeah, that's a large part of why I now think the given clause needs to
>> be built on the same semantics that we already use internally for
>> implicit out of order evaluation (i.e. decorators, comprehensions and
>> generator expressions), such that it merely exposes the unifying
>> mechanic underlying existing constructs rather than creating a
>> completely new way of doing things.
>
> I'm not sure what you mean by that. If you're talking about
> the implementation, all three of those use rather different
> underlying mechanics. What exactly do you see about these
> that unifies them?

Actually, comprehensions and generator expressions are almost identical in 3.x (they only differ in the details of the inner loop in the anonymous function).
For comprehensions, the parallel with the proposed given statement would be almost exact:

    seq = [x*y for x in range(10) for y in range(5)]

would map to:

    seq = _list_comp given _outermost_iter = range(10):
        _list_comp = []
        for x in _outermost_iter:
            for y in range(5):
                _list_comp.append(x*y)

And similarly for set and dict comprehensions:

    # unique = {x*y for x in range(10) for y in range(5)}
    unique = _set_comp given _outermost_iter = range(10):
        _set_comp = set()
        for x in _outermost_iter:
            for y in range(5):
                _set_comp.add(x*y)

    # map = {(x, y):x*y for x in range(10) for y in range(5)}
    map = _dict_comp given _outermost_iter = range(10):
        _dict_comp = {}
        for x in _outermost_iter:
            for y in range(5):
                _dict_comp[x, y] = x*y

Note that this lays bare some of the quirks of comprehension scoping - at class scope, the outermost iterator expression can sometimes see names that the inner iterator expressions miss.

For generator expressions, the parallel isn't quite as strong, since the compiler is able to avoid the redundant anonymous function involved in the given clause and just emit an anonymous generator directly. However, the general principle still holds:

    # gen_iter = (x*y for x in range(10) for y in range(5))
    gen_iter = _genexp() given _outermost_iter = range(10):
        def _genexp():
            for x in _outermost_iter:
                for y in range(5):
                    yield x*y

For decorated functions, the parallel is actually almost as weak as it is for classes, since so many of the expressions involved (decorator expressions, default arguments, annotations) get evaluated in order in the current scope and even a given statement can't reproduce the actual function statement's behaviour of not being bound at *all* in the current scope while decorators are being applied, even though the function already knows what it is going to be called:

    >>> def call(f):
    ...     print(f.__name__)
    ...     return f()
    ...
    >>> @call
    ... def func():
    ...     return func.__name__
    ...
    func
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in call
      File "<stdin>", line 3, in func
    NameError: global name 'func' is not defined

So it's really only the machinery underlying comprehensions that is being exposed by the PEP rather than anything more far reaching. Exposing the generator expression machinery directly would require the ability to turn the given clause into a generator (via a top level yield expression) and then a means to reference that from the header line, which gets us back into cryptic and unintuitive PEP 403 territory. Better to settle for the named alternative.

Cheers,
Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Mon Oct 17 14:30:38 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 17 Oct 2011 14:30:38 +0200 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) References: Message-ID: <20111017143038.7347307e@pitrou.net>

On Sun, 16 Oct 2011 17:16:40 +1000 Nick Coghlan wrote:
>     # Embedded assignments in if statements
>     if match is not None given match=re.search(pattern, text):
>         # process match
>     else:
>         # handle case where match is None
>
>     # Embedded assignments in while loops
>     while match is not None given match=re.search(pattern, text):
>         # process match
>     else:
>         # handle case where match is None

Urk. What is the use case for these? Save one line of code but actually type *more* characters?

"given" doesn't look like a very pretty notation to me, honestly.
Apparently you're looking for a way to allow assignments in other statements but without admitting it (because we don't want to lose face?...). Also: sorted_data = sorted(data, key=sort_key) given: def sort_key(item): return item.attr1, item.attr2 isn't light-weight compared to anonymous functions that other languages have. The whole point of anonymous functions in e.g. Javascript is that embedding them in another statement or expression makes it a very natural way of writing code. The "given" syntax doesn't achieve this IMO; it forces you to write two additional keywords ("given" and "def") and also write twice a function name that's totally useless - since you can't reuse it anyway, as pointed out by Raymond. I'm -1 on that syntax. Regards Antoine. From ncoghlan at gmail.com Mon Oct 17 14:35:39 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Oct 2011 22:35:39 +1000 Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)) In-Reply-To: <67E28FAA-561D-4729-99EF-A32B62F01699@gmail.com> References: <67E28FAA-561D-4729-99EF-A32B62F01699@gmail.com> Message-ID: On Mon, Oct 17, 2011 at 5:53 PM, Raymond Hettinger wrote: > > On Oct 16, 2011, at 12:16 AM, Nick Coghlan wrote: > > The current draft of PEP 3150 is available on?python.org: > http://www.python.org/dev/peps/pep-3150/ > > FWIW, I think the word "declarative" is being misused. > In the context of programming languages, "declarative" > is usually contrasted to "imperative" -- describing > what you want done versus specifying how to do it. > http://en.wikipedia.org/wiki/Declarative_programming > I think what you probably meant to describe was something > akin to top-down programming > http://en.wikipedia.org/wiki/Top%E2%80%93down_and_bottom%E2%80%93up_design#Top.E2.80.93down_approach > using forward declarations: > ?http://en.wikipedia.org/wiki/Forward_declaration?. Agreed, top-down vs bottom-up would be better terminology than declarative vs imperative (although I *do* believe having the given statement available would make it easier to write declarative APIs in Python without abusing decorators or needing to delve into metaclass programming). > Looking at the substance of the proposal, I'm concerned that style gets in > the way of fluid code development. > Using the PEPs example as a starting point: > > sorted_data = sorted(data, key=sort_key) given: > def sort_key(item): > return item.attr1, item.attr2 > > What if I then wanted to use itertools.groupby with the same key function? > I would first have to undo the given-clause. > AFAICT, anything in the given statement block becomes hard to re-use > or to apply to more than one statement. ?My guess is that code written > using "given" would frequently have to be undone to allow code re-use. Agreed, but is that really so different from lifting *any* inline code out into a separate function for later reuse? Consider that the standard recommendation for "multi-line lambdas" is to define a function immediately before the statement where you want to use it. If that statement is in module level code, fine, that function is now available for reuse elsewhere (perhaps more places than you really want, since it is now potentially part of the module namespace). 
But in a class scope it's almost certainly more exposed than you want (unless you delete it after use) and in a function scope it isn't exposed any more than it would be in a given statement, and will need to be moved before you can reuse it in a different function.

There's a reason the PEP's suggested additions to PEP 8 mention the idea that given statements are a stepping stone towards splitting an operation out into its own function - it's a discrete calculation or other operation, so it could be separated out, but you don't actually want to reuse it anywhere else (or can't come up with a good name), so it makes more sense to leave the code in its current location.

It's definitely a concern (and is one of the reasons why I think this PEP needs to bake for a *long* time before it can be considered remotely justifiable), but I still think allowing people to express top down thought processes clearly, and have the code be able to execute in that form, is potentially valuable, even if other software engineering concerns soon result in the code being restructured when it comes to application programs.

> Also, it looks like there is a typo in the attrgetter example (the "v." is
> wrong).
> It should read:
>     sorted_list = sorted(original, key=attrgetter('attr1', 'attr2'))

Oops, you're right - I'll fix that in the next update.

> When used with real field names, that is perfectly readable:
>     sorted(employees, key=attrgetter('lastname', 'firstname'))
> That isn't much harder on the eyes than:
>     SELECT * FROM employees ORDER BY lastname, firstname;

Again, agreed, but I don't think SQL rises to the bar of executable pseudocode either ;)

I've actually been trying to think of an example where you'd want to be normalising data on the fly, catching exceptions as you go along, so that lambda expressions and the operator module can't help, with bottom-up programming being the only alternative to the PEP's direct reflection of a top-down thought process. No luck on that front so far, though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From p.f.moore at gmail.com Mon Oct 17 15:00:06 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 17 Oct 2011 14:00:06 +0100
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To:
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz>
Message-ID:

On 17 October 2011 13:05, Nick Coghlan wrote:
> For comprehensions, the parallel with the proposed given statement
> would be almost exact:
>
>     seq = [x*y for x in range(10) for y in range(5)]
>
> would map to:
>
>     seq = _list_comp given _outermost_iter = range(10):
>         _list_comp = []
>         for x in _outermost_iter:
>             for y in range(5):
>                 _list_comp.append(x*y)

Whoa...

    NAME1 = EXPR1 given NAME2 = EXPR2:
        ASSIGNMENT
        FOR LOOP

????

Surely that doesn't match the behaviour for "given" that you were suggesting? Even if I assume that having _outermost_iter = range(10) before the colon was a typo, having a for loop in the given suite scares me. I can see what it would mean in terms of pure code-rewriting semantics, but it doesn't match at all my intuition of what the term "given" would mean.

I'd expect the given suite to only contain name-definition statements (assignments, function and class definitions).
Anything else should be at least bad practice, if not out and out illegal...

Paul.

From ncoghlan at gmail.com Mon Oct 17 15:18:53 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2011 23:18:53 +1000
Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403))
In-Reply-To: <20111017143038.7347307e@pitrou.net>
References: <20111017143038.7347307e@pitrou.net>
Message-ID:

On Mon, Oct 17, 2011 at 10:30 PM, Antoine Pitrou wrote:
> On Sun, 16 Oct 2011 17:16:40 +1000
> Nick Coghlan wrote:
>>     # Embedded assignments in if statements
>>     if match is not None given match=re.search(pattern, text):
>>         # process match
>>     else:
>>         # handle case where match is None
>>
>>     # Embedded assignments in while loops
>>     while match is not None given match=re.search(pattern, text):
>>         # process match
>>     else:
>>         # handle case where match is None
>
> Urk. What is the use case for these? Save one line of code but actually
> type *more* characters?
>
> "given" doesn't look like a very pretty notation to me, honestly.
> Apparently you're looking for a way to allow assignments in other
> statements but without admitting it (because we don't want to lose
> face?...).

They're throwaway ideas - there's a reason the PEP itself is restricted to simple statements.

> Also:
>
> sorted_data = sorted(data, key=sort_key) given:
>     def sort_key(item):
>         return item.attr1, item.attr2
>
> isn't light-weight compared to anonymous functions that other languages
> have.
> The whole point of anonymous functions in e.g. Javascript is that
> embedding them in another statement or expression makes it a very
> natural way of writing code. The "given" syntax doesn't achieve this
> IMO; it forces you to write two additional keywords ("given" and "def")
> and also write twice a function name that's totally useless - since you
> can't reuse it anyway, as pointed out by Raymond.
>
> I'm -1 on that syntax.

If we accept the premise that full featured anonymous functions have their place in life (and, over the years, I've been persuaded that they do), then Python's sharp statement/expression dichotomy and significant leading whitespace at the statement level severely limit our options:

1. Adopt a syntax that still names the functions, but uses those names to allow forward references to functions that are defined later in a private indented suite (this is the route I've taken in PEP 3150). This is based on the premise that the real benefit of anonymous functions lies in their support for top down thought processes, and that linking them up to a later definition with a throwaway name will be less jarring than having to go back to the previous line in order to fill them in while writing code, and then skip over them while reading code to get to the line that matters, before scanning back up to look at the operation details.

2. Adopt a symbolic syntax to allow a forward reference to a trailing suite that is an implicit function definition somewhat along the lines of a Ruby block, only with Python-style namespace semantics (this approach was soundly demolished in the overwhelmingly negative reactions to PEP 403 - the assorted reactions to PEP 3150 have been positively welcoming by comparison)
3. Continue the trend of giving every Python statement an equivalent expression form so that lambda expressions become just as powerful as named functions (we've gone a long way down that road already, but exception handling and name binding are sticking points. One of my goals with PEP 3150 is actually to *halt* that trend by providing a "private suite" that allows the existing significant whitespace syntax to be embedded inside a variety of statements)

4. Add a non-whitespace delimited syntax that allows suites to be embedded inside expressions at arbitrary locations (I believe "from __future__ import braces" answers that one)

5. Just accept that there are some styles of thought that cannot be expressed clearly in Python. Developers that don't like that can either suck it up and learn to live with writing in a style that Python supports (which is, admittedly, not a problem most of the time) or else find another language that fits their brains better. That's a perfectly reasonable choice for us to make, but we should do it with a clear understanding of the patterns of thought that we are officially declaring to be unsupported.

That last option, of course, is the status quo that currently wins by default. If anyone can think of additional alternatives outside those 5 options, I'd love to see a PEP :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Oct 17 15:27:49 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 17 Oct 2011 23:27:49 +1000
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To:
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz>
Message-ID:

On Mon, Oct 17, 2011 at 11:00 PM, Paul Moore wrote:
> I'd expect the given suite to only contain name-definition statements
> (assignments, function and class definitions). Anything else should be
> at least bad practice, if not out and out illegal...

It would be perfectly legal, just inadvisable most of the time (as the suggested PEP 8 addition says, if your given clause gets unwieldy, it's probably a bad idea). However, those expansions are almost exactly what 3.x comprehensions look like from the interpreter's point of view.

And the assignment between the given statement and the colon wasn't a typo - that was the explicit early binding of an expression evaluated in the outer scope. It's the way comprehensions behave, and I suspect it may be a good compromise for given statements as well (to make their scoping rules more consistent with the rest of the language, while still having them be vaguely usable at class scope)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Mon Oct 17 15:32:31 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 17 Oct 2011 15:32:31 +0200
Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403))
References: <20111017143038.7347307e@pitrou.net>
Message-ID: <20111017153231.3ad8e1e5@pitrou.net>

On Mon, 17 Oct 2011 23:18:53 +1000
Nick Coghlan wrote:
>
> If we accept the premise that full featured anonymous functions have
> their place in life (and, over the years, I've been persuaded that
> they do),

They are nice to have.
But so are switch statements, multi-line comments, syntactic support for concurrency and other constructs that Python doesn't have.

> 2. Adopt a symbolic syntax to allow a forward reference to a trailing
> suite that is an implicit function definition somewhat along the lines
> of a Ruby block, only with Python-style namespace semantics (this
> approach was soundly demolished in the overwhelmingly negative
> reactions to PEP 403 - the assorted reactions to PEP 3150 have been
> positively welcoming by comparison)

The "postdef" keyword is arguably inelegant. You haven't answered my lambda-based proposal on this PEP. What do you think of it?

> 4. Add a non-whitespace delimited syntax that allows suites to be
> embedded inside expressions at arbitrary locations (I believe "from
> __future__ import braces" answers that one)

:-}

Regards

Antoine.

From arnodel at gmail.com Mon Oct 17 16:57:58 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Mon, 17 Oct 2011 15:57:58 +0100
Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403))
In-Reply-To:
References:
Message-ID:

On 17 October 2011 12:56, Paul Moore wrote:
> I still like the idea in principle, but I'm no longer sure it's as
> useful in practice as I'd expected. A really good use case would help
> a lot (Nick's argparse example is a start, but I'm nervous about the
> idea of writing an API that relies on out of order evaluation to be
> readable).

Note that, if the "given" keyword creates a new scope, it provides a way to give static variables to functions - a feature for which various ad-hoc syntaxes were recently discussed on this list:

    counter = counter given:
        acc = 0
        def counter(i):
            nonlocal acc
            acc += i
            return acc

--
Arnaud

From sven at marnach.net Mon Oct 17 17:02:43 2011
From: sven at marnach.net (Sven Marnach)
Date: Mon, 17 Oct 2011 16:02:43 +0100
Subject: [Python-ideas] Implement comparison operators for range objects
In-Reply-To:
References: <20111013122706.GH6393@pantoffel-wg.de>
Message-ID: <20111017150243.GA3948@pantoffel-wg.de>

I've created two tracker issues for further discussion on this topic:

http://bugs.python.org/issue13200
http://bugs.python.org/issue13201

Cheers,
Sven

From ericsnowcurrently at gmail.com Mon Oct 17 18:39:24 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 17 Oct 2011 10:39:24 -0600
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To:
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz>
Message-ID:

On Mon, Oct 17, 2011 at 7:00 AM, Paul Moore wrote:
> On 17 October 2011 13:05, Nick Coghlan wrote:
>> For comprehensions, the parallel with the proposed given statement
>> would be almost exact:
>>
>>     seq = [x*y for x in range(10) for y in range(5)]
>>
>> would map to:
>>
>>     seq = _list_comp given _outermost_iter = range(10):
>>         _list_comp = []
>>         for x in _outermost_iter:
>>             for y in range(5):
>>                 _list_comp.append(x*y)
>
> Whoa...
>
> NAME1 = EXPR1 given NAME2 = EXPR2:
>     ASSIGNMENT
>     FOR LOOP
>
> ????
>
> Surely that doesn't match the behaviour for "given" that you were
> suggesting? Even if I assume that having _outermost_iter = range(10)
> before the colon was a typo, having a for loop in the given suite
> scares me.
> I can see what it would mean in terms of pure
> code-rewriting semantics, but it doesn't match at all my intuition of
> what the term "given" would mean.
>
> I'd expect the given suite to only contain name-definition statements
> (assignments, function and class definitions). Anything else should be
> at least bad practice, if not out and out illegal...

It's the same as a class definition--where you can, but likely don't, have for loops and the like. In the end it's just the resulting namespace that matters.

-eric

>
> Paul.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From ron3200 at gmail.com Mon Oct 17 22:06:21 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 17 Oct 2011 15:06:21 -0500
Subject: [Python-ideas] Local only statement mapping with "given/get" blocks.
In-Reply-To:
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz>
Message-ID: <1318881981.23011.158.camel@Gutsy>

On Mon, 2011-10-17 at 22:05 +1000, Nick Coghlan wrote:
> On Mon, Oct 17, 2011 at 5:52 PM, Greg Ewing wrote:
> > Nick Coghlan wrote:
> >>
> >> Yeah, that's a large part of why I now think the given clause needs to
> >> be built on the same semantics that we already use internally for
> >> implicit out of order evaluation (i.e. decorators, comprehensions and
> >> generator expressions), such that it merely exposes the unifying
> >> mechanic underlying existing constructs rather than creating a
> >> completely new way of doing things.
> >
> > I'm not sure what you mean by that. If you're talking about
> > the implementation, all three of those use rather different
> > underlying mechanics. What exactly do you see about these
> > that unifies them?
>
> Actually, comprehensions and generator expressions are almost
> identical in 3.x (they only differ in the details of the inner loop in
> the anonymous function).
>
> For comprehensions, the parallel with the proposed given statement
> would be almost exact:
>
>     seq = [x*y for x in range(10) for y in range(5)]
>
> would map to:
>
>     seq = _list_comp given _outermost_iter = range(10):
>         _list_comp = []
>         for x in _outermost_iter:
>             for y in range(5):
>                 _list_comp.append(x*y)

Ok, here's a way to look at this that I think you will find interesting. It looks to me that the 'given' keyword is setting up a local name_space in the way it's used. So rather than taking an expression, maybe it should take a mapping. (which could be from an expression)

    mapping = dict(iter1=range(10), iter2=range(5))

    given mapping:    # mapping as local scope
        list_comp = []
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)

    seq = mapping['list_comp']

(We could stop here.) This doesn't do anything out of order. It shows that statement local name space, and the out of order assignment are two completely different things. But let's continue...

Suppose we use a two suite pattern to make getting values out easier.

    mapping = dict(iter1=range(10), iter2=range(5))

    given mapping:
        list_comp = []
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    get:
        list_comp as seq    # seq = mapping['list_comp']

That saves us from having to refer to 'mapping' multiple times, especially if we need to get a lot of values from it.

So now we can change the above to ...
    given dict(iter1=range(10), iter2=range(5)):
        list_comp = []
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    get:
        list_comp as seq

And then finally put the 'get' block first.

    get:
        list_comp as seq
    given dict(iter1=range(10), iter2=range(5)):
        list_comp = []
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)

Which is very close to the example you gave above, but more readable because it puts the keywords in the front. That also makes it more like a statement than an expression.

Note that if you use a named mapping with given, you can inspect it after the given block is done, and/or reuse it multiple times. I think that will be very useful for unittests.

This creates a nice way to express some types of blocks that have local only names in pure python rather than just saying it's magic dust sprinkled here and there to make it work like that. (That doesn't mean we should actually change those, but the semantics could match.)

> And similarly for set and dict comprehensions:
>
> # unique = {x*y for x in range(10) for y in range(5)}
> unique = _set_comp given _outermost_iter = range(10):
>     _set_comp = set()
>     for x in _outermost_iter:
>         for y in range(5):
>             _set_comp.add(x*y)

    get:
        set_comp as unique
    given dict(iter1=range(10), iter2=range(5)):
        set_comp = set()
        for x in iter1:
            for y in iter2:
                set_comp.add(x*y)

> # map = {(x, y):x*y for x in range(10) for y in range(5)}
> map = _dict_comp given _outermost_iter = range(10):
>     _anon = {}
>     for x in _outermost_iter:
>         for y in range(5):
>             _anon[x,y] = x*y

    get:
        dict_comp as map
    given dict(iter1=range(10), iter2=range(5)):
        dict_comp = {}
        for x in iter1:
            for y in iter2:
                dict_comp[x, y] = x*y

I'm not sure if I prefer the "get" block first or last.

    given dict(iter1=range(10), iter2=range(5)):
        dict_comp = {}
        for x in iter1:
            for y in iter2:
                dict_comp[x, y] = x*y
    get:
        dict_comp as map

But the given/get order is a detail you can put to a final vote at some later time.

> Note that this lays bare some of the quirks of comprehension scoping -
> at class scope, the outermost iterator expression can sometimes see
> names that the inner iterator expressions miss.
>
> For generator expressions, the parallel isn't quite as strong, since
> the compiler is able to avoid the redundant anonymous function
> involved in the given clause and just emit an anonymous generator
> directly. However, the general principle still holds:
>
> # gen_iter = (x*y for x in range(10) for y in range(5))
> gen_iter = _genexp() given _outermost_iter = range(10):
>     def _genexp():
>         for x in _outermost_iter:
>             for y in range(5):
>                 yield x*y

    given dict(iter1=range(10), iter2=range(5)):
        def genexp():
            for x in iter1:
                for y in iter2:
                    yield x*y
    get:
        genexp as gen_iter

Interestingly, if we transform the given blocks a bit more we get something that is nearly a function.

    given Signature().bind(mapping):
        ... function body ...
    get:
        ... return values ...

('def' would wrap it in an object, and give it a name.)

So it looks like it has potential to unify some underlying mechanisms as well as create a nice local only statement space. What I like about it is that it appears to complement python very well and doesn't feel like it's something tacked on. I think having given take a mapping is what did that for me.
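(For comparison, a rough emulation of the mapping-as-local-namespace semantics is possible in today's Python by executing the suite with exec() and an explicit dict - this sketch is purely illustrative and not part of the proposal:)

    import textwrap

    mapping = dict(iter1=range(10), iter2=range(5))
    suite = textwrap.dedent("""
        list_comp = []
        for x in iter1:
            for y in iter2:
                list_comp.append(x*y)
    """)
    exec(suite, {}, mapping)      # mapping acts as the local namespace
    seq = mapping['list_comp']    # the "get: list_comp as seq" step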
Cheers,
Ron

> For decorated functions, the parallel is actually almost as weak as it
> is for classes, since so many of the expressions involved (decorator
> expressions, default arguments, annotations) get evaluated in order in
> the current scope and even a given statement can't reproduce the
> actual function statement's behaviour of not being bound at *all* in
> the current scope while decorators are being applied, even though the
> function already knows what it is going to be called:

It's hard to beat a syntax that is only one character long. ;-)

> >>> def call(f):
> ...     print(f.__name__)
> ...     return f()
> ...
> >>> @call
> ... def func():
> ...     return func.__name__
> ...
> func
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 3, in call
>   File "<stdin>", line 3, in func
> NameError: global name 'func' is not defined
>
> So it's really only the machinery underlying comprehensions that is
> being exposed by the PEP rather than anything more far reaching.
>
> Exposing the generator expression machinery directly would require the
> ability to turn the given clause into a generator (via a top level
> yield expression) and then a means to reference that from the header
> line, which gets us back into cryptic and unintuitive PEP 403
> territory. Better to settle for the named alternative.
>
> Cheers,
> Nick.

From ron3200 at gmail.com Mon Oct 17 22:21:04 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 17 Oct 2011 15:21:04 -0500
Subject: [Python-ideas] Local only statement mapping with "given/get" blocks.
In-Reply-To: <1318881981.23011.158.camel@Gutsy>
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz> <1318881981.23011.158.camel@Gutsy>
Message-ID: <1318882864.23011.162.camel@Gutsy>

On Mon, 2011-10-17 at 15:06 -0500, Ron Adam wrote:

...

Oops. Had some tab issues... 2 space indents should have been 4 space indents and a few indents were lost. But it's fairly obvious which ones.

From g.nius.ck at gmail.com Tue Oct 18 01:51:35 2011
From: g.nius.ck at gmail.com (Christopher King)
Date: Mon, 17 Oct 2011 19:51:35 -0400
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <20111009210632.GB9230@pantoffel-wg.de>
References: <20111009210632.GB9230@pantoffel-wg.de>
Message-ID:

> This syntax is already taken for generator expressions.

How?

From steve at pearwood.info Tue Oct 18 01:56:03 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 18 Oct 2011 10:56:03 +1100
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To:
References: <20111009210632.GB9230@pantoffel-wg.de>
Message-ID: <4E9CC093.8080000@pearwood.info>

Christopher King wrote:
>> This syntax is already taken for generator expressions.
>>
> How?
Your proposed syntax for tuple comprehensions:

    (expr for x in iterable)

Syntax already used for generator expressions:

    (expr for x in iterable)

--
Steven

From g.nius.ck at gmail.com Tue Oct 18 01:57:55 2011
From: g.nius.ck at gmail.com (Christopher King)
Date: Mon, 17 Oct 2011 19:57:55 -0400
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <4E925D95.5040402@gmx.net>
References: <4E925D95.5040402@gmx.net>
Message-ID:

> >>> import operator
> >>> xs=['banana', 'loganberry', 'passion fruit']
> >>> reduce(operator.concat, (x.strip() for x in xs))

Too long for something so simple

From alexander.belopolsky at gmail.com Tue Oct 18 02:05:54 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 17 Oct 2011 20:05:54 -0400
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To:
References: <4E925D95.5040402@gmx.net>
Message-ID:

On Mon, Oct 17, 2011 at 7:57 PM, Christopher King wrote:
..
>> >>> reduce(operator.concat, (x.strip() for x in xs))
>
> Too long for something so simple

.. and how would tuple comprehensions help in this case? A shorter expression would be

    >>> ''.join(x.strip() for x in xs)

For your data, .strip() seems redundant, so

    >>> ''.join(xs)

would achieve the same result. Finally, a more realistic application would want to separate the data somehow:

    >>> ','.join(xs)

From ben+python at benfinney.id.au Tue Oct 18 06:57:03 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 18 Oct 2011 15:57:03 +1100
Subject: [Python-ideas] Tuple Comprehensions
References: <20111009210632.GB9230@pantoffel-wg.de> <4E9CC093.8080000@pearwood.info>
Message-ID: <871uuaucbk.fsf@benfinney.id.au>

Steven D'Aprano writes:

> Your proposed syntax for tuple comprehensions:
>
>     (expr for x in iterable)
>
> Syntax already used for generator expressions:
>
>     (expr for x in iterable)

More precisely, the parens are not part of the syntax for generator expressions. But the above syntax is a valid, paren-enclosed, generator expression; so the proposed syntax is indistinguishable from already-valid syntax that means something else in existing code.

--
 \     "We must respect the other fellow's religion, but only in the
  `\    sense and to the extent that we respect his theory that his
_o__)   wife is beautiful and his children smart." -Henry L. Mencken
Ben Finney

From ncoghlan at gmail.com Tue Oct 18 07:27:12 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Oct 2011 15:27:12 +1000
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <871uuaucbk.fsf@benfinney.id.au>
References: <20111009210632.GB9230@pantoffel-wg.de> <4E9CC093.8080000@pearwood.info> <871uuaucbk.fsf@benfinney.id.au>
Message-ID:

On Tue, Oct 18, 2011 at 2:57 PM, Ben Finney wrote:
> More precisely, the parens are not part of the syntax for generator
> expressions.

Yes they are. There's just an exception in the grammar to make it so that the *existing* parens that indicate a function call also count when a generator expression is the sole positional argument.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From ben+python at benfinney.id.au Tue Oct 18 07:47:18 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 18 Oct 2011 16:47:18 +1100
Subject: [Python-ideas] Tuple Comprehensions
References: <20111009210632.GB9230@pantoffel-wg.de> <4E9CC093.8080000@pearwood.info> <871uuaucbk.fsf@benfinney.id.au>
Message-ID: <87obxesvfd.fsf@benfinney.id.au>

Nick Coghlan writes:

> On Tue, Oct 18, 2011 at 2:57 PM, Ben Finney wrote:
> > More precisely, the parens are not part of the syntax for generator
> > expressions.
>
> Yes they are.

I had the impression that they could appear as a bare expression without parens. But the Python interpreter doesn't support that impression, so I guess I was wrong. Thanks for pointing it out.

--
 \     "We reserve the right to serve refuse to anyone." -restaurant,
  `\    Japan
_o__)
Ben Finney

From ncoghlan at gmail.com Tue Oct 18 08:49:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 18 Oct 2011 16:49:15 +1000
Subject: [Python-ideas] Tuple Comprehensions
In-Reply-To: <87obxesvfd.fsf@benfinney.id.au>
References: <20111009210632.GB9230@pantoffel-wg.de> <4E9CC093.8080000@pearwood.info> <871uuaucbk.fsf@benfinney.id.au> <87obxesvfd.fsf@benfinney.id.au>
Message-ID:

On Tue, Oct 18, 2011 at 3:47 PM, Ben Finney wrote:
> Nick Coghlan writes:
>
>> On Tue, Oct 18, 2011 at 2:57 PM, Ben Finney wrote:
>> > More precisely, the parens are not part of the syntax for generator
>> > expressions.
>>
>> Yes they are.
>
> I had the impression that they could appear as a bare expression without
> parens. But the Python interpreter doesn't support that impression, so I
> guess I was wrong. Thanks for pointing it out.

You may have been thinking of yield expressions, where the parentheses aren't technically part of the syntax, but are still mandatory in a lot of places to avoid visual ambiguity.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From scott+python-ideas at scottdial.com Tue Oct 18 14:53:22 2011
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Tue, 18 Oct 2011 08:53:22 -0400
Subject: [Python-ideas] Long Live PEP 3150 (was: Re: Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403))
In-Reply-To:
References: <20111017143038.7347307e@pitrou.net>
Message-ID: <4E9D76C2.4050107@scottdial.com>

On 10/17/2011 9:18 AM, Nick Coghlan wrote:
> 1. Adopt a syntax that still names the functions, but uses those names
> to allow forward references to functions that are defined later in a
> private indented suite (this is the route I've taken in PEP 3150).

It may be that I have spent so much time using Twisted, but this seems like the only viable solution given my need to define callbacks simultaneously with an errback. That is:

    d = getDeferredFromSomewhere()
    d.addCallbacks(callback, errback)

Being only able to name a single anonymous function is significantly less useful (perhaps to the point of useless) to anyone writing Twisted Deferred-style code. (BTW, you cannot split that into two statements to weasel around the "given" limitation).

Personally, I never noticed that this "style" was awkward. If I was writing C, then I would have to put those functions into the file-scope and, while there is no ordering restriction there, I always put them above all of the functions I used them in. So when I came to Python, I just kept with that style except that I don't have to elevate them to the module-scope if a closure is useful (often it is!).
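(For concreteness, a minimal sketch of that Deferred style - the names on_result/on_error and the trivial bodies are illustrative only, not from the thread:)

    from twisted.internet import defer

    def on_result(value):
        # callback: an ordinary named function, defined above its use
        return value * 2

    def on_error(failure):
        # errback: defined alongside the callback it pairs with
        failure.trap(ValueError)
        return 0

    d = defer.Deferred()
    d.addCallbacks(on_result, on_error)
    d.callback(21)    # fires on_result; the Deferred's result is now 42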
Usually my callbacks/errbacks are non-trivial enough that I would not want to put them inline with executable statements -- I usually pull them to the top of whatever scope I am defining them in and live with the out-of-order-ness. However, I could see how hanging them off the bottom in an indented scope might be more readable.

--
Scott Dial
scott at scottdial.com

From Nikolaus at rath.org Wed Oct 19 04:14:56 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Tue, 18 Oct 2011 22:14:56 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
Message-ID: <87pqhtafrz.fsf@vostro.rath.org>

Hello,

I often have code of the form:

    def my_fun():
        allocate_res1()
        try:
            # do stuff
            allocate_res2()
            try:
                # do stuff
                allocate_res3()
                try:
                    # do stuff
                finally:
                    cleanup_res3()
            finally:
                cleanup_res2()
        finally:
            cleanup_res1()

        return

With increasing number of managed resources, the indentation becomes really annoying, there is lots of line noise, and I don't like the fact that the cleanup is so far away from the allocation.

I would much rather have something like this:

    def my_fun():
        allocate_res1()
        atreturn.register(cleanup_res1)
        # do stuff
        allocate_res2()
        atreturn.register(cleanup_res2)
        # do stuff
        allocate_res3()
        atreturn.register(cleanup_res3)
        # do stuff
        return

Has the idea of implementing such "on return" handlers ever come up? Maybe there is some tricky way to do this with function decorators?

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From pyideas at rebertia.com Wed Oct 19 05:14:09 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Tue, 18 Oct 2011 20:14:09 -0700
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87pqhtafrz.fsf@vostro.rath.org>
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID:

On Tue, Oct 18, 2011 at 7:14 PM, Nikolaus Rath wrote:
> Hello,
>
> I often have code of the form:
>
> def my_fun():
>     allocate_res1()
>     try:
>         # do stuff
>         allocate_res2()
>         try:
>             # do stuff
>             allocate_res3()
>             try:
>                 # do stuff
>             finally:
>                 cleanup_res3()
>         finally:
>             cleanup_res2()
>     finally:
>         cleanup_res1()
>
>     return
>
> With increasing number of managed resources, the indentation becomes
> really annoying, there is lots of line noise, and I don't like the fact
> that the cleanup is so far away from the allocation.

Use the `with` statement and context managers. They were added for this exact situation. See http://www.python.org/dev/peps/pep-0343/

Resulting code will resemble:

    def func():
        with alloc() as res1, alloc() as res2, alloc() as res3:
            # do stuff

Cheers,
Chris

From ncoghlan at gmail.com Wed Oct 19 05:58:07 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 19 Oct 2011 13:58:07 +1000
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To:
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID:

On Wed, Oct 19, 2011 at 1:14 PM, Chris Rebert wrote:
> On Tue, Oct 18, 2011 at 7:14 PM, Nikolaus Rath wrote:
>> Hello,
>>
>> I often have code of the form:
>>
>> def my_fun():
>>     allocate_res1()
>>     try:
>>         # do stuff
>>         allocate_res2()
>>         try:
>>             # do stuff
>>             allocate_res3()
>>             try:
>>                 # do stuff
>>             finally:
>>                 cleanup_res3()
>>         finally:
>>             cleanup_res2()
>>     finally:
>>         cleanup_res1()
>>
>>     return
>>
>> With increasing number of managed resources, the indentation becomes
>> really annoying, there is lots of line noise, and I don't like the fact
>> that the cleanup is so far away from the allocation.
>
> Use the `with` statement and context managers. They were added for
> this exact situation.
> See http://www.python.org/dev/peps/pep-0343/
>
> Resulting code will resemble:
>
> def func():
>     with alloc() as res1, alloc() as res2, alloc() as res3:
>         # do stuff

Or, to more closely mirror the original example:

    # Define these wherever the current resources are defined
    @contextlib.contextmanager
    def cm1():
        res1 = allocate_res1()
        try:
            yield res1
        finally:
            cleanup_res1()

    @contextlib.contextmanager
    def cm2():
        res2 = allocate_res2()
        try:
            yield res2
        finally:
            cleanup_res2()

    @contextlib.contextmanager
    def cm3():
        res3 = allocate_res3()
        try:
            yield res3
        finally:
            cleanup_res3()

    def func():
        with cm1() as res1:
            # do stuff
            with cm2() as res2:
                # do stuff
                with cm3() as res3:
                    # do stuff

Any time a with statement's body consists solely of another with statement you can collapse them into one line as Chris did in his example.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From andrew at bemusement.org Wed Oct 19 06:02:14 2011
From: andrew at bemusement.org (Andrew Bennetts)
Date: Wed, 19 Oct 2011 15:02:14 +1100
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87pqhtafrz.fsf@vostro.rath.org>
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID: <20111019040214.GA5524@flay.puzzling.org>

On Tue, Oct 18, 2011 at 10:14:56PM -0400, Nikolaus Rath wrote:
[...]
> I would much rather have something like this:
>
> def my_fun():
>     allocate_res1()
>     atreturn.register(cleanup_res1)
>     # do stuff
>     allocate_res2()
>     atreturn.register(cleanup_res2)
>     # do stuff
>     allocate_res3()
>     atreturn.register(cleanup_res3)
>     # do stuff
>     return
>
> Has the idea of implementing such "on return" handlers ever come up?
> Maybe there is some tricky way to do this with function decorators?

The "with" statement is a good answer. If for some reason you need to be compatible with version of Python so old it doesn't have it, then try the bzrlib.cleanup module in bzr. It implements the sort of API you describe above. And there are times when an API like that might be nicer to use anyway, e.g. when you have conditional allocations like:

    def foo():
        res1 = allocate_res1()
        add_cleanup(res1.release)
        ...
        if cond:
            res2 = allocate_res2()
            add_cleanup(res2.release)
        res3 = allocate_res3()
        add_cleanup(res3.release)
        ...
        do_stuff()

And avoiding the cascading indents of multiple with statements can be nice too.

-Andrew.

From aquavitae69 at gmail.com Wed Oct 19 08:44:40 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Wed, 19 Oct 2011 08:44:40 +0200
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403)
In-Reply-To:
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz>
Message-ID:

Regarding out of order execution: how is pep 3150 any worse than "x = y if z else w"? Apart from the fact that this sort of statement is usually quite short, the middle part still gets executed before the start. Either way, out of order execution hurts readability. There's a reason that even mathematicians usually define terms before using them.
On the other hand, I have read countless scientific papers which define functions along the lines of "this phenomenon can be represented by the equation x=2sin(yz), where x is the natural frequency, y is the height of the giraffe and z is the number of penguins". Just saying... Changing the subject slightly, I haven't studied the details of the proposed grammar, but if given can be used in simple statements, this implies that the following would be possible: def function_with_annotations() -> x given: x = 4: function_body() I assume I'm not alone in thinking that this is a really bad idea (if the current grammar does actually allow it). However, "given" in def statements would be a great way of dealing with the recently discussed question of how to define function variables at definition time: def function() given (len=len): ... David -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Oct 19 11:43:39 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 19 Oct 2011 05:43:39 -0400 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz> Message-ID: On 10/19/2011 2:44 AM, David Townshend wrote: > Regarding out of order execution: how is pep 3150 any worse than "x = y > if z else w"? A lot of people did not like that version of a conditional expression and argued against it. If it is used to justify other backwards constructs, I like it even less. > On the other hand, I have read countless scientific papers which define > functions along the lines of "this phenomenon can be represented by the > equation x=2sin(yz), where x is the natural frequency, y is the height > of the giraffe and z is the number of penguins". Just saying... That works once in a while, in a sentence. An important different is that assignment actions are different from equality (identity) statements. -- Terry Jan Reedy From ncoghlan at gmail.com Wed Oct 19 13:52:11 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Oct 2011 21:52:11 +1000 Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!' to PEP 403) In-Reply-To: References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <4E9B773F.3010501@pearwood.info> <4E9BDEBC.2070401@canterbury.ac.nz> Message-ID: On Wed, Oct 19, 2011 at 7:43 PM, Terry Reedy wrote: > On 10/19/2011 2:44 AM, David Townshend wrote: >> >> Regarding out of order execution: how is pep 3150 any worse than "x = y >> if z else w"? > > A lot of people did not like that version of a conditional expression and > argued against it. If it is used to justify other backwards constructs, I > like it even less. I happened to be doing a lot of string formatting today, and it occurred to me that because Python doesn't allow implicit string interpolation, *all* string formatting is based on forward references - we define placeholders in the format strings and only fill them in later during the actual formatting call. PEP 3150 is essentially just about doing that for arbitrary expressions by permitting named placeholders that are filled in from the following suite. I agree that using any parallels with conditional expressions as justification for out of order execution is a terrible idea, though. 
The chosen syntax was definitely a case of "better than the error prone status quo and not as ugly as the other alternatives proposed" rather than "oh, this is a wonderful way to do it" :P

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From Nikolaus at rath.org Wed Oct 19 15:38:47 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 19 Oct 2011 09:38:47 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: (Nick Coghlan's message of "Wed, 19 Oct 2011 13:58:07 +1000")
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID: <87wrc1cd94.fsf@inspiron.ap.columbia.edu>

Nick Coghlan writes:
> On Wed, Oct 19, 2011 at 1:14 PM, Chris Rebert wrote:
>> On Tue, Oct 18, 2011 at 7:14 PM, Nikolaus Rath wrote:
>>> Hello,
>>>
>>> I often have code of the form:
>>>
>>> def my_fun():
>>>     allocate_res1()
>>>     try:
>>>         # do stuff
>>>         allocate_res2()
>>>         try:
>>>             # do stuff
>>>             allocate_res3()
>>>             try:
>>>                 # do stuff
>>>             finally:
>>>                 cleanup_res3()
>>>         finally:
>>>             cleanup_res2()
>>>     finally:
>>>         cleanup_res1()
>>>
>>>     return
>>>
>>> With increasing number of managed resources, the indentation becomes
>>> really annoying, there is lots of line noise, and I don't like the fact
>>> that the cleanup is so far away from the allocation.
>>
>> Use the `with` statement and context managers. They were added for
>> this exact situation.
>> See http://www.python.org/dev/peps/pep-0343/
>>
>> Resulting code will resemble:
>>
>> def func():
>>     with alloc() as res1, alloc() as res2, alloc() as res3:
>>         # do stuff
>
> Or, to more closely mirror the original example:
>
> # Define these wherever the current resources are defined
> @contextlib.contextmanager
> def cm1():
>     res1 = allocate_res1()
>     try:
>         yield res1
>     finally:
>         cleanup_res1()
>
> @contextlib.contextmanager
> def cm2():
>     res2 = allocate_res2()
>     try:
>         yield res2
>     finally:
>         cleanup_res2()
>
> @contextlib.contextmanager
> def cm3():
>     res3 = allocate_res3()
>     try:
>         yield res3
>     finally:
>         cleanup_res3()
>
> def func():
>     with cm1() as res1:
>         # do stuff
>         with cm2() as res2:
>             # do stuff
>             with cm3() as res3:
>                 # do stuff

Indeed, that works. But do you really consider this code nicer than the original one? I think a simple line count answers the question :-).

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From Nikolaus at rath.org Wed Oct 19 15:36:58 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 19 Oct 2011 09:36:58 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: (Chris Rebert's message of "Tue, 18 Oct 2011 20:14:09 -0700")
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID: <87zkgxcdc5.fsf@inspiron.ap.columbia.edu>

Chris Rebert writes:
> On Tue, Oct 18, 2011 at 7:14 PM, Nikolaus Rath wrote:
>> Hello,
>>
>> I often have code of the form:
>>
>> def my_fun():
>>     allocate_res1()
>>     try:
>>         # do stuff
>>         allocate_res2()
>>         try:
>>             # do stuff
>>             allocate_res3()
>>             try:
>>                 # do stuff
>>             finally:
>>                 cleanup_res3()
>>         finally:
>>             cleanup_res2()
>>     finally:
>>         cleanup_res1()
>>
>>     return
>>
>> With increasing number of managed resources, the indentation becomes
>> really annoying, there is lots of line noise, and I don't like the fact
>> that the cleanup is so far away from the allocation.
>
> Use the `with` statement and context managers. They were added for
> this exact situation.
> See http://www.python.org/dev/peps/pep-0343/
>
> Resulting code will resemble:
>
> def func():
>     with alloc() as res1, alloc() as res2, alloc() as res3:
>         # do stuff

I think they're not for exactly this situation for two reasons:

1. This requires the alloc() functions to be context managers. If they're not, then I need to code a wrapping context manager as well. It's probably possible to write a generic wrapper that works for any cleanup function, but the result is not going to look very nice.

2. If I don't want to allocate the resources all at the same time, the indentation mess is still the same:

    def func():
        with alloc() as res1:
            # do stuff
            with alloc() as res2:
                # do stuff
                with alloc() as res3:
                    # do stuff

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From Nikolaus at rath.org Wed Oct 19 15:54:24 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 19 Oct 2011 09:54:24 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111019040214.GA5524@flay.puzzling.org> (Andrew Bennetts's message of "Wed, 19 Oct 2011 15:02:14 +1100")
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org>
Message-ID: <87lishccj3.fsf@inspiron.ap.columbia.edu>

Andrew Bennetts writes:
> On Tue, Oct 18, 2011 at 10:14:56PM -0400, Nikolaus Rath wrote:
> [...]
>> I would much rather have something like this:
>>
>> def my_fun():
>>     allocate_res1()
>>     atreturn.register(cleanup_res1)
>>     # do stuff
>>     allocate_res2()
>>     atreturn.register(cleanup_res2)
>>     # do stuff
>>     allocate_res3()
>>     atreturn.register(cleanup_res3)
>>     # do stuff
>>     return
>>
>> Has the idea of implementing such "on return" handlers ever come up?
>> Maybe there is some tricky way to do this with function decorators?
>
> The "with" statement is a good answer. If for some reason you need to
> be compatible with version of Python so old it doesn't have it, then try
> the bzrlib.cleanup module in bzr. It implements the sort of API you
> describe above.

Yes, that's sort of what I was thinking about. I think the API is still more convoluted than necessary, but that's probably because it works with Python 2.4.

Having thought about this a bit more, I think it should be possible to use 'with' rather than decorators to implement something like this:

    with CleanupManager() as mngr:
        allocate_res1()
        mngr.register(cleanup_res1)
        # do stuff
        allocate_res2()
        mngr.register(cleanup_res2)
        # do stuff
        allocate_res3()
        mngr.register(cleanup_res3)
        # do stuff

The mngr object would just run all the registered functions when the block is exited.

Thoughts?

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From p.f.moore at gmail.com Wed Oct 19 15:57:21 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 19 Oct 2011 14:57:21 +0100
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87pqhtafrz.fsf@vostro.rath.org>
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID:

On 19 October 2011 03:14, Nikolaus Rath wrote:
> Hello,
>
> I often have code of the form:
>
> def my_fun():
>     allocate_res1()
>     try:
>         # do stuff
>         allocate_res2()
>         try:
>             # do stuff
>             allocate_res3()
>             try:
>                 # do stuff
>             finally:
>                 cleanup_res3()
>         finally:
>             cleanup_res2()
>     finally:
>         cleanup_res1()
>
>     return
>
> With increasing number of managed resources, the indentation becomes
> really annoying, there is lots of line noise, and I don't like the fact
> that the cleanup is so far away from the allocation.
>
> I would much rather have something like this:
>
> def my_fun():
>     allocate_res1()
>     atreturn.register(cleanup_res1)
>     # do stuff
>     allocate_res2()
>     atreturn.register(cleanup_res2)
>     # do stuff
>     allocate_res3()
>     atreturn.register(cleanup_res3)
>     # do stuff
>     return
>
> Has the idea of implementing such "on return" handlers ever come up?
> Maybe there is some tricky way to do this with function decorators?

Here's a "tricky way with decorators" :-) :

    >>> def withexit(f):
    ...     ex = []
    ...     def atex(g): ex.append(g)
    ...     def wrapper():
    ...         f()
    ...         for g in ex: g()
    ...     wrapper.atex = atex
    ...     return wrapper
    ...
    >>> def p1(): print "one"
    ...
    >>> def p2(): print "two"
    ...
    >>> @withexit
    ... def ff():
    ...     print 1
    ...     ff.atex(p1)
    ...     print 2
    ...     ff.atex(p2)
    ...     print 3
    ...
    >>> ff()
    1
    2
    3
    one
    two
    >>>

Paul.
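(A minimal sketch of the CleanupManager described above - the reverse execution order and argument passing are assumptions beyond what the proposal itself specifies, and the handling of errors raised by the cleanups themselves is deliberately left open:)

    class CleanupManager:
        """Run registered cleanup functions when the block exits."""

        def __init__(self):
            self._callbacks = []

        def register(self, func, *args, **kwargs):
            self._callbacks.append((func, args, kwargs))

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc, tb):
            # Run cleanups in reverse registration order, even on error.
            while self._callbacks:
                func, args, kwargs = self._callbacks.pop()
                func(*args, **kwargs)
            return False    # never suppress the original exception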
From p.f.moore at gmail.com Wed Oct 19 16:00:48 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 19 Oct 2011 15:00:06 +0100
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87lishccj3.fsf@inspiron.ap.columbia.edu>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu>
Message-ID:

On 19 October 2011 14:54, Nikolaus Rath wrote:
>>> The "with" statement is a good answer. If for some reason you need to
>>> be compatible with version of Python so old it doesn't have it, then try
>>> the bzrlib.cleanup module in bzr. It implements the sort of API you
>>> describe above.
>>
>> Yes, that's sort of what I was thinking about. I think the API is still
>> more convoluted than necessary, but that's probably because it works
>> with Python 2.4.
>>
>> Having thought about this a bit more, I think it should be possible to
>> use 'with' rather than decorators to implement something like this:
>>
>> with CleanupManager() as mngr:
>>     allocate_res1()
>>     mngr.register(cleanup_res1)
>>     # do stuff
>>     allocate_res2()
>>     mngr.register(cleanup_res2)
>>     # do stuff
>>     allocate_res3()
>>     mngr.register(cleanup_res3)
>>     # do stuff
>>
>> The mngr object would just run all the registered functions when the
>> block is exited.

That's probably better than my decorator suggestion, because it allows you to limit the scope precisely, rather than just being function-scope. The CleanupManager class might make a good addition to contextlib, in actual fact...

Paul.

From jh at improva.dk Wed Oct 19 16:03:31 2011
From: jh at improva.dk (Jacob Holm)
Date: Wed, 19 Oct 2011 16:03:31 +0200
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87pqhtafrz.fsf@vostro.rath.org>
References: <87pqhtafrz.fsf@vostro.rath.org>
Message-ID: <4E9ED8B3.4050607@improva.dk>

On 2011-10-19 04:14, Nikolaus Rath wrote:
>
> I would much rather have something like this:
>
> def my_fun():
>     allocate_res1()
>     atreturn.register(cleanup_res1)
>     # do stuff
>     allocate_res2()
>     atreturn.register(cleanup_res2)
>     # do stuff
>     allocate_res3()
>     atreturn.register(cleanup_res3)
>     # do stuff
>     return
>
> Has the idea of implementing such "on return" handlers ever come up?
> Maybe there is some tricky way to do this with function decorators?
>

How about a not-so-tricky solution using context managers? Something like (untested):

    import contextlib

    @contextlib.contextmanager
    def atwithexit():
        handlers = []
        try:
            yield handlers.append
        finally:
            for h in reversed(handlers):
                h()

    def my_fun():
        with atwithexit() as atreturn:
            allocate_res1()
            atreturn(cleanup_res1)
            # do stuff
            allocate_res2()
            atreturn(cleanup_res2)
            # do stuff
            allocate_res3()
            atreturn(cleanup_res3)
            # do stuff
            return

HTH
- Jacob

From Nikolaus at rath.org Wed Oct 19 16:43:19 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 19 Oct 2011 10:43:19 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: (Paul Moore's message of "Wed, 19 Oct 2011 15:00:48 +0100")
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu>
Message-ID: <87fwipca9k.fsf@inspiron.ap.columbia.edu>

Paul Moore writes:
> On 19 October 2011 14:54, Nikolaus Rath wrote:
>>> The "with" statement is a good answer. If for some reason you need to
>>> be compatible with version of Python so old it doesn't have it, then try
>>> the bzrlib.cleanup module in bzr. It implements the sort of API you
>>> describe above.
>>
>> Yes, that's sort of what I was thinking about. I think the API is still
>> more convoluted than necessary, but that's probably because it works
>> with Python 2.4.
>>
>> Having thought about this a bit more, I think it should be possible to
>> use 'with' rather than decorators to implement something like this:
>>
>> with CleanupManager() as mngr:
>>     allocate_res1()
>>     mngr.register(cleanup_res1)
>>     # do stuff
>>     allocate_res2()
>>     mngr.register(cleanup_res2)
>>     # do stuff
>>     allocate_res3()
>>     mngr.register(cleanup_res3)
>>     # do stuff
>>
>> The mngr object would just run all the registered functions when the
>> block is exited.
>
> That's probably better than my decorator suggestion, because it allows
> you to limit the scope precisely, rather than just being
> function-scope. The CleanupManager class might make a good addition to
> contextlib, in actual fact...

What would be the best way to handle errors during cleanup? Personally I would log them with logging.exception and discard them, but using the logging module is probably not a good option for contextlib, or is it?

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From mikegraham at gmail.com Wed Oct 19 21:41:15 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Wed, 19 Oct 2011 15:41:15 -0400
Subject: [Python-ideas] PEP 355 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On Thu, Oct 13, 2011 at 12:43 AM, Raymond Hettinger wrote:
>
> Have you considered that what-is-good-for-numpy isn't necessarily good for Python as a whole?
> Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users. In contrast, rich comparisons have burdened everyone (we've paid a price in many ways).

Rich comparisons added complication to Python, but was a very worthwhile feature. In addition to numpy, they are used by packages like sqlalchemy and sympy to give a more natural syntax to some operations. The introduction of rich comparisons also included the introduction of NotImplemented (IIRC), which adds even more complication but makes it possible to write more powerful code.

__cmp__ also had a somewhat odd (though not unique) API, which I many times saw confuse learners.

In any event, I don't think rich comparisons affect most users, who very seldom have an excuse to write a set of comparison operators (using __cmp__ or rich comparisons).

> The numeric world really needs more operators than Python provides (a matrix multiplication operator for example), but I don't think Python is better-off by letting those needs leak back into the core language one-at-a-time.

Having used numpy fairly extensively, I disagree. I don't mind having to call a normal function/method like solve, dot, conj, or transpose in circumstances where a language like Matlab would have a dedicated operator. In fact, I could argue for these things that what Python does Python's way is superior to Matlab's in particular, as most of these operators have or are related to problematic features or syntax.

I do, however, regularly write "(a < b) & (b < c)" and hate it; a little observation reveals it is quite terrible.
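(Concretely, with small made-up arrays - the chained form raises, so the element-wise form is currently required:)

    import numpy as np

    a = np.array([1, 4])
    b = np.array([2, 2])
    c = np.array([3, 3])
    # "a < b < c" raises ValueError: the truth value of an array with
    # more than one element is ambiguous.
    mask = (a < b) & (b < c)              # element-wise: array([ True, False])
    same = np.logical_and(a < b, b < c)   # the spelled-out equivalent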
That being said, I think the fault might be as much numpy's as anything. An API like b.isbetween(a, c) or even (a < b).logicaland(b < c) would probably be nicer than the current typical solution. Though these fall short of being able to write a < b < c, which would be consistent and obvious, they would perhaps be enough to weaken the idea that a semantic change in Python could be beneficial.

I'm still not seeing the great harm this will have on normal Python programmers who don't wish to overload boolean operators. Unlike rich comparisons, which deprecated the standard way to do things, in this case the developer using Python can do the exact same thing she was doing all along and get the same results.

Mike

From yselivanov.ml at gmail.com  Wed Oct 19 21:55:52 2011
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Wed, 19 Oct 2011 15:55:52 -0400
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

Not to mention that support for overloadable boolean operations would be very beneficial for some ORMs too. As of now, the only options are to write something like "ops.And(cond1, ops.Or(cond2, cond3))", or to use the bit-logic operators, which have different precedence.

- Yury

On 2011-10-19, at 3:41 PM, Mike Graham wrote:

> On Thu, Oct 13, 2011 at 12:43 AM, Raymond Hettinger wrote:
>>
>> Have you considered that what-is-good-for-numpy isn't necessarily good for Python as a whole?
>> Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users. In contrast, rich comparisons have burdened everyone (we've paid a price in many ways).
>
> Rich comparisons added complication to Python, but they were a very worthwhile feature. In addition to numpy, they are used by packages like sqlalchemy and sympy to give a more natural syntax to some operations. The introduction of rich comparisons also included the introduction of NotImplemented (IIRC), which adds even more complication but makes it possible to write more powerful code.
>
> __cmp__ also had a somewhat odd (though not unique) API, which I saw confuse learners many times.
>
> In any event, I don't think rich comparisons affect most users, who very seldom have an excuse to write a set of comparison operators (using __cmp__ or rich comparisons).
>
>> The numeric world really needs more operators than Python provides (a matrix multiplication operator for example), but I don't think Python is better-off by letting those needs leak back into the core language one-at-a-time.
>
> Having used numpy fairly extensively, I disagree. I don't mind having to call a normal function/method like solve, dot, conj, or transpose in circumstances where a language like Matlab would have a dedicated operator. In fact, I could argue that for these things Python's way is superior to Matlab's in particular, as most of these operators have or are related to problematic features or syntax. I do, however, regularly write "(a < b) & (b < c)" and hate it; a little observation reveals it is quite terrible.
>
> That being said, I think the fault might be as much numpy's as anything. An API like b.isbetween(a, c) or even (a < b).logicaland(b < c) would probably be nicer than the current typical solution. Though these fall short of being able to write a < b < c, which would be consistent and obvious, they would perhaps be enough to weaken the idea that a semantic change in Python could be beneficial.
>
> I'm still not seeing the great harm this will have on normal Python programmers who don't wish to overload boolean operators. Unlike rich comparisons, which deprecated the standard way to do things, in this case the developer using Python can do the exact same thing she was doing all along and get the same results.
>
> Mike
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From raymond.hettinger at gmail.com  Wed Oct 19 22:25:57 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 19 Oct 2011 13:25:57 -0700
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On Oct 19, 2011, at 12:41 PM, Mike Graham wrote:
> I'm still not seeing the great harm this will have on normal Python programmers who don't wish to overload boolean operators.

It is harmful.  The and/or operators are not currently dependent on the underlying objects.  They can be compiled and explained simply in terms of if's.  They are control flow, not just logic operators. We explain short circuiting once and everybody gets it. But that changes if short-circuiting only happens with certain inputs. It makes it much harder to look at code and know what it does.

I'm reminded of the effort to make "is" over-loadable. It finally got shot down because it so profoundly messed with people's understanding of identity and because it would no longer be possible to easily reason about code (i.e. it becomes difficult to assure that simple container code is correct without knowing what kind of objects were going to be stored in the container).

In a way, the and/or/not overloading suggestion is worse than rich comparisons because even the simplest "a and b" expression would have to be compiled in a profoundly different way (and the related peephole optimizations would no longer be valid).  Everyone would pay the price.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com  Wed Oct 19 22:36:25 2011
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 19 Oct 2011 16:36:25 -0400
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On Wed, Oct 19, 2011 at 4:25 PM, Raymond Hettinger wrote:
..
> I'm reminded of the effort to make "is" over-loadable. It finally got shot down because it so profoundly messed with people's understanding

I would even mention the whitespace overloading "idea": http://www2.research.att.com/~bs/whitespace98.pdf

Even in C++, not everything can be overloaded. Some applications are best served by a special purpose language rather than by adding features to a general purpose one.

From ziade.tarek at gmail.com  Wed Oct 19 23:32:21 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 19 Oct 2011 23:32:21 +0200
Subject: [Python-ideas] The way decorators are parsed
Message-ID:

Hello,

Today I've tried to write a one-liner for a decorator. The decorator is a method in a class. I wanted to do something like this:

@Class().decorator()
def function():
    ...

That threw a syntax error to my surprise. But the semantics are correct, since I am currently writing:

obj = Class()

@obj.decorator()
def function():
    ...

And I can also write

dec = Class().decorator

@dec()
def function():
    ...

Is there something obvious I am missing, or is there a weird thing in the way decorators are parsed?

Demo:

>>> class Some(object):
...     def stuff(self, func):
...         return func
...
>>> s = Some()
>>> @s.stuff
... def ok():
...     print 'ok'
...
>>> ok()
ok
>>> s = Some().stuff
>>> @s
... def ok():
...     print 'ok'
...
>>> ok()
ok
>>> @Some().stuff
  File "<stdin>", line 1
    @Some().stuff
         ^
SyntaxError: invalid syntax

Cheers
Tarek

--
Tarek Ziadé | http://ziade.org

From pyideas at rebertia.com  Wed Oct 19 23:43:41 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 19 Oct 2011 14:43:41 -0700
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

> On Wed, Oct 19, 2011 at 2:32 PM, Tarek Ziadé wrote:
> Hello,
>
> Today I've tried to write a one-liner for a decorator. The decorator is a method in a class.
>
> I wanted to do something like this:
>
> @Class().decorator()
> def function():
>     ...
>
> That threw a syntax error to my surprise.

> Is there something obvious I am missing, or is there a weird thing in the way decorators are parsed?

PEP 318 -- Decorators for Functions and Methods (http://www.python.org/dev/peps/pep-0318/ ):
"Current Syntax
[...]
The decorator statement is limited in what it can accept -- arbitrary expressions will not work. Guido preferred this because of a gut feeling [17]."
[17]: http://mail.python.org/pipermail/python-dev/2004-August/046711.html

According to Python 2.7's grammar (http://docs.python.org/reference/grammar.html ):

    decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
    dotted_name: NAME ('.' NAME)*

So, you're limited to an arbitrarily-long sequence of attribute accesses, followed by an optional call.

Cheers,
Chris
--
http://rebertia.com

From ericsnowcurrently at gmail.com  Wed Oct 19 23:51:23 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 19 Oct 2011 15:51:23 -0600
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

On Wed, Oct 19, 2011 at 3:43 PM, Chris Rebert wrote:
>> On Wed, Oct 19, 2011 at 2:32 PM, Tarek Ziadé wrote:
>> Hello,
>>
>> Today I've tried to write a one-liner for a decorator. The decorator is a method in a class.
>>
>> I wanted to do something like this:
>>
>> @Class().decorator()
>> def function():
>>     ...
>>
>> That threw a syntax error to my surprise.
>
>> Is there something obvious I am missing, or is there a weird thing in the way decorators are parsed?
>
> PEP 318 -- Decorators for Functions and Methods (http://www.python.org/dev/peps/pep-0318/ ):
> "Current Syntax
> [...]
> The decorator statement is limited in what it can accept -- arbitrary expressions will not work. Guido preferred this because of a gut feeling [17]."
> [17]: http://mail.python.org/pipermail/python-dev/2004-August/046711.html
>
> According to Python 2.7's grammar (http://docs.python.org/reference/grammar.html ):
>
>     decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
>     dotted_name: NAME ('.' NAME)*
>
> So, you're limited to an arbitrarily-long sequence of attribute accesses, followed by an optional call.

Interesting. I would have expected at least the following to work:

    decorator: '@' NAME trailer* NEWLINE
    trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME

Regardless, good to know, even if uncommon.
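(For the record, the relaxed grammar above would accept decorators like the following -- a parse-level sketch only, with made-up names:)

@buttons[0].clicked          # subscription plus attribute access
def on_click():
    pass

@Class().decorator()         # the original example from this thread
def function():
    pass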
-eric

>
> Cheers,
> Chris
> --
> http://rebertia.com
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From ziade.tarek at gmail.com  Wed Oct 19 23:52:53 2011
From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=)
Date: Wed, 19 Oct 2011 23:52:53 +0200
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

On Wed, Oct 19, 2011 at 11:43 PM, Chris Rebert wrote:
>
>> Is there something obvious I am missing, or is there a weird thing in the way decorators are parsed?
>
> PEP 318 -- Decorators for Functions and Methods (http://www.python.org/dev/peps/pep-0318/ ):
> "Current Syntax
> [...]
> The decorator statement is limited in what it can accept -- arbitrary expressions will not work. Guido preferred this because of a gut feeling [17]."
> [17]: http://mail.python.org/pipermail/python-dev/2004-August/046711.html
>
> According to Python 2.7's grammar (http://docs.python.org/reference/grammar.html ):
>
>     decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
>     dotted_name: NAME ('.' NAME)*
>
> So, you're limited to an arbitrarily-long sequence of attribute accesses, followed by an optional call.

Thanks for the pointers

>
> Cheers,
> Chris
> --
> http://rebertia.com
>

--
Tarek Ziadé | http://ziade.org

From greg.ewing at canterbury.ac.nz  Thu Oct 20 00:25:06 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 20 Oct 2011 11:25:06 +1300
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID: <4E9F4E42.6000006@canterbury.ac.nz>

Raymond Hettinger wrote:

> In a way, the and/or/not overloading suggestion is worse than rich comparisons because even the simplest "a and b" expression would have to be compiled in a profoundly different way (and the related peephole optimizations would no longer be valid).

What peephole optimisations are currently applied to boolean expressions?

--
Greg

From p.f.moore at gmail.com  Thu Oct 20 00:25:21 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 19 Oct 2011 23:25:21 +0100
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On 19 October 2011 21:25, Raymond Hettinger wrote:
> It is harmful.  The and/or operators are not currently dependent on the underlying objects.  They can be compiled and explained simply in terms of if's.  They are control flow, not just logic operators. We explain short circuiting once and everybody gets it. But that changes if short-circuiting only happens with certain inputs. It makes it much harder to look at code and know what it does.
> I'm reminded of the effort to make "is" over-loadable. It finally got shot down because it so profoundly messed with people's understanding of identity and because it would no longer be possible to easily reason about code (i.e. it becomes difficult to assure that simple container code is correct without knowing what kind of objects were going to be stored in the container).
> In a way, the and/or/not overloading suggestion is worse than rich comparisons because even the simplest "a and b" expression would have to be compiled in a profoundly different way (and the related peephole optimizations would no longer be valid).  Everyone would pay the price.

An interesting point is that while the proposal is about overloading the logical operators, many of the arguments in favour are referring to chained comparisons. If the rich comparison mechanisms were somehow extended to cover chained comparisons, would that satisfy people's requirements without needing logical operator overloading?

I'm not saying I agree with the idea of overloading chained comparisons either, just wondering if a less ambitious proposal would be of any value.

Personally, I have no use for any of this...

Paul.

From ncoghlan at gmail.com  Thu Oct 20 00:27:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2011 08:27:20 +1000
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

On Thu, Oct 20, 2011 at 6:25 AM, Raymond Hettinger wrote:
> In a way, the and/or/not overloading suggestion is worse than rich comparisons because even the simplest "a and b" expression would have to be compiled in a profoundly different way (and the related peephole optimizations would no longer be valid).  Everyone would pay the price.

Indeed. I actually think adding '&&' and '||' for the binary logical operator purposes described in PEP 335 would be a preferable alternative to messing with the meaning of 'and' and 'or' as flow control expressions. The meaning of chained comparisons could then also be updated accordingly so that "a < b < c" translated to "a < b && b < c" if the result of "a < b" overloaded the logical and operation, but would still short circuit otherwise.

I'm not saying I think that's necessarily a *good* idea - I'm just saying I dislike it less than the approach currently proposed by the PEP.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz  Thu Oct 20 00:35:32 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 20 Oct 2011 11:35:32 +1300
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID: <4E9F50B4.8000306@canterbury.ac.nz>

Paul Moore wrote:

> An interesting point is that while the proposal is about overloading the logical operators, many of the arguments in favour are referring to chained comparisons.

Not really -- the matter of chained comparisons was only brought up recently. There's much more behind it than that.

> If the rich comparison mechanisms were somehow extended to cover chained comparisons, would that satisfy people's requirements without needing logical operator overloading?

It wouldn't satisfy any of the use cases I had in mind when I wrote the PEP.

--
Greg

From ncoghlan at gmail.com  Thu Oct 20 00:36:10 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2011 08:36:10 +1000
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

On Thu, Oct 20, 2011 at 7:52 AM, Tarek Ziadé wrote:
>> So, you're limited to an arbitrarily-long sequence of attribute accesses, followed by an optional call.
>
> Thanks for the pointers

In the time since, Guido gave his approval to removing the restriction, but nobody has been interested enough to actually implement the change.
He was persuaded the restriction was pointless largely due to people using tricks like the following to avoid it:

def deco(x):
    return x

@deco(anything[I].like().can.go[here].and_the.compiler.will_not(care))
def f():
    pass

(There were also legitimate use cases related to looking up decorators via a subscript rather than a function call).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From raymond.hettinger at gmail.com  Thu Oct 20 02:16:29 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 19 Oct 2011 17:16:29 -0700
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To: <4E9F4E42.6000006@canterbury.ac.nz>
References: <4E9F4E42.6000006@canterbury.ac.nz>
Message-ID: <8D7D56ED-93B7-42A0-AC06-8FAAEACA5AC1@gmail.com>

On Oct 19, 2011, at 3:25 PM, Greg Ewing wrote:

> What peephole optimisations are currently applied to boolean expressions?

Here's the comment from Python/peephole.c:

/* Simplify conditional jump to conditional jump where the
   result of the first test implies the success of a similar
   test or the failure of the opposite test.
   Arises in code like:
   "if a and b:"
   "if a or b:"
   "a and b or c"
   "(a and b) and c"
   x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_FALSE_OR_POP z
      -->  x:JUMP_IF_FALSE_OR_POP z
   x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_TRUE_OR_POP z
      -->  x:POP_JUMP_IF_FALSE y+3
   where y+3 is the instruction following the second test.
*/

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zuo at chopin.edu.pl  Thu Oct 20 02:19:22 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Thu, 20 Oct 2011 02:19:22 +0200
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87fwipca9k.fsf@inspiron.ap.columbia.edu>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu>
Message-ID: <20111020001922.GA2191@chopin.edu.pl>

Nikolaus Rath dixit (2011-10-19, 10:43):

> >> with CleanupManager() as mngr:
> >>     allocate_res1()
> >>     mngr.register(cleanup_res1)
> >>     # do stuff
> >>     allocate_res2()
> >>     mngr.register(cleanup_res2)
> >>     # do stuff
> >>     allocate_res3()
> >>     mngr.register(cleanup_res3)
> >>     # do stuff
> >>
> >> The mngr object would just run all the registered functions when the block is exited.
[snip]
> What would be the best way to handle errors during cleanup? Personally I would log them with logging.exception and discard them, but using the logging module is probably not a good option for contextlib, or is it?

I'd suggest something like the following:

class CleanupManager:

    _default_error_handler = lambda exc_type, exc_value, tb: False

    def __init__(self, error_handler=_default_error_handler):
        self.error_handler = error_handler
        self.cleanup_callbacks = []

    def register(self, callback):
        self.cleanup_callbacks.append(callback)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        try:
            if exc_value is not None:
                # if returns True, exception will be suppressed...
                return self.error_handler(exc_type, exc_value, tb)
        finally:
            # ...except when something goes wrong in the callbacks
            self._next_callback()

    def _next_callback(self):
        if self.cleanup_callbacks:
            callback = self.cleanup_callbacks.pop()
            try:
                callback()
            finally:
                # all callbacks to be used + all errors to be reported
                self._next_callback()

Then a user can specify any error callback they need, e.g.:

>>> def error_printer(exc_type, exc_value, tb):
...     print(exc_type.__name__, exc_value)
...     return True
...
>>> with CleanupManager(error_printer) as cm:
...     cm.register(lambda: print(1))
...     cm.register(lambda: print(2))
...     raise ValueError('spam')
...
ValueError spam
2
1

Please also note that all cleanup callbacks will be used and, at the same time, no exception will remain unnoticed -- that within the with-block (handled with the error handler), but also that from the error handler as well as those from all cleanup handlers (as long as we talk about Py3.x, with its cool exception chaining feature):

>>> erroneous_error_handler = (lambda exc_tp, exc, tb: ''/3)
>>> with CleanupManager(erroneous_error_handler) as cm:
...     cm.register(lambda: 1/0)    # error (division by 0)
...     cm.register(lambda: 44+'')  # error (bad operand type)
...     raise ValueError('spam')    # error
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
ValueError: spam

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "cleanup_manager.py", line 19, in __exit__
    return self.error_handler(exc_type, exc_value, tb)
  File "<stdin>", line 1, in <lambda>
TypeError: unsupported operand type(s) for /: 'str' and 'int'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "cleanup_manager.py", line 28, in _next_callback
    callback()
  File "<stdin>", line 3, in <lambda>
TypeError: unsupported operand type(s) for +: 'int' and 'str'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "cleanup_manager.py", line 22, in __exit__
    self._next_callback()
  File "cleanup_manager.py", line 31, in _next_callback
    self._next_callback()
  File "cleanup_manager.py", line 28, in _next_callback
    callback()
  File "<stdin>", line 2, in <lambda>
ZeroDivisionError: division by zero

Cheers.
*j

From greg.ewing at canterbury.ac.nz  Thu Oct 20 02:30:45 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 20 Oct 2011 13:30:45 +1300
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To: <8D7D56ED-93B7-42A0-AC06-8FAAEACA5AC1@gmail.com>
References: <4E9F4E42.6000006@canterbury.ac.nz> <8D7D56ED-93B7-42A0-AC06-8FAAEACA5AC1@gmail.com>
Message-ID: <4E9F6BB5.8020803@canterbury.ac.nz>

On 20/10/11 13:16, Raymond Hettinger wrote:

> /* Simplify conditional jump to conditional jump where the
>    result of the first test implies the success of a similar
>    test or the failure of the opposite test.
>    Arises in code like:
>    "if a and b:"
>    "if a or b:"
>    "a and b or c"
>    "(a and b) and c"
>    x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_FALSE_OR_POP z
>       -->  x:JUMP_IF_FALSE_OR_POP z
>    x:JUMP_IF_FALSE_OR_POP y   y:JUMP_IF_TRUE_OR_POP z
>       -->  x:POP_JUMP_IF_FALSE y+3
>    where y+3 is the instruction following the second test.
> */

While the existing peephole optimisations wouldn't work as-is, there's no reason that similarly efficient code couldn't be generated, either by peephole or using a different compilation strategy to begin with.
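For example, "x and y" could be compiled along the following lines under the proposal (a sketch of one possible instruction sequence using the new bytecodes, not the actual output of the prototype):

          LOAD_NAME        x
          LOGICAL_AND_1    done    # phase 1: may produce the result
                                   # itself and skip the second operand
          LOAD_NAME        y
          LOGICAL_AND_2            # phase 2: combine the two operands
    done: ...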
There are some comments about this in the draft PEP update that I'll try to get submitted soon.

--
Greg

From Nikolaus at rath.org  Thu Oct 20 04:19:10 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Wed, 19 Oct 2011 22:19:10 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111020001922.GA2191@chopin.edu.pl> (Jan Kaliszewski's message of "Thu, 20 Oct 2011 02:19:22 +0200")
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl>
Message-ID: <87mxcw1k2p.fsf@vostro.rath.org>

Jan Kaliszewski writes:
> Nikolaus Rath dixit (2011-10-19, 10:43):
>
>> >> with CleanupManager() as mngr:
>> >>     allocate_res1()
>> >>     mngr.register(cleanup_res1)
>> >>     # do stuff
>> >>     allocate_res2()
>> >>     mngr.register(cleanup_res2)
>> >>     # do stuff
>> >>     allocate_res3()
>> >>     mngr.register(cleanup_res3)
>> >>     # do stuff
>> >>
>> >> The mngr object would just run all the registered functions when the block is exited.
> [snip]
>> What would be the best way to handle errors during cleanup? Personally I would log them with logging.exception and discard them, but using the logging module is probably not a good option for contextlib, or is it?
>
> I'd suggest something like the following:
>
> class CleanupManager:
>
>     _default_error_handler = lambda exc_type, exc_value, tb: False
>
>     def __init__(self, error_handler=_default_error_handler):
>         self.error_handler = error_handler
>         self.cleanup_callbacks = []
>
>     def register(self, callback):
>         self.cleanup_callbacks.append(callback)
>
>     def __enter__(self):
>         return self
>
>     def __exit__(self, exc_type, exc_value, tb):
>         try:
>             if exc_value is not None:
>                 # if returns True, exception will be suppressed...
>                 return self.error_handler(exc_type, exc_value, tb)
>         finally:
>             # ...except when something goes wrong in the callbacks
>             self._next_callback()
>
>     def _next_callback(self):
>         if self.cleanup_callbacks:
>             callback = self.cleanup_callbacks.pop()
>             try:
>                 callback()
>             finally:
>                 # all callbacks to be used + all errors to be reported
>                 self._next_callback()
>
[...]
>
> Please also note that all cleanup callbacks will be used and, at the same time, no exception will remain unnoticed -- that within the with-block (handled with the error handler), but also that from the error handler as well as those from all cleanup handlers (as long as we talk about Py3.x, with its cool exception chaining feature):

Wow, that's really neat! I was in Python 2.x mode and would have tried to iterate over the callbacks, not knowing what to do with any exceptions from them.

That said, do you have a suggestion for Python 2.7 as well? Maybe something like a cleanup error handler?

import sys  # needed for sys.exc_info() below

class CleanupManager:

    _default_error_handler = lambda exc_type, exc_value, tb: False
    _default_cleanup_error_handler = lambda exc_type, exc_value, tb: True

    def __init__(self, error_handler=_default_error_handler,
                 cleanup_error_handler=_default_cleanup_error_handler):
        self.error_handler = error_handler
        self.cleanup_error_handler = cleanup_error_handler
        self.cleanup_callbacks = []

    def register(self, callback):
        self.cleanup_callbacks.append(callback)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        try:
            if exc_value is not None:
                # if returns True, exception will be suppressed...
                return self.error_handler(exc_type, exc_value, tb)
        finally:
            # Saves the exception that we are going to raise at the
            # end (if any)
            exc_info = None
            for cb in self.cleanup_callbacks:
                try:
                    cb()
                except:
                    # If returns true, ignore exceptions during cleanup
                    if self.cleanup_error_handler(*sys.exc_info()):
                        pass
                    else:
                        # Only first exception gets propagated
                        if not exc_info:
                            exc_info = sys.exc_info()
            if exc_info:
                raise exc_info[0], exc_info[1], exc_info[2]

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From guido at python.org  Thu Oct 20 04:20:58 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Oct 2011 19:20:58 -0700
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

On Wed, Oct 19, 2011 at 3:36 PM, Nick Coghlan wrote:
> On Thu, Oct 20, 2011 at 7:52 AM, Tarek Ziadé wrote:
>>> So, you're limited to an arbitrarily-long sequence of attribute accesses, followed by an optional call.
>>
>> Thanks for the pointers
>
> In the time since, Guido gave his approval to removing the restriction, but nobody has been interested enough to actually implement the change. He was persuaded the restriction was pointless largely due to people using tricks like the following to avoid it:
>
> def deco(x):
>     return x
>
> @deco(anything[I].like().can.go[here].and_the.compiler.will_not(care))
> def f():
>     pass
>
> (There were also legitimate use cases related to looking up decorators via a subscript rather than a function call).

If this gets changed we won't be able to give a different meaning to e.g.

@(...)
@[...]
@{...}

since those will all have to be accepted as valid forms of the syntax

@<expression>

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Oct 20 05:41:17 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2011 13:41:17 +1000
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

2011/10/20 Guido van Rossum:
> If this gets changed we won't be able to give a different meaning to e.g.
>
> @(...)
> @[...]
> @{...}
>
> since those will all have to be accepted as valid forms of the syntax
>
> @<expression>

True, although the restriction could just be weakened to "must start with an identifier" rather than eliminated entirely. Since tuples, lists, dictionaries and sets aren't callable, that wouldn't be a noticeable restriction in practice.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Thu Oct 20 05:48:45 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Oct 2011 20:48:45 -0700
Subject: [Python-ideas] The way decorators are parsed
In-Reply-To:
References:
Message-ID:

2011/10/19 Nick Coghlan:
> 2011/10/20 Guido van Rossum:
>> If this gets changed we won't be able to give a different meaning to e.g.
>>
>> @(...)
>> @[...]
>> @{...}
>>
>> since those will all have to be accepted as valid forms of the syntax
>>
>> @<expression>
>
> True, although the restriction could just be weakened to "must start with an identifier" rather than eliminated entirely. Since tuples, lists, dictionaries and sets aren't callable, that wouldn't be a noticeable restriction in practice.

But surely someone would manage to come up with a use case for an expression *starting* with one of those, e.g.

  @[f, g, h][i]

or

  @{a: b, c: d}[x]

I don't think it's reasonable to constrain it less than it currently is but more than a general expression.
Though I wouldn't allow commas -- there's no way that

  @f, g
  def pooh(): ...

can make sense. Oh wait, it could be a shorthand for

  @f
  @g
  def pooh(): ...

:-)

--
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Oct 20 06:37:16 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 20 Oct 2011 14:37:16 +1000
Subject: [Python-ideas] An illustrative example for PEP 3150's statement local namespaces
Message-ID:

A task I actually ran into at work this week: get a filtered list of subdirectories, exclude some based on a list of names to be ignored, sort the remainder by their modification times. (This problem was actually also the origin of my recent filter_walk recipe: http://code.activestate.com/recipes/577913-selective-directory-walking/)

Translated to 3.x (i.e. the generator's .next() method is replaced by the next() builtin), the code would look roughly like this:

    # Generate list of candidate directories sorted by modification time
    candidates = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs
    candidates = (subdir for subdir in candidates
                  if not any(d in subdir for d in dirs_to_ignore))
    def get_mtime(path):
        stat_path = os.path.join(base_dir, path)
        return os.stat(stat_path).st_mtime
    candidates = sorted(candidates, key=get_mtime)

Now, that could theoretically be split out to a separate function (passing base_dir, dir_filter and dirs_to_ignore as arguments), but the details are going to vary too much from use case to use case to make reusing it practical. Even factoring out "get_mtime" would be a waste, since you end up with a combinatorial explosion of functions if you try to do things like that (it's the local code base equivalent of "not every 3 line function needs to be in the standard library").

I can (and do) use vertical white space to give some indication that the calculation is complete, but PEP 3150 would allow me to be even more explicit by indenting every step in the calculation except the last one:

    candidate_dirs = sorted(candidate_dirs, key=get_mtime) given:
        candidate_dirs = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs
        candidate_dirs = (subdir for subdir in candidate_dirs
                          if not any(d in subdir for d in dirs_to_ignore))
        def get_mtime(path):
            stat_path = os.path.join(base_dir, path)
            return os.stat(stat_path).st_mtime

Notice how the comment from the original version becomes redundant in the second version? It's just repeating what the actual header line right below it says, so I got rid of it. In the original version it was necessary because there was no indentation in the code to indicate that this was all just different stages of one internal calculation leading up to that final step to create the sorted list.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From sven at marnach.net  Thu Oct 20 16:27:08 2011
From: sven at marnach.net (Sven Marnach)
Date: Thu, 20 Oct 2011 15:27:08 +0100
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID: <20111020142708.GB20970@pantoffel-wg.de>

Mike Graham wrote:
> I do, however, regularly write "(a < b) & (b < c)" and hate it; a little observation reveals it is quite terrible.

It might not be the nicest syntax ever, but I still find this quite readable. Of course 'a < b < c' looks nicer, but it's not that big a deal.

> That being said, I think the fault might be as much numpy's as anything.
> An API like b.isbetween(a, c) or even (a < b).logicaland(b < c) would probably be nicer than the current typical solution.

Just for the record, NumPy already allows the syntax

    logical_and(a < b, b < c)

Cheers,
Sven

From sven at marnach.net  Thu Oct 20 17:11:15 2011
From: sven at marnach.net (Sven Marnach)
Date: Thu, 20 Oct 2011 16:11:15 +0100
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111020001922.GA2191@chopin.edu.pl>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl>
Message-ID: <20111020151115.GC20970@pantoffel-wg.de>

Jan Kaliszewski schrieb am Do, 20. Okt 2011, um 02:19:22 +0200:
> class CleanupManager:
>
>     _default_error_handler = lambda exc_type, exc_value, tb: False
>
>     def __init__(self, error_handler=_default_error_handler):
>         self.error_handler = error_handler
>         self.cleanup_callbacks = []
>
>     def register(self, callback):
>         self.cleanup_callbacks.append(callback)
>
>     def __enter__(self):
>         return self
>
>     def __exit__(self, exc_type, exc_value, tb):
>         try:
>             if exc_value is not None:
>                 # if returns True, exception will be suppressed...
>                 return self.error_handler(exc_type, exc_value, tb)
>         finally:
>             # ...except when something goes wrong in the callbacks
>             self._next_callback()
>
>     def _next_callback(self):
>         if self.cleanup_callbacks:
>             callback = self.cleanup_callbacks.pop()
>             try:
>                 callback()
>             finally:
>                 # all callbacks to be used + all errors to be reported
>                 self._next_callback()

Why introduce an error handler at all? If you want to handle exceptions inside the with statement, an explicit try-except block would be both more convenient (you don't need an extra function definition) and more flexible (you can use - say - 'except ValueError:' to just catch some exceptions). To give an example, I'd really prefer

    with CleanupManager() as cleanup:
        try:
            # do stuff
        except ValueError:
            # handle exception

over

    def handle_value_error(exc_type, exc_value, exc_tb):
        if issubclass(exc_type, ValueError):
            # handle exception
            return True
        return False

    with CleanupManager(handle_value_error) as cleanup:
        # do stuff

Instead of an error handler, I'd rather accept the first (or more) callbacks as constructor parameters -- there wouldn't be any point in starting the with block if you weren't about to add a callback in the very next statement, so we could as well accept it as a parameter to the constructor.

> Please also note that all cleanup callbacks will be used and, at the same time, no exception will remain unnoticed -- that within the with-block (handled with the error handler), but also that from the error handler as well as those from all cleanup handlers (as long as we talk about Py3.x, with its cool exception chaining feature):
> >>> erroneous_error_handler = (lambda exc_tp, exc, tb: ''/3)
> >>> with CleanupManager(erroneous_error_handler) as cm:
> ...     cm.register(lambda: 1/0)    # error (division by 0)
> ...     cm.register(lambda: 44+'')  # error (bad operand type)
> ...     raise ValueError('spam')    # error
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 4, in <module>
> ValueError: spam
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cleanup_manager.py", line 19, in __exit__
>     return self.error_handler(exc_type, exc_value, tb)
>   File "<stdin>", line 1, in <lambda>
> TypeError: unsupported operand type(s) for /: 'str' and 'int'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "cleanup_manager.py", line 28, in _next_callback
>     callback()
>   File "<stdin>", line 3, in <lambda>
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "<stdin>", line 4, in <module>
>   File "cleanup_manager.py", line 22, in __exit__
>     self._next_callback()
>   File "cleanup_manager.py", line 31, in _next_callback
>     self._next_callback()
>   File "cleanup_manager.py", line 28, in _next_callback
>     callback()
>   File "<stdin>", line 2, in <lambda>
> ZeroDivisionError: division by zero

Note that the only exception you could easily catch in this example is the ZeroDivisionError.

    try:
        erroneous_error_handler = (lambda exc_tp, exc, tb: ''/3)
        with CleanupManager(erroneous_error_handler) as cm:
            cm.register(lambda: 1/0)    # error (division by 0)
            cm.register(lambda: 44+'')  # error (bad operand type)
            raise ValueError('spam')    # error
    except ValueError as e:
        print(e)

won't catch the ValueError.

Cheers,
Sven

From sven at marnach.net  Thu Oct 20 17:44:46 2011
From: sven at marnach.net (Sven Marnach)
Date: Thu, 20 Oct 2011 16:44:46 +0100
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111020001922.GA2191@chopin.edu.pl>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl>
Message-ID: <20111020154446.GD20970@pantoffel-wg.de>

Jan Kaliszewski schrieb am Do, 20. Okt 2011, um 02:19:22 +0200:
>     def register(self, callback):
>         self.cleanup_callbacks.append(callback)

We should probably accept additional *args and **kwargs which will be passed on to the callback, similar to atexit.register(). And possibly add an unregister() method in analogy to atexit.unregister() as well?

Cheers,
Sven

From Nikolaus at rath.org  Thu Oct 20 19:41:33 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Thu, 20 Oct 2011 13:41:33 -0400
Subject: [Python-ideas] An illustrative example for PEP 3150's statement local namespaces
In-Reply-To: (Nick Coghlan's message of "Thu, 20 Oct 2011 14:37:16 +1000")
References:
Message-ID: <874nz3a7ci.fsf@inspiron.ap.columbia.edu>

Nick Coghlan writes:
> A task I actually ran into at work this week: get a filtered list of subdirectories, exclude some based on a list of names to be ignored, sort the remainder by their modification times.
[...]

Some unrelated feedback to that PEP (very neat idea, IMO):

1. Shouldn't:

,----
| # Current Python (manual namespace cleanup)
| def _createenviron():
|     ... # 27 line function
|
| environ = _createenviron()
| del _createenviron
|
| # Becomes:
| environ = _createenviron() given:
|     def _createenviron():
|         ... # 27 line function
`----

really be

,----
| # Becomes:
| environ = _environ given:
|     ... # 27 line function that defines _environ
`----

What's the point of defining a function in the given clause if you only execute it once? You may just as well run the function code directly in the given clause.

2. In

,----
| # Current Python (early binding via default argument hack)
| seq = []
| for i in range(10):
|     def f(_i=i):
|         return i
|     seq.append(f)
| assert [f() for f in seq] == list(range(10))
`----

You probably meant "return _i"?

3. In my opinion the explicit early binding syntax is very ugly. Also, doesn't it restrict early binding to just one variable?

4. I think having given blocks default to early binding and other nested scopes not is very counterintuitive (but I don't have any idea of how to best resolve this).

Best,
-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From ron3200 at gmail.com  Fri Oct 21 08:00:34 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Fri, 21 Oct 2011 01:00:34 -0500
Subject: [Python-ideas] An illustrative example for PEP 3150's statement local namespaces
In-Reply-To:
References:
Message-ID: <1319176834.13586.36.camel@Gutsy>

On Thu, 2011-10-20 at 14:37 +1000, Nick Coghlan wrote:
>     candidate_dirs = sorted(candidate_dirs, key=get_mtime) given:
>         candidate_dirs = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs
>         candidate_dirs = (subdir for subdir in candidate_dirs
>                           if not any(d in subdir for d in dirs_to_ignore))
>         def get_mtime(path):
>             stat_path = os.path.join(base_dir, path)
>             return os.stat(stat_path).st_mtime
>
> Notice how the comment from the original version becomes redundant in the second version? It's just repeating what the actual header line right below it says, so I got rid of it. In the original version it was necessary because there was no indentation in the code to indicate that this was all just different stages of one internal calculation leading up to that final step to create the sorted list.

I think you are losing me; I just don't see some of the things you've mentioned before in this, and I don't see any advantage of the second version over the first.

It's fairly common to build up a result through a chain of steps, each of which modifies the result of the previous one, with a final result at the end. And it's not too uncommon to reuse the same name as you go. So, if the given statement set the name of the ongoing calculation, and the suite following it were a series of expressions with no left-hand-side name, your routine would look as follows:

    # Generate list of candidate directories
    # sorted by modification time.

    def get_mtime(path):
        stat_path = os.path.join(base_dir, path)
        return os.stat(stat_path).st_mtime

    candidate_dirs given:
        = next(filter_walk(base_dir, dir_pattern=dir_filter, depth=0)).subdirs
        = (subdir for subdir in candidate_dirs
           if not any(d in subdir for d in dirs_to_ignore))
        = sorted(candidate_dirs, key=get_mtime)

Each expression would give candidate_dirs a new value and avoid duplicating the result name on each line. It also might be possible to generate optimized byte code, as the result could stay on the stack until the block is exited.

I could also see this used in a command window...

    >>> given result:
    ...     = 10
    ...     += 20
    ...     -= 10
    ...
    >>> result
    20

This does put the item of interest up front like you want, and it also reduces repetition in calculations, but it's quite a different concept overall.
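For comparison, the interactive example above is just a terser spelling of today's augmented assignments (spelling the intended semantics out explicitly):

# Plain Python equivalent of the 'given result:' session above;
# each step rebinds the same name, which is exactly the repetition
# the suite form would remove.
result = 10
result += 20
result -= 10
assert result == 20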
Cheers,
Ron

From merwok at netwok.org  Fri Oct 21 16:45:44 2011
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 21 Oct 2011 16:45:44 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
Message-ID: <4EA18598.9060602@netwok.org>

Hi everyone,

I've sometimes wished that str(someclass) were the natural way to get a class name, for example when writing __repr__ methods. Guido expressed the same wish recently, so I've made a patch for Python 3.3. In the whole standard library, only one module (optparse) and three test files (which were parsing the output of str(someclass) or using doctests) had to be updated because of this change: http://bugs.python.org/issue13224

One can say that this change would break code and should thus be rejected; one could argue that the previous behavior of str(someclass) was undefined, or an implementation detail.

What say you?

Cheers

From steve at pearwood.info  Fri Oct 21 19:45:20 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 22 Oct 2011 04:45:20 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA18598.9060602@netwok.org>
References: <4EA18598.9060602@netwok.org>
Message-ID: <4EA1AFB0.4080000@pearwood.info>

Éric Araujo wrote:
> Hi everyone,
>
> I've sometimes wished that str(someclass) were the natural way to get a class name, for example when writing __repr__ methods.

someclass.__name__ seems much more natural to me.

If you want the name of the class, ask for its name, not its string representation. Generally speaking, the string representation should tell you what sort of object it is (either explicitly or implicitly), not just its value. The same behaviour applies to functions and modules:

>>> import math
>>> str(math)
"<module 'math' (built-in)>"
>>> def spam():
...     pass
...
>>> str(spam)
'<function spam at 0x...>'

rather than "math" and "spam" respectively. I would not like str(module) or str(function) to imitate string objects, and likewise for classes. They should continue to identify themselves as classes or types:

>>> class K:
...     pass
...
>>> str(K)
"<class '__main__.K'>"

> One can say that this change would break code and should thus be rejected; one could argue that the previous behavior of str(someclass) was undefined, or an implementation detail.
>
> What say you?

-1

--
Steven

From ghostwriter402 at gmail.com  Fri Oct 21 22:57:55 2011
From: ghostwriter402 at gmail.com (Spectral One)
Date: Fri, 21 Oct 2011 15:57:55 -0500
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA1AFB0.4080000@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID: <4EA1DCD3.2010807@gmail.com>

> If you want the name of the class, ask for its name, not its string representation. Generally speaking, the string representation should tell you what sort of object it is (either explicitly or implicitly), not just its value.

I agree with this rather generally. I'd also like access to the class name, but I don't really like that implementation yet.

A thought: Could this be reframed as a tweak to the reporting of type objects, specifically? If the __repr__ function of a type object returned the desired class information, it seems like it would solve this issue. Currently,

    print str(type(variable))   # returns a string in the form of
                                # "'module reference and class name' type.__name__ at address"
    print repr(type(variable))  # returns the same string.
    print type(variable).__name__  # returns "instance" for any user variable instances.
However, if we altered repr to return the qualified name, we'd be pretty much set, as we already have the __name__ variable in type objects* and the __str__ can remain untouched. That would make the information rather easily accessed and without special parsing.

(*I'd prefer a short statement to extract that name, too. I could work with "print class(variable)" if the parser could. Other options off the top o' my noggin: "type.name(variable)" or "variable.__name__" where the .__name__ is automatically supplied. (allow overwriting in the class definition?))

-Nate

From zuo at chopin.edu.pl  Sat Oct 22 01:54:41 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Sat, 22 Oct 2011 01:54:41 +0200
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111020151115.GC20970@pantoffel-wg.de>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl> <20111020151115.GC20970@pantoffel-wg.de>
Message-ID: <20111021235441.GA7138@chopin.edu.pl>

Sven Marnach dixit (2011-10-20, 16:11):

> Why introduce an error handler at all? If you want to handle exceptions inside the with statement, an explicit try-except block would be both more convenient (you don't need an extra function definition) and more flexible (you can use - say - 'except ValueError:' to just catch some exceptions). To give an example, I'd really prefer
>
>     with CleanupManager() as cleanup:
>         try:
>             # do stuff
>         except ValueError:
>             # handle exception

You are right: that error handler is redundant. That said, my code mimics the nested structure of try-finally clauses from the OP's original example.

Cheers.
*j

From zuo at chopin.edu.pl  Sat Oct 22 02:22:58 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Sat, 22 Oct 2011 02:22:58 +0200
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <87mxcw1k2p.fsf@vostro.rath.org>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl> <87mxcw1k2p.fsf@vostro.rath.org>
Message-ID: <20111022002258.GB7138@chopin.edu.pl>

Nikolaus Rath dixit (2011-10-19, 22:19):

> That said, do you have a suggestion for Python 2.7 as well? Maybe something like a cleanup error handler?

As Sven noted, the error handler I proposed is redundant, and these would be redundant too, IMHO. The appropriate place to catch cleanup errors is the cleanup callback itself. And -- following another of Sven's ideas -- callbacks could be registered together with arguments, e.g.:

def closing_callback(stream):
    try:
        stream.close()
    except Exception:
        log.critical('Hm, something wrong...', exc_info=True)

with CleanupManager() as cm:
    xfile = open(x)
    cm.register(closing_callback, xfile)
    ...

An improved (and at the same time simplified) implementation (being also a recipe for Python 2.x, though this list is about ideas for Py3.x):

class CleanupManager(object):

    def __init__(self, initial_callbacks=()):
        self.cleanup_callbacks = list(initial_callbacks)

    def register(self, callback, *args, **kwargs):
        self.cleanup_callbacks.append((callback, args, kwargs))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._next_callback()

    def _next_callback(self):
        if self.cleanup_callbacks:
            callback, args, kwargs = self.cleanup_callbacks.pop()
            try:
                callback(*args, **kwargs)
            finally:
                # all cleanup callbacks to be used
                # Py3.x: all errors to be reported
                self._next_callback()

I hope it implements well what you explained... I'm not sure if it is worth adding to the standard library (in the case of your primary example I'd rather prefer that try-finally nested structure) -- though in some cases it may become really useful:

with CleanupManager() as cm:
    ...
    cm.register(foo)
    ...
    if cond:
        cm.register(bar)
    else:
        cm.register(spam)
    for x in y:
        cm.register(baz, x)
    ...

Cheers.
*j

From ncoghlan at gmail.com  Sat Oct 22 02:39:33 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 22 Oct 2011 10:39:33 +1000
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA1AFB0.4080000@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID:

On Sat, Oct 22, 2011 at 3:45 AM, Steven D'Aprano wrote:
> Éric Araujo wrote:
>>
>> Hi everyone,
>>
>> I've sometimes wished that str(someclass) were the natural way to get a class name, for example when writing __repr__ methods.
>
> someclass.__name__ seems much more natural to me.
>
> If you want the name of the class, ask for its name, not its string representation. Generally speaking, the string representation should tell you what sort of object it is (either explicitly or implicitly), not just its value.

While that's an accurate description of the purpose of a "string representation", the function that serves that purpose is repr(), not str(). Éric's patch doesn't touch type.__repr__, only type.__str__.

str() is different - it's the one which is designed for general human consumption, and may omit some details if it makes sense.

The use case that led to the patch was actually string interpolation rather than direct invocation of str(). There's a lot of object representation code and error message formatting code that currently uses "obj.__class__.__name__" or "type(obj).__name__" to plug into a "{}" or "%s" placeholder in a format string.

So I'd actually go the other way and suggest the following change to only return a subset of the information in the full representations for all 3 types:

    str(module) ==> module.__name__
    str(func)   ==> func.__name__
    str(cls)    ==> "{}.{}".format(cls.__module__, cls.__name__)  # See note below

Note: nested classes will give misleading information, but they already do that in their repr() implementations:

>>> def f():
...     class C: pass
...     return C
...
>>> x = f()
>>> x
<class '__main__.C'>

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From zuo at chopin.edu.pl  Sat Oct 22 03:32:02 2011
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Sat, 22 Oct 2011 03:32:02 +0200
Subject: [Python-ideas] raise EXC from None (issue #6210)
Message-ID: <20111022013202.GA8878@chopin.edu.pl>

Hello.
Some time ago I encountered the problem described in PEP 3134 as "Open Issue: Suppressing Context" ("this PEP makes it impossible to suppress '__context__', since setting exc.__context__ to None in an 'except' or 'finally' clause will only result in it being set again when exc is raised.").

An idea that then appeared in my brain was:

    raise SomeException(some, arguments) from None

...and I see the same idea has been proposed by Patrick Westerhoff here: http://bugs.python.org/issue6210

I am +10, as I feel that's intuitive, elegant and status-quo-consistent.

And what do you think?

Cheers.
*j

From tjreedy at udel.edu  Sat Oct 22 06:25:27 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Oct 2011 00:25:27 -0400
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID:

On 10/21/2011 8:39 PM, Nick Coghlan wrote:

>     str(module) ==> module.__name__
>     str(func)   ==> func.__name__
>     str(cls)    ==> "{}.{}".format(cls.__module__, cls.__name__)  # See note below

If you do this, then also do

    str(generator) ==> generator.__name__

--
Terry Jan Reedy

From guido at python.org  Sat Oct 22 06:35:41 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Oct 2011 21:35:41 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID:

On Fri, Oct 21, 2011 at 9:25 PM, Terry Reedy wrote:
> On 10/21/2011 8:39 PM, Nick Coghlan wrote:
>
>>     str(module) ==> module.__name__
>>     str(func)   ==> func.__name__
>>     str(cls)    ==> "{}.{}".format(cls.__module__, cls.__name__)  # See note below
>
> If you do this, then also do
> str(generator) ==> generator.__name__

Why? Assuming by "generator" you mean the iterator returned by calling a generator function (not the generator function itself, which is covered by str(func) ==> func.__name__), the generator has no "name" which can be used to retrieve it. The three above (module, function, class) all -- typically -- are referenced by variables whose name is forced by the syntax (import foo, def foo, class foo). But that does not apply to a generator.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info  Sat Oct 22 08:46:25 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 22 Oct 2011 17:46:25 +1100
Subject: [Python-ideas] raise EXC from None (issue #6210)
In-Reply-To: <20111022013202.GA8878@chopin.edu.pl>
References: <20111022013202.GA8878@chopin.edu.pl>
Message-ID: <4EA266C1.3090201@pearwood.info>

Jan Kaliszewski wrote:
> Hello.
>
> Some time ago I encountered the problem described in PEP 3134 as "Open Issue: Suppressing Context" ("this PEP makes it impossible to suppress '__context__', since setting exc.__context__ to None in an 'except' or 'finally' clause will only result in it being set again when exc is raised.").
>
> An idea that then appeared in my brain was:
>
>     raise SomeException(some, arguments) from None
>
> ...and I see the same idea has been proposed by Patrick Westerhoff here: http://bugs.python.org/issue6210

I think that stating the syntax is the easy part; actually implementing it may not be so simple.

> I am +10, as I feel that's intuitive, elegant and status-quo-consistent.
>
> And what do you think?

I think that it's well past time to fix this wart with the new nested exceptions functionality.

+1

(Actually I'm also +10 but I don't want to start vote inflation.)
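To be concrete about what I'm voting for, the idiom would be spelled like this (the proposed syntax from the issue, so of course it doesn't run today, and the names are made up):

try:
    config['port']           # hypothetical lookup that raises KeyError
except KeyError:
    # 'from None' would suppress the implicit __context__, so the
    # traceback would show only the ValueError below.
    raise ValueError('no port configured') from None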
The wart I'm referring to is that the common idiom of catching one exception and raising another is now treated as a bug in the except clause even when it isn't:

    >>> def mean(data):
    ...     try:
    ...         return sum(data)/len(data)
    ...     except ZeroDivisionError:
    ...         raise ValueError('data must be non-empty')
    ...
    >>> mean([])
    Traceback (most recent call last):
      File "<stdin>", line 3, in mean
    ZeroDivisionError: int division or modulo by zero

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 5, in mean
    ValueError: data must be non-empty

In this case, there is absolutely no reason to expose the fact that ZeroDivisionError occurred. That's an implementation detail which is irrelevant to the caller. But exposing it complicates the traceback for no benefit (since it isn't a bug that needs fixing) and possibly some harm (by causing some confusion to the reader).

It also reflects badly on your code: it gives the impression of an unhandled bug when it is not.

In my opinion, although the error context functionality is a big positive when debugging actual bugs in except clauses, the feature should never have been added without a way to suppress the error context.

-- 
Steven

From steve at pearwood.info  Sat Oct 22 09:32:25 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 22 Oct 2011 18:32:25 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID: <4EA27189.8010002@pearwood.info>

Nick Coghlan wrote:
> On Sat, Oct 22, 2011 at 3:45 AM, Steven D'Aprano wrote:
>> Éric Araujo wrote:
>>> Hi everyone,
>>>
>>> I've sometimes wished that str(someclass) were the natural way to get a class name, for example when writing __repr__ methods.
>> someclass.__name__ seems much more natural to me.
>>
>> If you want the name of the class, ask for its name, not its string representation. Generally speaking, the string representation should tell you what sort of object it is (either explicitly or implicitly), not just its value.
>
> While that's an accurate description of the purpose of a "string representation", the function that serves that purpose is repr(), not str(). Éric's patch doesn't touch type.__repr__, only type.__str__.
>
> str() is different - it's the one which is designed for general human consumption, and may omit some details if it makes sense.

Yes, I understand the difference between repr and str and the guidelines for them both. My argument is that for human consumption, str(MyClass) should continue to return something like "<class 'MyClass'>" and not just "MyClass".

> The use case that led to the patch was actually string interpolation rather than direct invocation of str(). There's a lot of object representation code and error message formatting code that currently uses "obj.__class__.__name__" or "type(obj).__name__" to plug into a "{}" or "%s" placeholder in a format string.

Regardless of where or how you are using the name, if you explicitly want the name, you should ask for it explicitly rather than implicitly.

Remember also that print(spam) will use str(spam). If you do this:

    >>> print(spam)
    something

what would you expect spam to be? My bet is that you expect it to be the string "something".
With your suggestion, it could be either the string "something", a function named something, a module named something, or a class named "something" (if one uses __name__ rather than module.class name). I don't consider this helpful.

I am aware that the string representation (using either __str__ or __repr__) of an object is not the definitive word in what the object really is. Using just print, one can't distinguish between a class and a string "<class 'MyClass'>", or for that matter between the string 2 and the int 2, and that arbitrary objects can return arbitrary strings. Nevertheless, I like the __str__ of classes, modules and functions just the way they are.

-- 
Steven

From ncoghlan at gmail.com  Sat Oct 22 17:26:04 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2011 01:26:04 +1000
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA27189.8010002@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info>
Message-ID:

I don't think this line of argument is valid. Printing 1 and "1" produces the same output; that's why repr() exists.

If someone wants debugging level detail, use repr(), just like the interactive interpreter does.

-- 
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)

On Oct 22, 2011 5:33 PM, "Steven D'Aprano" wrote:
> Nick Coghlan wrote:
>
>> On Sat, Oct 22, 2011 at 3:45 AM, Steven D'Aprano wrote:
>>
>>> Éric Araujo wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I've sometimes wished that str(someclass) were the natural way to get a class name, for example when writing __repr__ methods.
>>>>
>>> someclass.__name__ seems much more natural to me.
>>>
>>> If you want the name of the class, ask for its name, not its string representation. Generally speaking, the string representation should tell you what sort of object it is (either explicitly or implicitly), not just its value.
>>>
>>
>> While that's an accurate description of the purpose of a "string representation", the function that serves that purpose is repr(), not str(). Éric's patch doesn't touch type.__repr__, only type.__str__.
>>
>> str() is different - it's the one which is designed for general human consumption, and may omit some details if it makes sense.
>>
>
> Yes, I understand the difference between repr and str and the guidelines for them both. My argument is that for human consumption, str(MyClass) should continue to return something like "<class 'MyClass'>" and not just "MyClass".
>
>> The use case that led to the patch was actually string interpolation rather than direct invocation of str(). There's a lot of object representation code and error message formatting code that currently uses "obj.__class__.__name__" or "type(obj).__name__" to plug into a "{}" or "%s" placeholder in a format string.
>>
>
> Regardless of where or how you are using the name, if you explicitly want the name, you should ask for it explicitly rather than implicitly.
>
> Remember also that print(spam) will use str(spam). If you do this:
>
> >>> print(spam)
> something
>
> what would you expect spam to be? My bet is that you expect it to be the string "something".
> I am aware that the string representation (using either __str__ or __repr__) of an object is not the definitive word in what the object really is. Using just print, one can't distinguish between a class and a string "<class 'MyClass'>", or for that matter between the string 2 and the int 2, and that arbitrary objects can return arbitrary strings. Nevertheless, I like the __str__ of classes, modules and functions just the way they are.
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Sat Oct 22 17:20:31 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2011 01:20:31 +1000
Subject: [Python-ideas] raise EXC from None (issue #6210)
In-Reply-To: <4EA266C1.3090201@pearwood.info>
References: <20111022013202.GA8878@chopin.edu.pl> <4EA266C1.3090201@pearwood.info>
Message-ID:

The class method approach I describe in the linked issue is more likely to be accepted - this problem shouldn't need new syntax to resolve.

-- 
Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)

On Oct 22, 2011 4:47 PM, "Steven D'Aprano" wrote:
> Jan Kaliszewski wrote:
>> Hello.
>>
>> Some time ago I encountered the problem described in PEP 3134 as "Open Issue: Suppressing Context" ("this PEP makes it impossible to suppress '__context__', since setting exc.__context__ to None in an 'except' or 'finally' clause will only result in it being set again when exc is raised.").
>>
>> An idea that then appeared in my brain was:
>>
>>     raise SomeException(some, arguments) from None
>>
>> ...and I see the same idea has been proposed by Patrick Westerhoff here:
>> http://bugs.python.org/issue6210
>
> I think that stating the syntax is the easy part; actually implementing it may not be so simple.
>
>> I am +10, as I feel that's intuitive, elegant and status-quo-consistent.
>>
>> And what do you think?
>
> I think that it's well past time to fix this wart with the new nested exceptions functionality.
>
> +1
>
> (Actually I'm also +10 but I don't want to start vote inflation.)
>
> The wart I'm referring to is that the common idiom of catching one exception and raising another is now treated as a bug in the except clause even when it isn't:
>
> >>> def mean(data):
> ...     try:
> ...         return sum(data)/len(data)
> ...     except ZeroDivisionError:
> ...         raise ValueError('data must be non-empty')
> ...
> >>> mean([])
> Traceback (most recent call last):
>   File "<stdin>", line 3, in mean
> ZeroDivisionError: int division or modulo by zero
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<stdin>", line 5, in mean
> ValueError: data must be non-empty
>
> In this case, there is absolutely no reason to expose the fact that ZeroDivisionError occurred. That's an implementation detail which is irrelevant to the caller. But exposing it complicates the traceback for no benefit (since it isn't a bug that needs fixing) and possibly some harm (by causing some confusion to the reader).
>
> It also reflects badly on your code: it gives the impression of an unhandled bug when it is not.
> In my opinion, although the error context functionality is a big positive when debugging actual bugs in except clauses, the feature should never have been added without a way to suppress the error context.
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Sat Oct 22 22:18:15 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 23 Oct 2011 07:18:15 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info>
Message-ID: <4EA32507.7010900@pearwood.info>

Nick Coghlan wrote:
> If someone wants debugging level detail, use repr(), just like the interactive interpreter does.

I'm just going to repeat what I've said before: explicit is better than implicit. If you want the name of an object (be it a class, a module, a function, or something else), you should explicitly ask for the name, and not rely on its str().

The details returned by str() are, in some sense, arbitrary. The docs describe it as [quote] the "informal" string representation of an object [end quote].

http://docs.python.org/reference/datamodel.html#object.__str__

On that basis, objects are free to return as much, or as little, information as makes sense in their str(). (As you pointed out earlier.)

However, the docs also say that str() should return [quote] a string containing a nicely printable representation of an object [end quote].

http://docs.python.org/library/functions.html#str

To my mind, the name alone of a class (or function or module) is in no sense a nicely printable representation of the object. I would argue strongly that the property of being "nicely representable" outweighs by far the convenience of avoiding 9 extra characters in one specific use-case:

    "blah blah blah class '%s'" % cls  # instead of cls.__name__

But for the sake of the argument, I'll grant you that we're free to change str(cls) to return the class name, as requested by the OP, or the fully qualified module.class dotted name as suggested by you. So let's suppose that, after a long and bitter debate over which colour to paint this bikeshed, you win the debate.

But this doesn't help you at all, because you can't rely on it. It seems to me that the exact format of str(cls) is an implementation detail. You can't rely on other Pythons to do the same thing, nor can you expect a guarantee that str(cls) won't change again in the future. So if you care about the exact string that gets generated, you still have to explicitly use cls.__name__ just as you do now.

The __name__ attribute is part of the guaranteed API of class objects (and also functions and modules); the output of str(cls) is not. In my opinion relying on it to return a particular output is dangerous, regardless of whether the output is "<class 'module.MyClass'>", "module.MyClass", "MyClass" or something else.

Having str(cls) return just the class name (or the module.class dotted name) is an attractive nuisance that should be resisted.
-- 
Steven

From guido at python.org  Sat Oct 22 22:32:59 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2011 13:32:59 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA32507.7010900@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID:

On Sat, Oct 22, 2011 at 1:18 PM, Steven D'Aprano wrote:
> Nick Coghlan wrote:
>
>> If someone wants debugging level detail, use repr(), just like the interactive interpreter does.
>
> I'm just going to repeat what I've said before: explicit is better than implicit. If you want the name of an object (be it a class, a module, a function, or something else), you should explicitly ask for the name, and not rely on its str().
>
> The details returned by str() are, in some sense, arbitrary. The docs describe it as [quote] the "informal" string representation of an object [end quote].
>
> http://docs.python.org/reference/datamodel.html#object.__str__
>
> On that basis, objects are free to return as much, or as little, information as makes sense in their str(). (As you pointed out earlier.)
>
> However, the docs also say that str() should return [quote] a string containing a nicely printable representation of an object [end quote].
>
> http://docs.python.org/library/functions.html#str
>
> To my mind, the name alone of a class (or function or module) is in no sense a nicely printable representation of the object. I would argue strongly that the property of being "nicely representable" outweighs by far the convenience of avoiding 9 extra characters in one specific use-case:
>
> "blah blah blah class '%s'" % cls  # instead of cls.__name__
>
> But for the sake of the argument, I'll grant you that we're free to change str(cls) to return the class name, as requested by the OP, or the fully qualified module.class dotted name as suggested by you. So let's suppose that, after a long and bitter debate over which colour to paint this bikeshed, you win the debate.
>
> But this doesn't help you at all, because you can't rely on it. It seems to me that the exact format of str(cls) is an implementation detail. You can't rely on other Pythons to do the same thing, nor can you expect a guarantee that str(cls) won't change again in the future. So if you care about the exact string that gets generated, you still have to explicitly use cls.__name__ just as you do now.
>
> The __name__ attribute is part of the guaranteed API of class objects (and also functions and modules); the output of str(cls) is not. In my opinion relying on it to return a particular output is dangerous, regardless of whether the output is "<class 'module.MyClass'>", "module.MyClass", "MyClass" or something else.
>
> Having str(cls) return just the class name (or the module.class dotted name) is an attractive nuisance that should be resisted.

Thinking of str(x) as an API to get a certain value would lead there, yes. But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name. Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError', and print(type(42)) printed 'int'.
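(A minimal sketch of the proposed behaviour, emulated with today's metaclass machinery -- illustrative only, since the actual patch changes type.__str__ itself, and the metaclass name here is made up:)

    class NameOnlyMeta(type):
        def __str__(cls):
            # Proposed: str() of a class gives just the bare name.
            return cls.__name__

    class Spam(metaclass=NameOnlyMeta):
        pass

    print(Spam)        # -> Spam
    print(repr(Spam))  # -> <class '__main__.Spam'>, repr() unchanged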
-- 
--Guido van Rossum (python.org/~guido)

From g.brandl at gmx.net  Sat Oct 22 23:27:39 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 22 Oct 2011 23:27:39 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID:

On 10/22/11 22:32, Guido van Rossum wrote:
> Thinking of str(x) as an API to get a certain value would lead there, yes. But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name. Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError', and print(type(42)) printed 'int'.

+1.

Georg

From solipsis at pitrou.net  Sat Oct 22 23:33:49 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 22 Oct 2011 23:33:49 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info>
Message-ID: <20111022233349.586e9048@pitrou.net>

On Sat, 22 Oct 2011 18:32:25 +1100 Steven D'Aprano wrote:
> Remember also that print(spam) will use str(spam). If you do this:
>
> >>> print(spam)
> something
>
> what would you expect spam to be?

What if:

>>> print(spam)
1

It might be the integer 1 or the string "1". You need repr() to tell the difference.

Regards

Antoine.

From arnodel at gmail.com  Sat Oct 22 23:43:31 2011
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Sat, 22 Oct 2011 22:43:31 +0100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <20111022233349.586e9048@pitrou.net>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <20111022233349.586e9048@pitrou.net>
Message-ID:

On 22 October 2011 22:33, Antoine Pitrou wrote:
> On Sat, 22 Oct 2011 18:32:25 +1100 Steven D'Aprano wrote:
>> Remember also that print(spam) will use str(spam). If you do this:
>>
>> >>> print(spam)
>> something
>>
>> what would you expect spam to be?
>
> What if:
>
> >>> print(spam)
> 1
>
> It might be the integer 1 or the string "1". You need repr() to tell the difference.

Indeed:

>>> print(A)
<class '__main__.A'>
>>> A
"<class '__main__.A'>"

-- 
Arnaud

From ben+python at benfinney.id.au  Sat Oct 22 23:44:24 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sun, 23 Oct 2011 08:44:24 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID: <8739ekpupz.fsf@benfinney.id.au>

Guido van Rossum writes:

> On Sat, Oct 22, 2011 at 1:18 PM, Steven D'Aprano wrote:
> > I'm just going to repeat what I've said before: explicit is better than implicit. If you want the name of an object (be it a class, a module, a function, or something else), you should explicitly ask for the name, and not rely on its str().

+1. Many objects, specifically including exceptions, have something more useful than the name to return from 'str(x)'.
> > However, the docs also say that str() should return [quote] a string containing a nicely printable representation of an object [end quote].

Yes. The name of an object (if it has one) is often just one part of the representation of an object.

> > Having str(cls) return just the class name (or the module.class dotted name) is an attractive nuisance that should be resisted.

Agreed.

> When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name.

-1. That makes the string representation of an exception much less useful.

Exceptions don't have names; each exception *type* has a name, but that doesn't distinguish instances of the type from one another. When there is an 'IOError' it's far more useful that the string representation contains the exception *message*, since the name of the exception type doesn't tell the user much about what went wrong.

> Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError'

Those examples are objects which are effectively identical in each instance. (In the case of None, there is only one instance). Those are the unusual case; the common case is that objects of a given type are different from each other, and that frequently means their printable representation should be different too.

If the instances have a name (e.g. function objects), I agree the name should be *included in* the string representation. But there's usually other useful information that should also be included.

-- 
 \       "Ignorance more frequently begets confidence than does |
  `\     knowledge." --Charles Darwin, _The Descent of Man_, 1871 |
_o__)                                                            |
Ben Finney

From tjreedy at udel.edu  Sun Oct 23 00:49:02 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 22 Oct 2011 18:49:02 -0400
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID:

On 10/22/2011 12:35 AM, Guido van Rossum wrote:
> On Fri, Oct 21, 2011 at 9:25 PM, Terry Reedy wrote:
>> On 10/21/2011 8:39 PM, Nick Coghlan wrote:
>>
>>> str(module) ==> module.__name__
>>> str(func) ==> func.__name__
>>> str(cls) ==> "{}.{}".format(cls.__module__, cls.__name__)  # See note below
>>
>> If you do this, then also do
>> str(generator) ==> generator.__name__
>
> Why?

For printing messages (which I currently do with failing generators), as you explained in another post:

"But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name."

> Assuming by "generator" you mean the iterator returned by calling a generator function (not the generator function itself, which is covered by str(func) ==> func.__name__), the generator has no "name" which can be used to retrieve it. The three above (module, function, class) all -- typically -- are referenced by variables whose name is forced by the syntax (import foo, def foo, class foo). But that does not apply to a generator.

Not relevant unless you want to do something like globals()[str(obj)] or getattr(namespace, str(obj)).
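(As a point of reference, a generator object's name is reachable today through its code object -- a minimal sketch, relying on CPython's gi_code attribute:)

    def countdown(n):
        while n:
            yield n
            n -= 1

    g = countdown(3)
    print(g.gi_code.co_name)  # prints 'countdown'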
The .__name__ attribute of generators is just as much forced by syntax as for the others, as it is copied from the parent generator function.

I am currently +0, as I do not mind occasionally typing .__name__, although I believe it *is* the only exposed __xxx__ name (outside of class statements) in my current code.

-- 
Terry Jan Reedy

From ron3200 at gmail.com  Sun Oct 23 01:11:11 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 22 Oct 2011 18:11:11 -0500
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID: <1319325071.7356.15.camel@Gutsy>

On Sat, 2011-10-22 at 13:32 -0700, Guido van Rossum wrote:
> Thinking of str(x) as an API to get a certain value would lead there, yes. But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name. Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError', and print(type(42)) printed 'int'.

I like the part where you say...

"But thinking of str(x) as what gets printed by print(x)"

Which means (to me) that it should be whatever makes the most sense for that particular type or object. For some things, it makes sense to return __name__, while for other things, it makes sense to return something else.

There isn't a problem unless we are trying to apply a rule to *everything*.

Also it should be pointed out that not all objects have a __name__ attribute.

Cheers,
Ron

From ncoghlan at gmail.com  Sun Oct 23 03:56:55 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 23 Oct 2011 11:56:55 +1000
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <8739ekpupz.fsf@benfinney.id.au>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au>
Message-ID:

On Sun, Oct 23, 2011 at 7:44 AM, Ben Finney wrote:
> Those examples are objects which are effectively identical in each instance. (In the case of None, there is only one instance). Those are the unusual case; the common case is that objects of a given type are different from each other, and that frequently means their printable representation should be different too.
>
> If the instances have a name (e.g. function objects), I agree the name should be *included in* the string representation. But there's usually other useful information that should also be included.

Until we fix the qualified name problem for nested functions and classes, there isn't really other information of interest to be included (and I'm going back on some of what I said earlier here). While we *could* include __module__ (especially since class reprs already include it), it's sometimes actively misleading to do so. By only including __name__, we deliberately underqualify *everything*, so "str(cls)" and "str(func)" just refer to the name they were defined with and omit any additional context.

The current convention is that classes, functions and modules don't offer a shorthand "pretty" display format at all.
The proposal is to specifically bless "x.__name__" as an official shorthand. Sure, sometimes you won't *want* the shorthand, so you'll need to either explicitly ask for repr(), or otherwise construct your own description from individual attributes. That's true for a lot of other types as well (especially when it comes to numbers).

I think the most valid objection raised so far is the fact that some metaclasses *already* override __str__ to display something other than the result of type.__repr__(cls). That does have the potential to cause problems for code dealing with utterly unknown types. However, such code should likely be explicitly invoking repr() rather than str() anyway, so this change is unlikely to break anything that isn't already broken.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Sun Oct 23 04:28:38 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2011 19:28:38 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <8739ekpupz.fsf@benfinney.id.au>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au>
Message-ID:

On Sat, Oct 22, 2011 at 2:44 PM, Ben Finney wrote:
> Guido van Rossum writes:
>
>> On Sat, Oct 22, 2011 at 1:18 PM, Steven D'Aprano wrote:
>
>> > I'm just going to repeat what I've said before: explicit is better than implicit. If you want the name of an object (be it a class, a module, a function, or something else), you should explicitly ask for the name, and not rely on its str().
>
> +1. Many objects, specifically including exceptions, have something more useful than the name to return from 'str(x)'.
>
>> > However, the docs also say that str() should return [quote] a string containing a nicely printable representation of an object [end quote].
>
> Yes. The name of an object (if it has one) is often just one part of the representation of an object.
>
>> > Having str(cls) return just the class name (or the module.class dotted name) is an attractive nuisance that should be resisted.
>
> Agreed.
>
>> When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name.
>
> -1. That makes the string representation of an exception much less useful.
>
> Exceptions don't have names; each exception *type* has a name, but that doesn't distinguish instances of the type from one another. When there is an 'IOError' it's far more useful that the string representation contains the exception *message*, since the name of the exception type doesn't tell the user much about what went wrong.

You misunderstood me. I'm not proposing to change the str() of an exception *instance*. That will continue to be the message (while its repr() is a <...> style thing). I'm only proposing to change the str() of an exception *class* -- or any other class, for that matter.

>> Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError'
>
> Those examples are objects which are effectively identical in each instance. (In the case of None, there is only one instance).
> Those are the unusual case; the common case is that objects of a given type are different from each other, and that frequently means their printable representation should be different too.

Right, and the built-in types and exceptions are in the same category.

> If the instances have a name (e.g. function objects), I agree the name should be *included in* the string representation. But there's usually other useful information that should also be included.

For that you can use repr().

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sun Oct 23 04:33:44 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2011 19:33:44 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <1319325071.7356.15.camel@Gutsy>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <1319325071.7356.15.camel@Gutsy>
Message-ID:

On Sat, Oct 22, 2011 at 4:11 PM, Ron Adam wrote:
> On Sat, 2011-10-22 at 13:32 -0700, Guido van Rossum wrote:
>> Thinking of str(x) as an API to get a certain value would lead there, yes. But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name. Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError', and print(type(42)) printed 'int'.
>
> I like the part where you say...
>
> "But thinking of str(x) as what gets printed by print(x)"
>
> Which means (to me) that it should be whatever makes the most sense for that particular type or object. For some things, it makes sense to return __name__, while for other things, it makes sense to return something else.
>
> There isn't a problem unless we are trying to apply a rule to *everything*.
>
> Also it should be pointed out that not all objects have a __name__ attribute.

Correct, and that's why the proposal is strictly limited to changing str() for classes, functions and modules. (And I have to agree with Nick that it's best to just return __name__ in all three cases, rather than trying to be clever and use the qualified name.)

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sun Oct 23 04:40:55 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Oct 2011 19:40:55 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info>
Message-ID:

On Sat, Oct 22, 2011 at 3:49 PM, Terry Reedy wrote:
> On 10/22/2011 12:35 AM, Guido van Rossum wrote:
>>
>> On Fri, Oct 21, 2011 at 9:25 PM, Terry Reedy wrote:
>>>
>>> On 10/21/2011 8:39 PM, Nick Coghlan wrote:
>>>
>>>> str(module) ==> module.__name__
>>>> str(func) ==> func.__name__
>>>> str(cls) ==> "{}.{}".format(cls.__module__, cls.__name__)  # See note below
>>>
>>> If you do this, then also do
>>> str(generator) ==> generator.__name__
>>
>> Why?
>
> For printing messages (which I currently do with failing generators), as you explained in another post:
>
> "But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things.
> When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name."
>
>> Assuming by "generator" you mean the iterator returned by calling a generator function (not the generator function itself, which is covered by str(func) ==> func.__name__), the generator has no "name" which can be used to retrieve it. The three above (module, function, class) all -- typically -- are referenced by variables whose name is forced by the syntax (import foo, def foo, class foo). But that does not apply to a generator.
>
> Not relevant unless you want to do something like globals()[str(obj)] or getattr(namespace, str(obj)).

I am totally not following you, but possibly we just are in violent agreement? I don't want to change the str() of a generator object. My reasoning (and this is just intuition) is that printing just the name of the function for a generator object suppresses *too* much information; it would be like printing just the exception (class) name for exception instances. My proposal is not to make str(x) return x.__name__ for everything that happens to have a __name__ attribute (e.g. bound methods also have one). My proposal is to make str(x) return x.__name__ for exactly these three types of objects: modules, classes, and functions.

> The .__name__ attribute of generators is just as much forced by syntax as for the others, as it is copied from the parent generator function.
>
> I am currently +0, as I do not mind occasionally typing .__name__, although I believe it *is* the only exposed __xxx__ name (outside of class statements) in my current code.

-- 
--Guido van Rossum (python.org/~guido)

From merwok at netwok.org  Sun Oct 23 04:47:18 2011
From: merwok at netwok.org (Éric Araujo)
Date: Sun, 23 Oct 2011 04:47:18 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <1319325071.7356.15.camel@Gutsy>
Message-ID: <4EA38036.9060706@netwok.org>

Le 23/10/2011 04:33, Guido van Rossum a écrit :
> (And I have to agree with Nick that it's best to just return __name__ in all three cases, rather than trying to be clever and use the qualified name.)

My first reaction was that returning 'module.name' was more helpful and saved more typing than just 'name', but I don't have any real arguments. I will update the patch on http://bugs.python.org/issue13224 to change the repr of functions and modules. People concerned about breakage will be able to compile a patched Python to test their code.

Cheers

From anacrolix at gmail.com  Sun Oct 23 04:52:56 2011
From: anacrolix at gmail.com (Matt Joiner)
Date: Sun, 23 Oct 2011 13:52:56 +1100
Subject: [Python-ideas] raise EXC from None (issue #6210)
In-Reply-To:
References: <20111022013202.GA8878@chopin.edu.pl> <4EA266C1.3090201@pearwood.info>
Message-ID:

I quite liked Matthew Barnett's suggestion:

    try:
        x / y
    except ZeroDivisionError as e:
        raise as Exception('Invalid value for y')

The rationale is that it's saying "forget about the original exception (if any), raise _as though_ this is the original exception".

But how common is this desire to suppress exception context? Furthermore, Nick's suggestion about raising a no-context exception via a class method is very unsurprising behaviour.
Encountering:

    raise ValueError.no_context("can't process empty iterable")

is pretty explicit. Also an additional advantage of using the class method is that it *can* be made part of the syntax later if it's clear that it is common enough.

+1 to both these suggestions.

On Sun, Oct 23, 2011 at 2:20 AM, Nick Coghlan wrote:
> The class method approach I describe in the linked issue is more likely to be accepted - this problem shouldn't need new syntax to resolve.
>
> --
> Nick Coghlan (via Gmail on Android, so likely to be more terse than usual)
>
> On Oct 22, 2011 4:47 PM, "Steven D'Aprano" wrote:
>>
>> Jan Kaliszewski wrote:
>>>
>>> Hello.
>>>
>>> Some time ago I encountered the problem described in PEP 3134 as "Open Issue: Suppressing Context" ("this PEP makes it impossible to suppress '__context__', since setting exc.__context__ to None in an 'except' or 'finally' clause will only result in it being set again when exc is raised.").
>>>
>>> An idea that then appeared in my brain was:
>>>
>>>     raise SomeException(some, arguments) from None
>>>
>>> ...and I see the same idea has been proposed by Patrick Westerhoff here:
>>> http://bugs.python.org/issue6210
>>
>> I think that stating the syntax is the easy part; actually implementing it may not be so simple.
>>
>>> I am +10, as I feel that's intuitive, elegant and status-quo-consistent.
>>>
>>> And what do you think?
>>
>> I think that it's well past time to fix this wart with the new nested exceptions functionality.
>>
>> +1
>>
>> (Actually I'm also +10 but I don't want to start vote inflation.)
>>
>> The wart I'm referring to is that the common idiom of catching one exception and raising another is now treated as a bug in the except clause even when it isn't:
>>
>> >>> def mean(data):
>> ...     try:
>> ...         return sum(data)/len(data)
>> ...     except ZeroDivisionError:
>> ...         raise ValueError('data must be non-empty')
>> ...
>> >>> mean([])
>> Traceback (most recent call last):
>>   File "<stdin>", line 3, in mean
>> ZeroDivisionError: int division or modulo by zero
>>
>> During handling of the above exception, another exception occurred:
>>
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "<stdin>", line 5, in mean
>> ValueError: data must be non-empty
>>
>> In this case, there is absolutely no reason to expose the fact that ZeroDivisionError occurred. That's an implementation detail which is irrelevant to the caller. But exposing it complicates the traceback for no benefit (since it isn't a bug that needs fixing) and possibly some harm (by causing some confusion to the reader).
>>
>> It also reflects badly on your code: it gives the impression of an unhandled bug when it is not.
>>
>> In my opinion, although the error context functionality is a big positive when debugging actual bugs in except clauses, the feature should never have been added without a way to suppress the error context.
>>
>> --
>> Steven
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From steve at pearwood.info  Sun Oct 23 06:26:16 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 23 Oct 2011 15:26:16 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <20111022233349.586e9048@pitrou.net>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <20111022233349.586e9048@pitrou.net>
Message-ID: <4EA39768.6080301@pearwood.info>

Antoine Pitrou wrote:
[...]
> It might be the integer 1 or the string "1". You need repr() to tell the difference.

Antoine, that's a cheap shot. The *very next paragraph* of my post which you replied to, and which you snipped out of your response, says:

    I am aware that the string representation (using either __str__ or __repr__) of an object is not the definitive word in what the object really is. Using just print, one can't distinguish between a class and a string "<class 'MyClass'>", or for that matter between the string 2 and the int 2, and that arbitrary objects can return arbitrary strings. Nevertheless, I like the __str__ of classes, modules and functions just the way they are.

Can we please stop throwing up this red herring? Uniqueness of the str() or repr() of an object is not and has never been a requirement, which is fortunate since it is physically impossible.

-- 
Steven

From steve at pearwood.info  Sun Oct 23 07:23:34 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 23 Oct 2011 16:23:34 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au>
Message-ID: <4EA3A4D6.3020204@pearwood.info>

Nick Coghlan wrote:
[...]
> The current convention is that classes, functions and modules don't offer a shorthand "pretty" display format at all. The proposal is to specifically bless "x.__name__" as an official shorthand.

I believe you are saying that backwards.

The way to get the shorthand display format (i.e. the object's name) is already to use x.__name__. There's no need for a proposal to bless x.__name__ as the way to do it since that's already what people do.

This proposal is to bless str(x) (and equivalent forms) as a shorthand for x.__name__, and *unbless* x.__name__ as the official way to do it. I expect that's what you mean.

For the avoidance of doubt, this only applies to x a module, function or class (including built-in types) but not necessarily any other kind of object.

[...]
> I think the most valid objection raised so far is the fact that some metaclasses *already* override __str__ to display something other than the result of type.__repr__(cls). That does have the potential to cause problems for code dealing with utterly unknown types. However, such code should likely be explicitly invoking repr() rather than str() anyway, so this change is unlikely to break anything that isn't already broken.
If this proposal goes ahead, will str(x) => x.__name__ cease to be an implementation detail and become part of the public API that *must* be supported by all Python implementations?

(That is not a rhetorical question.)

If not, then other Pythons can return something else and anyone who cares about writing portable, correct code can't use str(x) as shorthand and will still need to use x.__name__. str(x) will then be an attractive nuisance encouraging people to write non-portable code.

But if so, that seems rather heavy-handed to me. str() of every other kind of object is an implementation detail[1]. Presumably if there were an Italian Python where str(None) returned "Nessuno", or a Persian Python where str(2) returned "۲", that would be allowed. I don't see that classes are special enough to justify making the precise output of str(cls) part of the public API of classes.

[1] Although it has to be acknowledged that many objects have an obvious string output that is very unlikely to change.
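(To make the explicit spelling concrete, a minimal sketch of the portable formulation being advocated here -- the helper name is invented:)

    def full_class_name(obj):
        cls = type(obj)
        # Explicitly ask for the pieces; no reliance on str(cls).
        return "{}.{}".format(cls.__module__, cls.__name__)

    print(full_class_name(42))            # builtins.int
    print(full_class_name(ValueError()))  # builtins.ValueError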
The *very next paragraph* of my post which > you replied to, and which you snipped out of your response, says: > > I am aware that the string representation (using either __str__ > or __repr__) of an object is not the definitive word in what > the object really is. Using just print, one can't distinguish > between a class and a string "", or for > that matter between the string 2 and the int 2, and that > arbitrary objects can return arbitrary strings. Nevertheless, > I like the __str__ of classes, modules and functions just the > way they are. > > > Can we please stop throwing up this red herring? Well, if you stop throwing the red herring of "but you can't decide between strings and classes anymore". By your own quote you're now down to "but I like it the way it is", which is fine as a -1 in the "voting", but please don't make it seem like you have subjective arguments. Georg From masklinn at masklinn.net Sun Oct 23 12:08:56 2011 From: masklinn at masklinn.net (Masklinn) Date: Sun, 23 Oct 2011 12:08:56 +0200 Subject: [Python-ideas] Changing str(someclass) to return only the class name In-Reply-To: References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <20111022233349.586e9048@pitrou.net> <4EA39768.6080301@pearwood.info> Message-ID: <42D11041-E4F9-480A-BFE7-8419C849D0F1@masklinn.net> On 2011-10-23, at 11:27 , Georg Brandl wrote: > Well, if you stop throwing the red herring of "but you can't decide between > strings and classes anymore". > > By your own quote you're now down to "but I like it the way it is", > which is fine as a -1 in the "voting", but please don't make it seem like > you have subjective arguments. Shouldn't the second to last word be "objective"? "I like the current one best" sounds like a subjective argument to me. From g.brandl at gmx.net Sun Oct 23 12:13:32 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 23 Oct 2011 12:13:32 +0200 Subject: [Python-ideas] Changing str(someclass) to return only the class name In-Reply-To: References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <20111022233349.586e9048@pitrou.net> <4EA39768.6080301@pearwood.info> Message-ID: On 10/23/11 11:27, Georg Brandl wrote: > On 10/23/11 06:26, Steven D'Aprano wrote: >> Antoine Pitrou wrote: >> [...] >>> It might be the integer 1 or the string "1". You need repr() to tell >>> the difference. >> >> Antoine, that's a cheap shot. The *very next paragraph* of my post which >> you replied to, and which you snipped out of your response, says: >> >> I am aware that the string representation (using either __str__ >> or __repr__) of an object is not the definitive word in what >> the object really is. Using just print, one can't distinguish >> between a class and a string "", or for >> that matter between the string 2 and the int 2, and that >> arbitrary objects can return arbitrary strings. Nevertheless, >> I like the __str__ of classes, modules and functions just the >> way they are. >> >> >> Can we please stop throwing up this red herring? > > Well, if you stop throwing the red herring of "but you can't decide between > strings and classes anymore". > > By your own quote you're now down to "but I like it the way it is", > which is fine as a -1 in the "voting", but please don't make it seem like > you have subjective arguments. 
s/subjective/objective/, of course :)

Georg

From Nikolaus at rath.org  Sun Oct 23 16:34:49 2011
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Sun, 23 Oct 2011 10:34:49 -0400
Subject: [Python-ideas] Avoiding nested for try..finally: atexit for functions?
In-Reply-To: <20111022002258.GB7138@chopin.edu.pl>
References: <87pqhtafrz.fsf@vostro.rath.org> <20111019040214.GA5524@flay.puzzling.org> <87lishccj3.fsf@inspiron.ap.columbia.edu> <87fwipca9k.fsf@inspiron.ap.columbia.edu> <20111020001922.GA2191@chopin.edu.pl> <87mxcw1k2p.fsf@vostro.rath.org> <20111022002258.GB7138@chopin.edu.pl>
Message-ID: <4EA42609.9040701@rath.org>

On 10/21/2011 08:22 PM, Jan Kaliszewski wrote:
> An improved (and at the same time simplified) implementation (being also a recipe for Python 2.x, though this list is about ideas for Py3.x):
>
>     class CleanupManager(object):
>
>         def __init__(self, initial_callbacks=()):
>             self.cleanup_callbacks = list(initial_callbacks)
>
>         def register(self, callback, *args, **kwargs):
>             self.cleanup_callbacks.append((callback, args, kwargs))
>
>         def __enter__(self):
>             return self
>
>         def __exit__(self, exc_type, exc, tb):
>             self._next_callback()
>
>         def _next_callback(self):
>             if self.cleanup_callbacks:
>                 callback, args, kwargs = self.cleanup_callbacks.pop()
>                 try:
>                     callback(*args, **kwargs)
>                 finally:
>                     # all cleanup callbacks to be used
>                     # Py3.x: all errors to be reported
>                     self._next_callback()
>
> I hope it properly implements what you explained... I'm not sure it is worth adding to the standard library (in the case of your primary example I'd rather prefer that try-finally nested structure) -- though in some cases it may become really useful:

It implements almost exactly what I need. I will use it in a slightly modified form so that exceptions in the cleanup handlers are logged and discarded, so that the original exception is preserved (can't switch to Python 3 before pycryptopp becomes Py3 compatible).

Who decides if it's going into stdlib? I'm of course in favor, but I feel that my opinion may not count that much and, in addition to that, be highly biased :-).

Thanks,
-Nikolaus

-- 
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From solipsis at pitrou.net  Sun Oct 23 17:05:18 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 23 Oct 2011 17:05:18 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <20111022233349.586e9048@pitrou.net> <4EA39768.6080301@pearwood.info>
Message-ID: <20111023170518.72e8d737@pitrou.net>

On Sun, 23 Oct 2011 15:26:16 +1100 Steven D'Aprano wrote:
> Antoine Pitrou wrote:
> [...]
>> It might be the integer 1 or the string "1". You need repr() to tell the difference.
>
> Antoine, that's a cheap shot. The *very next paragraph* of my post which you replied to, and which you snipped out of your response, says:

Woops, sorry, I had missed it :(

Regards

Antoine.
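(A sketch of the kind of modification Nikolaus describes above -- logging and discarding errors raised by cleanup callbacks so that the original exception keeps propagating; the class name and logging setup are assumptions:)

    import logging

    class LoggingCleanupManager(object):

        def __init__(self, initial_callbacks=()):
            self.cleanup_callbacks = list(initial_callbacks)

        def register(self, callback, *args, **kwargs):
            self.cleanup_callbacks.append((callback, args, kwargs))

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc, tb):
            # Run callbacks in reverse registration order; an error in a
            # callback is logged and swallowed, so an exception raised in
            # the 'with' body is the one that propagates.
            while self.cleanup_callbacks:
                callback, args, kwargs = self.cleanup_callbacks.pop()
                try:
                    callback(*args, **kwargs)
                except Exception:
                    logging.exception("error in cleanup callback %r", callback)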
From guido at python.org  Sun Oct 23 18:04:43 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2011 09:04:43 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA3A4D6.3020204@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info>
Message-ID:

On Sat, Oct 22, 2011 at 10:23 PM, Steven D'Aprano wrote:
> Nick Coghlan wrote:
> [...]
>> The current convention is that classes, functions and modules don't offer a shorthand "pretty" display format at all. The proposal is to specifically bless "x.__name__" as an official shorthand.
>
> I believe you are saying that backwards.
>
> The way to get the shorthand display format (i.e. the object's name) is already to use x.__name__. There's no need for a proposal to bless x.__name__ as the way to do it since that's already what people do.
>
> This proposal is to bless str(x) (and equivalent forms) as a shorthand for x.__name__, and *unbless* x.__name__ as the official way to do it. I expect that's what you mean.

That is putting words in my mouth. There's no intent to unbless x.__name__, period. Also, there's no intent to bless str(x) as x.__name__ if you want the name. The only intent is to make what gets printed when you print a function, class or module less verbose.

> For the avoidance of doubt, this only applies to x a module, function or class (including built-in types) but not necessarily any other kind of object.
>
> [...]
>> I think the most valid objection raised so far is the fact that some metaclasses *already* override __str__ to display something other than the result of type.__repr__(cls). That does have the potential to cause problems for code dealing with utterly unknown types. However, such code should likely be explicitly invoking repr() rather than str() anyway, so this change is unlikely to break anything that isn't already broken.
>
> If this proposal goes ahead, will str(x) => x.__name__ cease to be an implementation detail and become part of the public API that *must* be supported by all Python implementations?
>
> (That is not a rhetorical question.)

str() is never an implementation detail. The expectation is that str() is roughly compatible on different Python implementations. If it were an implementation detail, I wouldn't have sustained this discussion for so long; I'd have just committed the change. At the same time I see no reason for other implementations not to follow CPython for this particular proposal, and if they disagree they should argue here, not refuse to implement the change.

> If not, then other Pythons can return something else and anyone who cares about writing portable, correct code can't use str(x) as shorthand and will still need to use x.__name__. str(x) will then be an attractive nuisance encouraging people to write non-portable code.

I assume that people will continue to understand that str()'s contract is weak -- it produces a pretty string. The only place where its contract is stronger is for some specific types, like str itself (str() of a string is that string itself, period) and for integers and floats (a decimal representation). We're not changing the strength of the contract here. We're just making it prettier.

> But if so, that seems rather heavy-handed to me. str() of every other kind of object is an implementation detail[1]. Presumably if there were an Italian Python where str(None) returned "Nessuno", or a Persian Python where str(2) returned "۲", that would be allowed. I don't see that classes are special enough to justify making the precise output of str(cls) part of the public API of classes.

It seems to me you are reading too much into the proposal. Please back down. The sky is not falling.

> [1] Although it has to be acknowledged that many objects have an obvious string output that is very unlikely to change.
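(Concretely, the stronger cases mentioned above, as a quick sketch of observable CPython behaviour:)

    s = "spam"
    assert str(s) is s        # str() of a str returns the object itself
    assert str(42) == "42"    # ints: decimal representation
    assert str(2.5) == "2.5"  # floats likewise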
> str() of every other kind of object is an implementation detail[1]. Presumably if there were an Italian Python where str(None) returned "Nessuno", or a Persian Python where str(2) returned "۲", that would be allowed. I don't see that classes are special enough to justify making the precise output of str(cls) part of the public API of classes.

It seems to me you are reading too much into the proposal. Please back down. The sky is not falling.

> [1] Although it has to be acknowledged that many objects have an obvious string output that is very unlikely to change.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sun Oct 23 18:07:05 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2011 09:07:05 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <87ty70nqif.fsf@benfinney.id.au>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <87ty70nqif.fsf@benfinney.id.au>
Message-ID:

On Sat, Oct 22, 2011 at 11:58 PM, Ben Finney wrote:
> Guido van Rossum writes:
>
>> On Sat, Oct 22, 2011 at 2:44 PM, Ben Finney wrote:
>> > Guido van Rossum writes:
>> >> When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name.
>> >
>> > -1. That makes the string representation of an exception much less useful.
>> >
>> > Exceptions don't have names; each exception *type* has a name, but that doesn't distinguish instances of the type from one another. When there is an 'IOError' it's far more useful that the string representation contains the exception *message*, since the name of the exception type doesn't tell the user much about what went wrong.
>>
>> You misunderstood me. I'm not proposing to change the str() of an exception *instance*.
>
> That's what I take to be the meaning of ‘print an exception’. Like ‘print an int’ or ‘print a list’, it seems to me that refers not to a type, but to an instance of the type.
>
> Thanks for clarifying that you meant ‘print an exception class’.

Sorry, reading back what I wrote with less context it's clear how you could misread it. Since the proposal clearly started out limited to classes, functions and modules, the ambiguity never occurred to me.

--
--Guido van Rossum (python.org/~guido)

From bruce at leapyear.org  Sun Oct 23 20:37:14 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Sun, 23 Oct 2011 11:37:14 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info>
Message-ID:

One advantage of the way it works now is that if you have a class, function or module when you're not expecting it, print tells you what's going on. Compare these:

>>> print('9'.isdigit)
isdigit

vs

>>> print('9'.isdigit)
<built-in method isdigit of str object at 0x...>

--- Bruce

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Sun Oct 23 22:26:23 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 23 Oct 2011 13:26:23 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info>
Message-ID:

On Sun, Oct 23, 2011 at 11:37 AM, Bruce Leban wrote:
> One advantage of the way it works now is that if you have a class, function or module when you're not expecting it, print tells you what's going on. Compare these:
>
>>>> print('9'.isdigit)
> isdigit
>
> vs
>
>>>> print('9'.isdigit)

Fortunately, that particular example won't change, because '9'.isdigit is not a function -- it is a bound method. They're different object types.

--
--Guido van Rossum (python.org/~guido)

From fuzzyman at gmail.com  Sun Oct 23 22:44:13 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Sun, 23 Oct 2011 21:44:13 +0100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID:

On 22 October 2011 21:32, Guido van Rossum wrote:

> On Sat, Oct 22, 2011 at 1:18 PM, Steven D'Aprano wrote:
> > Nick Coghlan wrote:
> >
> >> If someone wants debugging level detail, use repr(), just like the interactive interpreter does.
> >
> > I'm just going to repeat what I've said before: explicit is better than implicit. If you want the name of an object (be it a class, a module, a function, or something else), you should explicitly ask for the name, and not rely on its str().
> >
> > The details returned by str() are, in some sense, arbitrary. The docs describe it as [quote] the ‘informal’ string representation of an object [end quote].
> >
> > http://docs.python.org/reference/datamodel.html#object.__str__
> >
> > On that basis, objects are free to return as much, or as little, information as makes sense in their str(). (As you pointed out earlier.)
> >
> > However, the docs also say that str() should return [quote] a string containing a nicely printable representation of an object [end quote].
> >
> > http://docs.python.org/library/functions.html#str
> >
> > To my mind, the name alone of a class (or function or module) is in no sense a nicely printable representation of the object. I would argue strongly that the property of being "nicely representable" outweighs by far the convenience of avoiding 9 extra characters in one specific use-case:
> >
> > "blah blah blah class '%s'" % cls  # instead of cls.__name__
> >
> >
> > But for the sake of the argument, I'll grant you that we're free to change str(cls) to return the class name, as requested by the OP, or the fully qualified module.class dotted name as suggested by you. So let's suppose that, after a long and bitter debate over which colour to paint this bikeshed, you win the debate.
> >
> > But this doesn't help you at all, because you can't rely on it. It seems to me that the exact format of str(cls) is an implementation detail. You can't rely on other Pythons to do the same thing, nor can you expect a guarantee that str(cls) won't change again in the future.
> > So if you care about the exact string that gets generated, you still have to explicitly use cls.__name__ just as you do now.
> >
> > The __name__ attribute is part of the guaranteed API of class objects (and also functions and modules), the output of str(cls) is not. In my opinion relying on it to return a particular output is dangerous, regardless of whether the output is "<class 'module.MyClass'>", "module.MyClass", "MyClass" or something else.
> >
> > Having str(cls) return just the class name (or the module.class dotted name) is an attractive nuisance that should be resisted.
>
> Thinking of str(x) as an API to get a certain value would lead there, yes. But thinking of str(x) as what gets printed by print(x), formatted by "{}".format(x), and "%s" % x, changes things. When I am printing an object and I have no idea what type it is, I'll use repr() or "%r"; but when I know I am printing, say, an exception, I think it would be very nice if print(x) would just print its name. Just like print(None) prints 'None', it would make all the sense in the world if print(ZeroDivisionError) printed 'ZeroDivisionError', and print(type(42)) printed 'int'.
>

+1

Michael

> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bborcic at gmail.com  Mon Oct 24 16:07:39 2011
From: bborcic at gmail.com (Boris Borcic)
Date: Mon, 24 Oct 2011 16:07:39 +0200
Subject: [Python-ideas] PEP 335 (overloading boolean operations) and chained comparisons
In-Reply-To:
References:
Message-ID:

Raymond Hettinger wrote:
>
> Extended slicing and ellipsis tricks weren't so bad because they were easily ignored by general users.

Don't forget complex numbers, added simultaneously, meshing very well, and not deserving the name of "trick" imo.

--
BB

From carl at oddbird.net  Mon Oct 24 20:21:07 2011
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 24 Oct 2011 12:21:07 -0600
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
Message-ID: <4EA5AC93.2020305@oddbird.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

Vinay Sajip and I are working on a PEP for making "virtual Python environments" a la virtualenv [1] a built-in feature of Python 3.3.

This idea was first proposed on python-dev by Ian Bicking in February 2010 [2]. It was revived at PyCon 2011 and has seen discussion on distutils-sig [3] and more recently again on python-dev [4] [5].

Given all this (mostly positive) prior discussion, we may be at a point where further discussion should happen on python-dev rather than python-ideas. But in order to observe the proper PEP 1 process, I'm posting the draft PEP here first for pre-review and comment before I send it to the PEP editors and post it on python-dev.

Full text of the draft PEP is pasted below, and also available on Bitbucket [6].
[1] http://virtualenv.org
[2] http://mail.python.org/pipermail/python-dev/2010-February/097787.html
[3] http://mail.python.org/pipermail/distutils-sig/2011-March/017498.html
[4] http://mail.python.org/pipermail/python-dev/2011-June/111903.html
[5] http://mail.python.org/pipermail/python-dev/2011-October/113883.html
[6] https://bitbucket.org/carljm/pythonv-pep/src/

PEP: XXX
Title: Python Virtual Environments
Version: $Revision$
Last-Modified: $Date$
Author: Carl Meyer
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 13-Jun-2011
Python-Version: 3.3
Post-History: 14-Jun-2011

Abstract
========

This PEP proposes to add to Python a mechanism for lightweight "virtual environments" with their own site directories, optionally isolated from system site directories.  Each virtual environment has its own Python binary (allowing creation of environments with various Python versions) and can have its own independent set of installed Python packages in its site directories.

Motivation
==========

The utility of Python virtual environments has already been well established by the popularity of existing third-party virtual-environment tools, primarily Ian Bicking's `virtualenv`_.  Virtual environments are already widely used for dependency management and isolation, ease of installing and using Python packages without system-administrator access, and automated testing of Python software across multiple Python versions, among other uses.

Existing virtual environment tools suffer from a lack of support in the behavior of Python itself.  Tools such as `rvirtualenv`_, which do not copy the Python binary into the virtual environment, cannot provide reliable isolation from system site directories.  Virtualenv, which does copy the Python binary, is forced to duplicate much of Python's ``site`` module and manually copy an ever-changing set of standard-library modules into the virtual environment in order to perform a delicate boot-strapping dance at every startup.  The ``PYTHONHOME`` environment variable, Python's only existing built-in solution for virtual environments, requires copying the entire standard library into every environment; not a lightweight solution.

A virtual environment mechanism integrated with Python and drawing on years of experience with existing third-party tools can be lower maintenance, more reliable, and more easily available to all Python users.

.. _virtualenv: http://www.virtualenv.org
.. _rvirtualenv: https://github.com/kvbik/rvirtualenv

Specification
=============

When the Python binary is executed, it attempts to determine its prefix (which it stores in ``sys.prefix``), which is then used to find the standard library and other key files, and by the ``site`` module to determine the location of the site-package directories.  Currently the prefix is found (assuming ``PYTHONHOME`` is not set) by first walking up the filesystem tree looking for a marker file (``os.py``) that signifies the presence of the standard library, and if none is found, falling back to the build-time prefix hardcoded in the binary.

This PEP proposes to add a new first step to this search.  If an ``env.cfg`` file is found either adjacent to the Python executable, or one directory above it, this file is scanned for lines of the form ``key = value``.  If a ``home`` key is found, this signifies that the Python binary belongs to a virtual environment, and the value of the ``home`` key is the directory containing the Python executable used to create this virtual environment.
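A rough Python transcription of the scanning step just described (illustrative only -- in CPython the equivalent logic would live in the C-level startup code, and these helper names are not part of the proposal)::

    import os

    def find_env_cfg(executable):
        # Look for ``env.cfg`` next to the binary, then one level up.
        exe_dir = os.path.dirname(os.path.abspath(executable))
        for cfg_dir in (exe_dir, os.path.dirname(exe_dir)):
            cfg_path = os.path.join(cfg_dir, 'env.cfg')
            if os.path.isfile(cfg_path):
                return cfg_path
        return None

    def read_home_key(cfg_path):
        # Scan ``key = value`` lines; return the ``home`` value, if any.
        with open(cfg_path) as f:
            for line in f:
                key, sep, value = line.partition('=')
                if sep and key.strip() == 'home':
                    return value.strip()
        return None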
In this case, prefix-finding continues as normal using the value of the ``home`` key as the effective Python binary location, which results in ``sys.prefix`` being set to the system installation prefix, while ``sys.site_prefix`` is set to the directory containing ``env.cfg``.

(If ``env.cfg`` is not found or does not contain the ``home`` key, prefix-finding continues normally, and ``sys.site_prefix`` will be equal to ``sys.prefix``.)

The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``.

Thus, a Python virtual environment in its simplest form would consist of nothing more than a copy of the Python binary accompanied by an ``env.cfg`` file and a site-packages directory.  Since the ``env.cfg`` file can be located one directory above the executable, a typical virtual environment layout, mimicking a system install layout, might be::

   env.cfg
   bin/python3
   lib/python3.3/site-packages/

Isolation from system site-packages
- -----------------------------------

In a virtual environment, the ``site`` module will normally still add the system site directories to ``sys.path`` after the virtual environment site directories.  Thus system-installed packages will still be importable, but a package of the same name installed in the virtual environment will take precedence.

If the ``env.cfg`` file also contains a key ``include-system-site`` with a value of ``false`` (not case sensitive), the ``site`` module will omit the system site directories entirely.  This allows the virtual environment to be entirely isolated from system site-packages.

Creating virtual environments
- -----------------------------

This PEP also proposes adding a new ``venv`` module to the standard library which implements the creation of virtual environments.  This module would typically be executed using the ``-m`` flag::

    python3 -m venv /path/to/new/virtual/environment

Running this command creates the target directory (creating any parent directories that don't exist already) and places an ``env.cfg`` file in it with a ``home`` key pointing to the Python installation the command was run from.  It also creates a ``bin/`` (or ``Scripts`` on Windows) subdirectory containing a copy of the ``python3`` executable, and the ``pysetup3`` script from the ``packaging`` standard library module (to facilitate easy installation of packages from PyPI into the new virtualenv).  And it creates an (initially empty) ``lib/pythonX.Y/site-packages`` subdirectory.

If the target directory already exists an error will be raised, unless the ``--clear`` option was provided, in which case the target directory will be deleted and virtual environment creation will proceed as usual.

If ``venv`` is run with the ``--no-site-packages`` option, the key ``include-system-site = false`` is also included in the created ``env.cfg`` file.

Multiple paths can be given to ``venv``, in which case an identical virtualenv will be created, according to the given options, at each provided path.

API
- ---

The high-level method described above will make use of a simple API which provides mechanisms for third-party virtual environment creators to customize environment creation according to their needs.
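To make the path ordering described under "Isolation from system site-packages" concrete before turning to the API details, a small illustrative sketch (the function and argument names here are assumptions, not part of the proposal)::

    import site

    def add_site_dirs(venv_site_dir, system_site_dirs, include_system_site):
        # The environment's own site-packages is added first, so a
        # package installed there shadows a system package of the
        # same name.
        site.addsitedir(venv_site_dir)
        # Unless ``include-system-site = false`` was found in
        # ``env.cfg``, the system site directories follow.
        if include_system_site:
            for site_dir in system_site_dirs:
                site.addsitedir(site_dir)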
The ``venv`` module will contain an ``EnvBuilder`` class which accepts the following keyword arguments on instantiation::

  * ``nosite`` - A Boolean value indicating that isolation of the environment from the system Python is required (defaults to ``False``).

  * ``clear`` - A Boolean value which, if True, will delete any existing target directory instead of raising an exception (defaults to ``False``).

The returned env-builder is an object which is expected to have a single method, ``create``, which takes as required argument the path (absolute or relative to the current directory) of the target directory which is to contain the virtual environment.  The ``create`` method will either create the environment in the specified directory, or raise an appropriate exception.

Creators of third-party virtual environment tools will be free to use the provided ``EnvBuilder`` class as a base class.

The ``venv`` module will also provide a module-level function as a convenience::

    def create(env_dir, nosite=False, clear=False):
        builder = EnvBuilder(nosite=nosite, clear=clear)
        builder.create(env_dir)

The ``create`` method of the ``EnvBuilder`` class illustrates the hooks available for customization::

    def create(self, env_dir):
        """
        Create a virtualized Python environment in a directory.

        :param env_dir: The target directory to create an environment in.
        """
        env_dir = os.path.abspath(env_dir)
        context = self.create_directories(env_dir)
        self.create_configuration(context)
        self.setup_python(context)
        self.setup_packages(context)
        self.setup_scripts(context)

Each of the methods ``create_directories``, ``create_configuration``, ``setup_python``, ``setup_packages`` and ``setup_scripts`` can be overridden.  The functions of these methods are::

  * ``create_directories`` - creates the environment directory and all necessary directories, and returns a context object.  This is just a holder for attributes (such as paths), for use by the other methods.

  * ``create_configuration`` - creates the ``env.cfg`` configuration file in the environment.

  * ``setup_python`` - creates a copy of the Python executable (and, under Windows, DLLs) in the environment.

  * ``setup_packages`` - A placeholder method which can be overridden in third party implementations to pre-install packages in the virtual environment.

  * ``setup_scripts`` - A placeholder method which can be overridden in third party implementations to pre-install scripts (such as activation and deactivation scripts) in the virtual environment.

The ``DistributeEnvBuilder`` subclass in the reference implementation illustrates how these last two methods can be used in practice.  It's not envisaged that ``DistributeEnvBuilder`` will be actually added to Python core, but it makes the reference implementation more immediately useful for testing and exploratory purposes.

  * The ``setup_packages`` method installs Distribute in the target environment.  This is needed at the moment in order to actually install most packages in an environment, since most packages are not yet packaging / setup.cfg based.

  * The ``setup_scripts`` method installs activation and pysetup3 scripts in the environment.  This is also done in a configurable way: A ``scripts`` property on the builder is expected to provide a buffer which is a base64-encoded zip file.  The zip file contains directories "common", "linux2", "darwin", "win32", each containing scripts destined for the bin directory in the environment.
The contents of "common" and the directory corresponding to ``sys.platform`` are copied after doing some text replacement of placeholders:

  * ``__VIRTUAL_ENV__`` is replaced with absolute path of the environment directory.

  * ``__VIRTUAL_PROMPT__`` is replaced with the environment prompt prefix.

  * ``__BIN_NAME__`` is replaced with the name of the bin directory.

  * ``__ENV_PYTHON__`` is replaced with the absolute path of the environment's executable.

No doubt the process of PEP review will show up any customization requirements which have not yet been considered.

Open Questions
==============

Why not modify sys.prefix?
- --------------------------

Any virtual environment tool along these lines is proposing a split between two different meanings (among others) that are currently both wrapped up in ``sys.prefix``: the answers to the questions "Where is the standard library?" and "Where is the site-packages location where third-party modules should be installed?"

This split could be handled by introducing a new value for either the former question or the latter question.  Either option potentially introduces some backwards-incompatibility with software written to assume the other meaning for ``sys.prefix``.

Since it was unable to modify `distutils`, `virtualenv`_ has to re-point ``sys.prefix`` at the virtual environment, which requires that it also provide a symlink from inside the virtual environment to the Python header files, and that it copy some portions of the standard library into the virtual environment.

The `documentation`__ for ``sys.prefix`` describes it as "A string giving the site-specific directory prefix where the platform independent Python files are installed," and specifically mentions the standard library and header files as found under ``sys.prefix``.  It does not mention ``site-packages``.

__ http://docs.python.org/dev/library/sys.html#sys.prefix

It is more true to this documented definition of ``sys.prefix`` to leave it pointing to the system installation (which is where the standard library and header files are found), and introduce a new value in ``sys`` (``sys.site_prefix``) to point to the prefix for ``site-packages``.

The justification for reversing this choice would be if it can be demonstrated that the bulk of third-party code referencing ``sys.prefix`` is, in fact, using it to find ``site-packages``, and not the standard library or header files or anything else.

The most notable case is probably `setuptools`_ and its fork `distribute`_, which do use ``sys.prefix`` to build up a list of site directories for pre-flight checking where ``pth`` files can usefully be placed.  It would be trivial to modify these tools (currently only `distribute`_ is Python 3 compatible) to check ``sys.site_prefix`` and fall back to ``sys.prefix`` if it doesn't exist.  If Distribute is modified in this way and released before Python 3.3 is released with the ``venv`` module, there would be no likely reason for an older version of Distribute to ever be installed in a virtual environment.

In terms of other third-party usage, a `Google Code Search`_ turns up what appears to be a roughly even mix of usage between packages using ``sys.prefix`` to build up a site-packages path and packages using it to e.g. eliminate the standard-library from code-execution tracing.  Either choice that's made here will require one or the other of these uses to be updated.
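The fallback just described could be as simple as the following sketch (a POSIX-style directory layout is assumed purely for illustration)::

    import os
    import sys

    # Works unchanged on current Pythons (no ``sys.site_prefix``) and
    # inside a PEP-style virtual environment (where it would be set).
    site_prefix = getattr(sys, 'site_prefix', sys.prefix)

    site_packages = os.path.join(
        site_prefix, 'lib',
        'python%d.%d' % sys.version_info[:2],
        'site-packages')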
Another argument for reversing this choice and modifying ``sys.prefix`` to point at the virtual environment is that virtualenv currently does this, and it doesn't appear to have caused major problems.

.. _setuptools: http://peak.telecommunity.com/DevCenter/setuptools
.. _distribute: http://packages.python.org/distribute/
.. _Google Code Search: http://www.google.com/codesearch#search/&q=sys\.prefix&p=1&type=cs

What about include files?
- -------------------------

For example, ZeroMQ installs zmq.h and zmq_utils.h in $VE/include, whereas SIP (part of PyQt4) installs sip.h by default in $VE/include/pythonX.Y.  With virtualenv, everything works because the PythonX.Y include is symlinked, so everything that's needed is in $VE/include.  At the moment pythonv doesn't do anything with include files, besides creating the include directory; this might need to change, to copy/symlink $VE/include/pythonX.Y.  I guess this would go into ``venv.py``.

Since Python has no abstraction for a site-specific include directory, other than for platform-specific stuff, the user expectation would seem to be that all include files anyone could ever want should be found in one of just two locations, with sysconfig labels "include" & "platinclude".

There's another issue: what if includes are Python-version-specific?  For example, SIP installs by default into $VE/include/pythonX.Y rather than $VE/include, presumably because there's version-specific stuff in there - but even if that's not the case with SIP, it could be the case with some other package.  And the problem that gives is that you can't just symlink the include/pythonX.Y directory, but actually have to provide a writable directory and symlink/copy the contents from the system include/pythonX.Y.  Of course this is not hard to do, but it does seem inelegant.  OTOH it's really because there's no supporting concept in Python/sysconfig.

Interface with packaging tools
- ------------------------------

Some work will be needed in packaging tools (Python 3.3 packaging, Distribute) to support implementation of this PEP.  For example:

* How Distribute and packaging use sys.prefix and/or sys.site_prefix. Clearly, in practice we'll need to use Distribute for a while, until packages have migrated over to usage of setup.cfg.

* How packaging and Distribute set up shebang lines in scripts which they install in virtual environments.

Add a script?
- -------------

Perhaps a ``pyvenv`` script should be added as a more convenient and discoverable alternative to ``python -m venv``.

Testability and Source Build Issues
- -----------------------------------

In order to be able to test the ``venv`` module in the Python regression test suite, some anomalies in how sysconfig data is configured in source builds will need to be removed.  For example, sysconfig.get_paths() in a source build gives (partial output)::

    {
     'include': '/home/vinay/tools/pythonv/Include',
     'libdir': '/usr/lib ; or /usr/lib64 on a multilib system',
     'platinclude': '/home/vinay/tools/pythonv',
     'platlib': '/usr/local/lib/python3.3/site-packages',
     'platstdlib': '/usr/local/lib/python3.3',
     'purelib': '/usr/local/lib/python3.3/site-packages',
     'stdlib': '/usr/local/lib/python3.3'
    }

Activation and Utility Scripts
- ------------------------------

Virtualenv currently provides shell "activation" scripts as a user convenience, to put the virtual environment's Python binary first on the shell PATH.  This is a maintenance burden, as separate activation scripts need to be provided and maintained for every supported shell.
For this reason, this PEP proposes to leave such scripts to be provided by third-party extensions; virtual environments created by the core functionality would be used by directly invoking the environment's Python binary.

If we are going to rely on external code to provide these conveniences, we need to check with existing third-party projects in this space (virtualenv, zc.buildout) and ensure that the proposed API meets their needs.  (Virtualenv would be fine with the proposed API; it would become a relatively thin wrapper with a subclass of the env builder that adds shell activation and automatic installation of ``pip`` inside the virtual environment).

Ensuring that sys.site_prefix and sys.site_exec_prefix are always set?
- ----------------------------------------------------------------------

Currently the reference implementation's modifications to standard library code use the idiom ``getattr(sys, "site_prefix", sys.prefix)``.  Do we want this to be the long-term pattern, or should the sys module ensure that the ``site_*`` attributes are always set to something (by default the same as the regular prefix attributes), even if ``site.py`` does not run?

Reference Implementation
========================

The in-progress reference implementation is found in `a clone of the CPython Mercurial repository`_.  To test it, build and install it (the virtual environment tool currently does not run from a source tree).
- From the installed Python, run ``bin/python3 -m venv /path/to/new/virtualenv`` to create a virtual environment.

The reference implementation (like this PEP!) is a work in progress.

.. _a clone of the CPython Mercurial repository: https://bitbucket.org/vinay.sajip/pythonv

References
==========

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk6lrJMACgkQ8W4rlRKtE2dz4wCgqxtiHQr3ZEH/s1h069e15bu7
c70AoOSTd7drIp1g6z2QiuDKoTok6TRw
=9XEL
-----END PGP SIGNATURE-----

From ncoghlan at gmail.com  Tue Oct 25 00:11:30 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Oct 2011 08:11:30 +1000
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA5AC93.2020305@oddbird.net>
References: <4EA5AC93.2020305@oddbird.net>
Message-ID:

On Tue, Oct 25, 2011 at 4:21 AM, Carl Meyer wrote:
> Existing virtual environment tools suffer from a lack of support in the behavior of Python itself.  Tools such as `rvirtualenv`_, which do not copy the Python binary into the virtual environment, cannot provide reliable isolation from system site directories.  Virtualenv, which does copy the Python binary, is forced to duplicate much of Python's ``site`` module and manually copy an ever-changing set of standard-library modules into the virtual environment in order to perform a delicate boot-strapping dance at every startup. The ``PYTHONHOME`` environment variable, Python's only existing built-in solution for virtual environments, requires copying the entire standard library into every environment; not a lightweight solution.

The repeated references to copying binaries and Python files throughout the PEP are annoying, and need to be justified.
Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me).

(Obviously, 3rd party virtual environment solutions generally *do* have to copy things around, since Python 2.x doesn't expose symlink support on Windows at all)

> A virtual environment mechanism integrated with Python and drawing on years of experience with existing third-party tools can be lower maintenance, more reliable, and more easily available to all Python users.

It can also take advantage of the native symlink support to minimise copying.

> Specification
> =============
>
> When the Python binary is executed, it attempts to determine its prefix (which it stores in ``sys.prefix``), which is then used to find the standard library and other key files, and by the ``site`` module to determine the location of the site-package directories.  Currently the prefix is found (assuming ``PYTHONHOME`` is not set) by first walking up the filesystem tree looking for a marker file (``os.py``) that signifies the presence of the standard library, and if none is found, falling back to the build-time prefix hardcoded in the binary.
>
> This PEP proposes to add a new first step to this search.  If an ``env.cfg`` file is found either adjacent to the Python executable, or one directory above it, this file is scanned for lines of the form ``key = value``. If a ``home`` key is found, this signifies that the Python binary belongs to a virtual environment, and the value of the ``home`` key is the directory containing the Python executable used to create this virtual environment.

Currently, the PEP uses a mish-mash of 'env', 'venv', 'pyvenv', 'pythonv' and 'site' to refer to different aspects of the proposed feature. I suggest standardising on 'venv' wherever the Python relationship is implied, and 'pyvenv' wherever the Python relationship needs to be made explicit.

So I think the name of the configuration file should be "pyvenv.cfg" (tangent: 'setup' lost its Python connection when it went from 'setup.py' to 'setup.cfg'. I wish the latter had been called 'pysetup.cfg' instead)

> In this case, prefix-finding continues as normal using the value of the ``home`` key as the effective Python binary location, which results in ``sys.prefix`` being set to the system installation prefix, while ``sys.site_prefix`` is set to the directory containing ``env.cfg``.
>
> (If ``env.cfg`` is not found or does not contain the ``home`` key, prefix-finding continues normally, and ``sys.site_prefix`` will be equal to ``sys.prefix``.)

'site' is *way* too overloaded already, let's not make it worse. I suggest "sys.venv_prefix".

> The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``.
>
> Thus, a Python virtual environment in its simplest form would consist of nothing more than a copy of the Python binary accompanied by an ``env.cfg`` file and a site-packages directory.
> Since the ``env.cfg`` file can be located one directory above the executable, a typical virtual environment layout, mimicking a system install layout, might be::
>
>    env.cfg
>    bin/python3
>    lib/python3.3/site-packages/

The builtin virtual environment mechanism should be specified to symlink things by default, and only copy things if the user specifically requests it. System administrators rightly fear the proliferation of multiple copies of binaries, since it can cause major hassles when it comes time to install security updates.

> Isolation from system site-packages
> - -----------------------------------
>
> In a virtual environment, the ``site`` module will normally still add the system site directories to ``sys.path`` after the virtual environment site directories.  Thus system-installed packages will still be importable, but a package of the same name installed in the virtual environment will take precedence.
>
> If the ``env.cfg`` file also contains a key ``include-system-site`` with a value of ``false`` (not case sensitive), the ``site`` module will omit the system site directories entirely. This allows the virtual environment to be entirely isolated from system site-packages.

"site" is ambiguous here - rather than abbreviating, I suggest making the option "include-system-site-packages".

> Creating virtual environments
> - -----------------------------
>
> This PEP also proposes adding a new ``venv`` module to the standard library which implements the creation of virtual environments.  This module would typically be executed using the ``-m`` flag::
>
>    python3 -m venv /path/to/new/virtual/environment
>
> Running this command creates the target directory (creating any parent directories that don't exist already) and places an ``env.cfg`` file in it with a ``home`` key pointing to the Python installation the command was run from.  It also creates a ``bin/`` (or ``Scripts`` on Windows) subdirectory containing a copy of the ``python3`` executable, and the ``pysetup3`` script from the ``packaging`` standard library module (to facilitate easy installation of packages from PyPI into the new virtualenv).  And it creates an (initially empty) ``lib/pythonX.Y/site-packages`` subdirectory.

As noted above, those should be symlinks rather than copies, with copying behaviour explicitly requested via a command line option. Also, why "Scripts" rather than "bin" on Windows? The Python binary isn't a script.

I'm actually not seeing the rationale for the obfuscated FHS inspired layout in the first place - why not dump the binaries adjacent to the config file, with a simple "site-packages" directory immediately below that? If there are reasons for a more complex default layout, they need to be articulated in the PEP.

If the problem is wanting to allow cross platform computation of things like the site-packages directory location and other paths, then the answer to that seems to lie in better helper methods (whether in sysconfig, site, venv or elsewhere) rather than Linux specific layouts inside language level virtual environments.

> The ``venv`` module will contain an ``EnvBuilder`` class which accepts the following keyword arguments on instantiation::
>
>   * ``nosite`` - A Boolean value indicating that isolation of the environment from the system Python is required (defaults to ``False``).
Yikes, double negatives in APIs are bad news (especially when the corresponding config file option is expressed positively). I suggest this parameter should be declared as "system_site_packages=True".

>   * ``clear`` - A Boolean value which, if True, will delete any existing target directory instead of raising an exception (defaults to ``False``).

In line with my above comments, I think there should be a third parameter here declared as "use_symlinks=True".

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ethan at stoneleaf.us  Tue Oct 25 00:31:39 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 24 Oct 2011 15:31:39 -0700
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To:
References: <4EA5AC93.2020305@oddbird.net>
Message-ID: <4EA5E74B.2060906@stoneleaf.us>

Nick Coghlan wrote:
> The repeated references to copying binaries and Python files throughout the PEP are annoying, and need to be justified. Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me).

Windows Vista?!?  Death first!  ;)

My machines are still using XP -- is that sufficient justification for copying?

~Ethan~

From tjreedy at udel.edu  Tue Oct 25 00:41:33 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 24 Oct 2011 18:41:33 -0400
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA5AC93.2020305@oddbird.net>
References: <4EA5AC93.2020305@oddbird.net>
Message-ID:

On 10/24/2011 2:21 PM, Carl Meyer wrote:

> The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``.

FYI To substantially shorten startup time, sysconfig has been recently modified to import as much precomputed info as possible instead of recomputing things at each run.

--
Terry Jan Reedy

From solipsis at pitrou.net  Tue Oct 25 00:41:44 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Oct 2011 00:41:44 +0200
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
References: <4EA5AC93.2020305@oddbird.net>
Message-ID: <20111025004144.3bee7579@pitrou.net>

On Mon, 24 Oct 2011 18:41:33 -0400
Terry Reedy wrote:
> On 10/24/2011 2:21 PM, Carl Meyer wrote:
>
> > The ``site`` and ``sysconfig`` standard-library modules are modified such that site-package directories ("purelib" and "platlib", in ``sysconfig`` terms) are found relative to ``sys.site_prefix``, while other directories (the standard library, include files) are still found relative to ``sys.prefix``.
>
> FYI To substantially shorten startup time, sysconfig has been recently modified to import as much precomputed info as possible instead of recomputing things at each run.

The config file is still read every time, though.

Regards

Antoine.

From steve at pearwood.info  Tue Oct 25 01:16:27 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 25 Oct 2011 10:16:27 +1100
Subject: [Python-ideas] Statement local functions and classes (aka PEP 3150 is dead, say 'Hi!'
to PEP 403)
In-Reply-To: <518aa565-21ba-4469-9f6d-8862812c638c@email.android.com>
References: <4E9751AF.4070705@canterbury.ac.nz> <1318554600.460.83.camel@Gutsy> <4E9AB9A4.9090905@pearwood.info> <518aa565-21ba-4469-9f6d-8862812c638c@email.android.com>
Message-ID: <4EA5F1CB.2000100@pearwood.info>

Mike Meyer wrote:

>> - And testing. If code isn't tested, you should assume it is buggy. In an ideal world, there should never be any such thing as code that's used once: it should always be used at least twice, once in the application and once in the test suite. I realise that in practice we often fall short of that ideal, but we don't need more syntax that *encourages* developers to fail to test non-trivial code blocks.
>
> Statement-local namespaces don't do that any more than any other statement that includes a suite does. Or do you avoid if statements because they encourage you not to test the code in the else clause?

if...else blocks aren't being proposed as a way to avoid writing functions.

It's not that I think the proposal is bad in and of itself, but I do think it is unnecessary and I fear it will encourage poor practices.

--
Steven

From steve at pearwood.info  Tue Oct 25 01:36:32 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 25 Oct 2011 10:36:32 +1100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info>
Message-ID: <4EA5F680.4030003@pearwood.info>

Guido van Rossum wrote:
> On Sat, Oct 22, 2011 at 10:23 PM, Steven D'Aprano wrote:
>> Nick Coghlan wrote:
>> [...]
>>> The current convention is that classes, functions and modules, don't offer a shorthand "pretty" display format at all. The proposal is to specifically bless "x.__name__" as an official shorthand.
>>
>> I believe you are saying that backwards.
>>
>> The way to get the shorthand display format (i.e. the object's name) is already to use x.__name__. There's no need for a proposal to bless x.__name__ as the way to do it since that's already what people do.
>>
>> This proposal is to bless str(x) (and equivalent forms) as a shorthand for x.__name__, and *unbless* x.__name__ as the official way to do it. I expect that's what you mean.
>
> That is putting words in my mouth. There's no intent to unbless x.__name__, period. Also, there's no intent to bless str(x) as x.__name__ if you want the name. The only intent is to make what gets printed if you print a function, class or module to be less verbose.

I'm sorry about that, I was describing the proposal as best I understood it.

Once this change goes ahead, under what circumstances would you expect people to continue using cls.__name__ (other than for backwards compatibility)? When beginners ask me "how do I get the name of a class?", what answer should I give?

For example, I have code that does things like this:

    raise TypeError('expected a string but got %s' % type(arg).__name__)

In the future, I expect that should be written like this:

    raise TypeError('expected a string but got %s' % type(arg))

That's all I meant by "unbless". I didn't mean to imply that __name__ would go away, only that it would cease to be the One Obvious Way to get the name. If I'm wrong about this, then I'm genuinely confused and don't understand the motivation for this change.
--
Steven

From ncoghlan at gmail.com  Tue Oct 25 01:46:00 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Oct 2011 09:46:00 +1000
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA5E74B.2060906@stoneleaf.us>
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
Message-ID:

On Tue, Oct 25, 2011 at 8:31 AM, Ethan Furman wrote:
> Nick Coghlan wrote:
>>
>> The repeated references to copying binaries and Python files throughout the PEP are annoying, and need to be justified. Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me).
>
> Windows Vista?!?  Death first!  ;)
>
> My machines are still using XP -- is that sufficient justification for copying?

It's justification for supporting it, not necessarily for doing it implicitly (although Windows in general is far more tolerant of bundling than the various *nix platforms).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ron3200 at gmail.com  Tue Oct 25 02:17:28 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 24 Oct 2011 19:17:28 -0500
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA5F680.4030003@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info> <4EA5F680.4030003@pearwood.info>
Message-ID: <1319501848.26876.23.camel@Gutsy>

On Tue, 2011-10-25 at 10:36 +1100, Steven D'Aprano wrote:
> Guido van Rossum wrote:
> > On Sat, Oct 22, 2011 at 10:23 PM, Steven D'Aprano wrote:
> >> Nick Coghlan wrote:
> >> [...]
> >>> The current convention is that classes, functions and modules, don't offer a shorthand "pretty" display format at all. The proposal is to specifically bless "x.__name__" as an official shorthand.
> >> I believe you are saying that backwards.
> >>
> >> The way to get the shorthand display format (i.e. the object's name) is already to use x.__name__. There's no need for a proposal to bless x.__name__ as the way to do it since that's already what people do.
> >>
> >> This proposal is to bless str(x) (and equivalent forms) as a shorthand for x.__name__, and *unbless* x.__name__ as the official way to do it. I expect that's what you mean.
> >
> > That is putting words in my mouth. There's no intent to unbless x.__name__, period. Also, there's no intent to bless str(x) as x.__name__ if you want the name. The only intent is to make what gets printed if you print a function, class or module to be less verbose.
>
> I'm sorry about that, I was describing the proposal as best I understood it.
>
> Once this change goes ahead, under what circumstances would you expect people to continue using cls.__name__ (other than for backwards compatibility)? When beginners ask me "how do I get the name of a class?", what answer should I give?

This isn't as big a change as you may be thinking of.

    name = cls.__name__

Won't change.
You will still need to use the __name__ attribute to get a name in almost all situations; but when you want to place the name in a string or print it, you can use just the cls, as the __str__ method would do the rest for you.

> For example, I have code that does things like this:
>
>     raise TypeError('expected a string but got %s' % type(arg).__name__)
>
> In the future, I expect that should be written like this:
>
>     raise TypeError('expected a string but got %s' % type(arg))

Yes, it could be written that way, but it's not required.

> That's all I meant by "unbless". I didn't mean to imply that __name__ would go away, only that it would cease to be the One Obvious Way to get the name. If I'm wrong about this, then I'm genuinely confused and don't understand the motivation for this change.

Printing the name of an exception, class, or function is very common for logging and error messages, and this makes that easier and cleaner. And if a repr is wanted, we still have that also.

Cheers,
Ron

From vinay_sajip at yahoo.co.uk  Tue Oct 25 03:16:32 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Tue, 25 Oct 2011 02:16:32 +0100 (BST)
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To:
References: <4EA5AC93.2020305@oddbird.net>
Message-ID: <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>

----- Original Message -----

> From: Nick Coghlan

> The repeated references to copying binaries and Python files throughout the PEP are annoying, and need to be justified. Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me).

I agree that symlinking should be done wherever possible, but let's remember that it's not just a Python issue: Windows XP does not support true symlinks, but only "junctions" aka "reparse points". Of course, that's no reason not to use true symlinks where they *are* available.

From brian.curtin at gmail.com  Tue Oct 25 03:42:33 2011
From: brian.curtin at gmail.com (Brian Curtin)
Date: Mon, 24 Oct 2011 20:42:33 -0500
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
References: <4EA5AC93.2020305@oddbird.net> <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
Message-ID:

On Mon, Oct 24, 2011 at 20:16, Vinay Sajip wrote:
> ----- Original Message -----
>
> > From: Nick Coghlan
> > The repeated references to copying binaries and Python files throughout the PEP are annoying, and need to be justified. Python 3.2+ supports symlinking on Windows Vista and above as well as on *nix systems, so there needs to be a short section somewhere explaining why symlinks are not an adequate lightweight solution (pointing out the fact that creating symlinks on Windows often requires administrator privileges would be sufficient justification for me).
>
> I agree that symlinking should be done wherever possible, but let's remember that it's not just a Python issue: Windows XP does not support true symlinks, but only "junctions" aka "reparse points". Of course, that's no reason not to use true symlinks where they *are* available.
On 3.2+, symlinks on Windows are possible but only in what I suspect is a pretty rare version and environment combination. As said earlier, they work on Vista and beyond, but they require that the process has been elevated in order to obtain the symlink privilege. For example, even as an administrator account on Windows 7, you need to explicitly open up cmd with "Run as Administrator" to properly symlink (through Python or otherwise). If we can't execute the symlink, we raise OSError.

We might as well try symlinks then fall back on OSError.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Tue Oct 25 04:27:46 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Oct 2011 19:27:46 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA5F680.4030003@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info> <4EA5F680.4030003@pearwood.info>
Message-ID:

On Mon, Oct 24, 2011 at 4:36 PM, Steven D'Aprano wrote:
> Guido van Rossum wrote:
>> That is putting words in my mouth. There's no intent to unbless x.__name__, period. Also, there's no intent to bless str(x) as x.__name__ if you want the name. The only intent is to make what gets printed if you print a function, class or module to be less verbose.
>
> I'm sorry about that, I was describing the proposal as best I understood it.

Okay.

> Once this change goes ahead, under what circumstances would you expect people to continue using cls.__name__ (other than for backwards compatibility)?

Whenever they want to know what the name is, use the name to look something up, to compare it to a list of known names, you name it...

> When beginners ask me "how do I get the name of a class?", what answer should I give?

Use .__name__, definitely.

> For example, I have code that does things like this:
>
>     raise TypeError('expected a string but got %s' % type(arg).__name__)
>
> In the future, I expect that should be written like this:
>
>     raise TypeError('expected a string but got %s' % type(arg))

Yes, because here the point is to quickly put some useful info in a message. Likely the code *already* looks like the second form and we've just made it prettier.

> That's all I meant by "unbless". I didn't mean to imply that __name__ would go away, only that it would cease to be the One Obvious Way to get the name. If I'm wrong about this, then I'm genuinely confused and don't understand the motivation for this change.

Depends on what you want to do with it. If you're just showing it to a user, str() is your friend. If you want to compute with it, parse it, etc., use .__name__.

You may think this violates TOOWTDI, but as I've said before, that was a white lie (as well as a cheeky response to Perl's slogan around 2000). Being able to express intent (to human readers) often requires choosing between multiple forms that do essentially the same thing, but look different to the reader.
--
--Guido van Rossum (python.org/~guido)

From greg.ewing at canterbury.ac.nz  Tue Oct 25 07:16:42 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 25 Oct 2011 18:16:42 +1300
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
References: <4EA5AC93.2020305@oddbird.net> <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
Message-ID: <4EA6463A.7070000@canterbury.ac.nz>

Vinay Sajip wrote:

> Windows XP does not support true symlinks, but only "junctions" aka "reparse points".

Out of curiosity, how far do these fall short of being true symlinks? The points I'm aware of are:

* They only work within a volume
* The GUI doesn't know about them, so it's easy to mistake a link to a folder for an independent copy of it and accidentally trash the original

Are there any others?

--
Greg

From cmjohnson.mailinglist at gmail.com  Tue Oct 25 08:50:42 2011
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Mon, 24 Oct 2011 20:50:42 -1000
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To:
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <8739ekpupz.fsf@benfinney.id.au> <4EA3A4D6.3020204@pearwood.info> <4EA5F680.4030003@pearwood.info>
Message-ID: <29121436-85DE-4426-92CE-9A0F167364DE@gmail.com>

On Oct 24, 2011, at 4:27 PM, Guido van Rossum wrote:

> You may think this violates TOOWTDI, but as I've said before, that was a white lie (as well as a cheeky response to Perl's slogan around 2000). Being able to express intent (to human readers) often requires choosing between multiple forms that do essentially the same thing, but look different to the reader.

So, you're saying TMTOWTDI and "different things should look different"? Someone send a letter of surrender to Larry Wall. ;-)

From timothy.c.delaney at gmail.com  Tue Oct 25 10:01:38 2011
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Tue, 25 Oct 2011 19:01:38 +1100
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA6463A.7070000@canterbury.ac.nz>
References: <4EA5AC93.2020305@oddbird.net> <1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com> <4EA6463A.7070000@canterbury.ac.nz>
Message-ID:

On 25 October 2011 16:16, Greg Ewing wrote:

> Vinay Sajip wrote:
>
>> Windows XP does not support true symlinks, but only "junctions" aka "reparse points".
>
> Out of curiosity, how far do these fall short of being true symlinks? The points I'm aware of are:
>
> * They only work within a volume
> * The GUI doesn't know about them, so it's easy to mistake a link to a folder for an independent copy of it and accidentally trash the original
>
> Are there any others?

Junctions are only available for directories, and they're more akin in practice to hardlinks than symlinks (although different to both).

Junctions can quite happily work cross-volume. It's hardlinks that only work within a volume (and you can't make a hardlink to a directory).

Junctions are transparent within a network share, i.e. if you access a share that contains a junction, the junction will be traversed correctly. Symlinks do not work as you would expect when accessed via a network share.

On Win7 and Vista, deleting a junction only deletes the junction. On XP deleting a junction deletes the contents as well unless Link Shell Extension is installed.
Tim Delaney

From ncoghlan at gmail.com  Tue Oct 25 10:11:18 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 25 Oct 2011 18:11:18 +1000
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net>
	<1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
Message-ID: 

On Tue, Oct 25, 2011 at 11:42 AM, Brian Curtin wrote:
> On 3.2+, symlinks on Windows are possible but only in what I suspect is a
> pretty rare version and environment combination. As said earlier, they work
> on Vista and beyond, but they require that the process has been elevated in
> order to obtain the symlink privilege. For example, even as an administrator
> account on Windows 7, you need to explicitly open up cmd with "Run as
> Administrator" to properly symlink (through Python or otherwise).
> If we can't execute the symlink, we raise OSError. We might as well try
> symlinks and then fall back on OSError.

I'd prefer some kind of warning if the symlink fails on a POSIX system.
On Windows, bundling is such an accepted practice that even copying by
default wouldn't bother me.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net  Tue Oct 25 10:11:57 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 25 Oct 2011 10:11:57 +0200
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
Message-ID: <20111025101157.7a02df7f@pitrou.net>

On Tue, 25 Oct 2011 09:46:00 +1000
Nick Coghlan wrote:
> On Tue, Oct 25, 2011 at 8:31 AM, Ethan Furman wrote:
> > Nick Coghlan wrote:
> >>
> >> The repeated references to copying binaries and Python files
> >> throughout the PEP is annoying, and needs to be justified. Python 3.2+
> >> supports symlinking on Windows Vista and above as well as on *nix
> >> systems, so there needs to be a short section somewhere explaining why
> >> symlinks are not an adequate lightweight solution (pointing out the
> >> fact that creating symlinks on Windows often requires administrator
> >> privileges would be sufficient justification for me).
> >
> > Windows Vista?!?  Death first!  ;)
> >
> > My machines are still using XP -- is that sufficient justification for
> > copying?
>
> It's justification for supporting it, not necessarily for doing it
> implicitly (although Windows in general is far more tolerant of
> bundling than the various *nix platforms).

Isn't it enough to share the Python DLL? The small Python executable
can be copied around.

Regards
Antoine.

From p.f.moore at gmail.com  Tue Oct 25 10:25:26 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 25 Oct 2011 09:25:26 +0100
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA6463A.7070000@canterbury.ac.nz>
References: <4EA5AC93.2020305@oddbird.net>
	<1319505392.23141.YahooMailNeo@web25803.mail.ukl.yahoo.com>
	<4EA6463A.7070000@canterbury.ac.nz>
Message-ID: 

On 25 October 2011 06:16, Greg Ewing wrote:
> Vinay Sajip wrote:
>
>> Windows XP does not support true symlinks,
>> but only "junctions" aka "reparse points".
>
> Out of curiosity, how far do these fall short of being
> true symlinks? The points I'm aware of are:
>
> * They only work within a volume
> * The GUI doesn't know about them, so it's easy to mistake
>   a link to a folder for an independent copy of it and
>   accidentally trash the original
>
> Are there any others?
They are very uncommon on Windows in general, so people find them confusing when they encounter them (for example, as a consequence of the GUI issue mentioned above - I know that's happened to me). Is there any reason that hard links can't be used on Windows? (Still not cross-volume, still relatively rarely used, but a bit better known than symlinks). BTW, regardless of whether symlinks, hardlinks or copies end up being used, I'm +1 on the general proposal. Paul. From ncoghlan at gmail.com Tue Oct 25 10:28:01 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 25 Oct 2011 18:28:01 +1000 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <20111025101157.7a02df7f@pitrou.net> References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us> <20111025101157.7a02df7f@pitrou.net> Message-ID: On Tue, Oct 25, 2011 at 6:11 PM, Antoine Pitrou wrote: > On Tue, 25 Oct 2011 09:46:00 +1000 > Nick Coghlan wrote: >> It's justification for supporting it, not necessarily for doing it >> implicitly (although Windows in general is far more tolerant of >> bundling than the various *nix platforms). > > Isn't it enough to share the Python DLL? The small Python executable > can be copied around. Yeah, I realised I don't actually mind if things get copied around on Windows - it's the POSIX systems where implicit copying would bother me, and that goes to the heart of a longstanding difference in packaging philosophy between the two platforms :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From carl at oddbird.net Tue Oct 25 19:37:44 2011 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Oct 2011 11:37:44 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> Message-ID: <4EA6F3E8.60203@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Nick, Thanks for the feedback, replies below. On 10/24/2011 04:11 PM, Nick Coghlan wrote: > The repeated references to copying binaries and Python files > throughout the PEP is annoying, and needs to be justified. Python 3.2+ > supports symlinking on Windows Vista and above as well as on *nix > systems, so there needs to be a short section somewhere explaining why > symlinks are not an adequate lightweight solution (pointing out the > fact that creating symlinks on Windows often requires administrator > privileges would be sufficient justification for me). Do you mean pointing out why symlinks are not in themselves an adequate solution for virtual Python environments, or why they aren't used in this implementation? I've updated the introductory paragraph you quoted to provide a bit more detail on the first question. I think the answer to the latter question is just that we were trying to keep things simple and consistent, and make it work the same way on as wide a variety of platforms as possible (e.g. Windows XP). Also, in earlier discussions on distutils-sig some people considered it a _feature_ to have the virtual environment's Python binary copied, making the virtual environment more isolated from system changes. Obviously, this is an area where programmers and sysadmins often don't see eye to eye ;) The technique in this PEP works just as well with a symlinked binary, though, and I don't see much reason not to provide a symlink option. Whether it is on by default where supported is something that may need more discussion (I don't personally have a strong opinion either way). 
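For illustration, the symlink-with-copy-fallback behaviour that keeps coming
up in this thread might look like the following minimal sketch (the function
name and warning are illustrative assumptions, not the PEP's API; the
warning follows Nick's suggestion upthread):

    import os
    import shutil
    import warnings

    def link_or_copy(src, dst, use_symlinks=True):
        # Prefer a symlink; fall back to copying when the platform or
        # the process's privileges don't allow one.
        if use_symlinks and hasattr(os, 'symlink'):
            try:
                os.symlink(src, dst)
                return
            except OSError:
                # e.g. Windows without the symlink privilege
                warnings.warn('symlink to %s failed; copying instead' % dst)
        shutil.copyfile(src, dst)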
The reason virtualenv copies rather than symlinks the binary has nothing to do with lack of symlink support in Python 2, it's because getpath.c dereferences a symlinked binary in finding sys.prefix, so virtualenv's isolation technique simply doesn't work at all with a symlinked binary. Our version with the config file and Vinay's changes in getpath.c does work with a symlinked binary. >> A virtual environment mechanism integrated with Python and drawing on >> years of experience with existing third-party tools can be lower >> maintenance, more reliable, and more easily available to all Python >> users. > > It can also take advantage of the native symlink support to minimise copying. I don't think this is a significant enough difference to warrant mention here. Existing virtualenv on Python 2 already can and does use symlinks (for the bits of the stdlib it needs) on platforms that have os.symlink. IOW, the details and supported platforms differ, but on both Python 2 and 3 the best you can do is try to use os.symlink where available, and be prepared to fall back to a copy. > Currently, the PEP uses a mish-mash of 'env', 'venv', 'pyvenv' and > 'pythonv' and 'site' to refer to different aspects of the proposed > feature. I suggest standardising on 'venv' wherever the Python > relationship is implied, and 'pyvenv' wherever the Python relationship > needs to be made explicit. Good point. We were already attempting to standardize just as you suggest, but hadn't renamed "env.cfg", and I missed one remaining instance of "pythonv" in the text (it's also used for legacy reasons in the reference implementation bitbucket repo name, but that doesn't seem worth changing). > So I think the name of the configuration file should be "pyvenv.cfg" > (tangent: 'setup' lost its Python connection when it went from > 'setup.py' to 'setup.cfg'. I wish the latter has been called > 'pysetup.cfg' instead) This makes sense to me. I've updated the PEP accordingly and created an issue to remind me to update the reference implementation as well. > 'site' is *way* too overloaded already, let's not make it worse. I > suggest "sys.venv_prefix". My original thinking here was that sys.site_prefix is an attribute that should always exist, and always point to "where stuff should be installed to site-packages", whether or not you are in a venv (if you are not, it would have the same value as sys.prefix). It's a little odd to use an attribute named "sys.venv_prefix" in that way, even if your code doesn't know or care whether its actually in a venv (and in general we should be encouraging code that doesn't know or care). (The attribute doesn't currently always-exist in the reference implementation, but I'd like to change that). I agree that "site" is overloaded, though. Any ideas for a name that doesn't further overload that term, but still communicates "this attribute is a standard part of Python that always has the same meaning whether or not you are currently in a venv"? >> env.cfg >> bin/python3 >> lib/python3.3/site-packages/ > > The builtin virtual environment mechanism should be specified to > symlink things by default, and only copy things if the user > specifically requests it. To be clear, "things" here refers only to the Python binary itself. The only other things that might be installed in a new environment are scripts (e.g. 
pysetup3), and those must be created anew, neither symlinked nor copied, as their shebang line needs to point to the venv's Python (or more complicated chicanery with .exe wrappers on Windows, unless we get PEP 397 in time). > System administrators rightly fear the > proliferation of multiple copies of binaries, since it can cause major > hassles when it comes time to install security updates. I think both options should be allowed, and I don't have a strong feeling about the default. >> Isolation from system site-packages >> - ----------------------------------- >> >> In a virtual environment, the ``site`` module will normally still add >> the system site directories to ``sys.path`` after the virtual >> environment site directories. Thus system-installed packages will >> still be importable, but a package of the same name installed in the >> virtual environment will take precedence. >> >> If the ``env.cfg`` file also contains a key ``include-system-site`` >> with a value of ``false`` (not case sensitive), the ``site`` module >> will omit the system site directories entirely. This allows the >> virtual environment to be entirely isolated from system site-packages. > > "site" is ambiguous here - rather than abbreviating, I suggest making > the option "include-system-site-packages". Ok - updated the draft PEP, will update reference implementation. > Also, why "Scripts" rather than "bin" on Windows? The Python binary > isn't a script. No, but in real-world usage, scripts from installed packages in the virtualenv will be installed there. Putting the Python binary in the same location as the destination for installed scripts is a pretty important convenience, as it means you only need to add a single directory to the beginning of your shell path to effectively "activate" the venv. > I'm actually not seeing the rationale for the obfuscated FHS inspired > layout in the first place - why not dump the binaries adjacent to the > config file, with a simple "site-packages" directory immediately below > that? If there are reasons for a more complex default layout, they > need to be articulated in the PEP. The historical reason is that it emulates the layout found under sys.prefix in a regular Python installation (note that it's not actually Linux-specific, in virtualenv it matches the appropriate platform; i.e. on Windows it's "Lib\" rather than "lib\pythonX.X"). This was necessary for virtualenv because it couldn't make changes in distutils/sysconfig). I think there may be good reason to continue to follow this approach, simply because it makes the necessary changes to distutils/sysconfig less invasive, reducing the need for special-casing of the venv case. But I do need to look into this a bit more and update the PEP with further rationale in any case. Regardless, I would not be in favor of dumping binaries directly next to pyvenv.cfg. It feels cleaner to keep scripts and binaries in a directory specifically named and intended for that purpose, which can be added to the shell PATH. I also think there is some value, all else being roughly equal, in maintaining consistency with virtualenv's layout. This is not an overriding concern, but it will make a big difference in how much existing code that deals with virtual environments has to change. 
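For reference, the platform-specific layout described above can be computed
along these lines (a sketch following virtualenv's conventions, offered as
an assumption about the final layout rather than a specification):

    import os
    import sys

    def venv_layout(env_dir):
        # Mirrors the layout found under sys.prefix on each platform.
        if os.name == 'nt':
            bin_dir = os.path.join(env_dir, 'Scripts')
            site_packages = os.path.join(env_dir, 'Lib', 'site-packages')
        else:
            bin_dir = os.path.join(env_dir, 'bin')
            site_packages = os.path.join(
                env_dir, 'lib',
                'python%d.%d' % sys.version_info[:2],
                'site-packages')
        return bin_dir, site_packages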
> If the problem is wanting to allow cross platform computation of > things like the site-packages directory location and other paths, then > the answer to that seems to lie in better helper methods (whether in > sysconfig, site, venv or elsewhere) rather than Linux specific layouts > inside language level virtual environments. >> The ``venv`` module will contain an ``EnvBuilder`` class which accepts >> the following keyword arguments on instantiation:: >> >> * ``nosite`` - A Boolean value indicating that isolation of the >> environment from the system Python is required (defaults to >> ``False``). > > Yikes, double negatives in APIs are bad news (especially when the > corresponding config file option is expressed positively) > > I suggest this parameter should be declared as "system_site_packages=True". Fair enough, updated in draft PEP. >> * ``clear`` - A Boolean value which, if True, will delete any >> existing target directory instead of raising an exception >> (defaults to ``False``). > > In line with my above comments, I think there should be a third > parameter here declared as "use_symlinks=True". Thanks again for the review! The updated draft is available on Bitbucket [1], and the open issues for the reference implementation (which should reflect outstanding differences from the draft PEP) are as well [2]. Carl [1] https://bitbucket.org/carljm/pythonv-pep/src [2] https://bitbucket.org/vinay.sajip/pythonv/issues?status=new&status=open -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6m8+gACgkQ8W4rlRKtE2f1EgCfZNLBXSI08UQdLCRQMYwxwAp3 ByoAn3cVYvQXWMc1xkoO6mMSmNBQbEAD =FzBA -----END PGP SIGNATURE----- From vinay_sajip at yahoo.co.uk Tue Oct 25 19:45:18 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 25 Oct 2011 18:45:18 +0100 (BST) Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA6F3E8.60203@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: <1319564718.35001.YahooMailNeo@web25803.mail.ukl.yahoo.com> > > The updated draft is available on Bitbucket [1], and the open issues for > the reference implementation (which should reflect outstanding > differences from the draft PEP) are as well [2]. > I have updated the reference implementation to resolve these issues (bar one, which does not relate to the review comments). Regards, Vinay Sajip From p.f.moore at gmail.com Tue Oct 25 20:11:40 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 25 Oct 2011 19:11:40 +0100 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA6F3E8.60203@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: On 25 October 2011 18:37, Carl Meyer wrote: > On 10/24/2011 04:11 PM, Nick Coghlan wrote: [...] >>> ? ?env.cfg >>> ? ?bin/python3 >>> ? ?lib/python3.3/site-packages/ >> >> The builtin virtual environment mechanism should be specified to >> symlink things by default, and only copy things if the user >> specifically requests it. > > To be clear, "things" here refers only to the Python binary itself. The > only other things that might be installed in a new environment are > scripts (e.g. pysetup3), and those must be created anew, neither > symlinked nor copied, as their shebang line needs to point to the venv's > Python (or more complicated chicanery with .exe wrappers on Windows, > unless we get PEP 397 in time). 
One thought - on "per user" Windows installations, or uninstalled Python builds on Windows, IIRC python.exe requires pythonXY.dll to be alongside the EXE. You should probably consider copying/linking that if it's not installed systemwide. It's not entirely obvious to me how you'd detect this, though - maybe take the simple approach of copying python.exe, and also copying pythonXY.dll if it's in the same directory otherwise assume it's a system install. There's also pythonw.exe on Windows, of course... (I'm assuming that w9xpopen.exe is probably not needed - I'm not even sure there's any remaining need to ship it with Python in the first place). Paul. From carl at oddbird.net Tue Oct 25 20:38:44 2011 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Oct 2011 12:38:44 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: <4EA70234.40708@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Paul, On 10/25/2011 12:11 PM, Paul Moore wrote: > One thought - on "per user" Windows installations, or uninstalled > Python builds on Windows, IIRC python.exe requires pythonXY.dll to be > alongside the EXE. You should probably consider copying/linking that > if it's not installed systemwide. It's not entirely obvious to me how > you'd detect this, though - maybe take the simple approach of copying > python.exe, and also copying pythonXY.dll if it's in the same > directory otherwise assume it's a system install. > > There's also pythonw.exe on Windows, of course... (I'm assuming that > w9xpopen.exe is probably not needed - I'm not even sure there's any > remaining need to ship it with Python in the first place). Actually, the reference implementation does symlink/copy all DLLs from the same directory as the python binary or a DLLs/ subdirectory, and also any executable beginning with "python", which would include pythonw.exe. We should certainly update the PEP with more detail about how things look different on Windows, currently it's pretty much describing the POSIX behavior (my fault; Vinay's done all the Windows work and I'm not as familiar with it). Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6nAjQACgkQ8W4rlRKtE2dF4gCfaBpwyQArhArI6ybo9c6Qbjsz mekAn3/LnNHnt3U79ZA6GIyZXxxE4Ja2 =whkG -----END PGP SIGNATURE----- From barry at python.org Tue Oct 25 22:34:31 2011 From: barry at python.org (Barry Warsaw) Date: Tue, 25 Oct 2011 16:34:31 -0400 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA6F3E8.60203@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: <20111025163431.7559b8ec@resist.wooz.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On Oct 25, 2011, at 11:37 AM, Carl Meyer wrote: >Also, in earlier discussions on distutils-sig some people considered it >a _feature_ to have the virtual environment's Python binary copied, >making the virtual environment more isolated from system changes. >Obviously, this is an area where programmers and sysadmins often don't >see eye to eye ;) > >The technique in this PEP works just as well with a symlinked binary, >though, and I don't see much reason not to provide a symlink option. >Whether it is on by default where supported is something that may need >more discussion (I don't personally have a strong opinion either way). 
I definitely think folks will want copy and symlink options, for exactly the reason you and Nick point out. As for what the default should be, I think maximal isolation probably wins, and thus copying should be the default. I wouldn't scream if it were the other way 'round though. >> 'site' is *way* too overloaded already, let's not make it worse. I >> suggest "sys.venv_prefix". > >My original thinking here was that sys.site_prefix is an attribute that >should always exist, and always point to "where stuff should be >installed to site-packages", whether or not you are in a venv (if you >are not, it would have the same value as sys.prefix). It's a little odd >to use an attribute named "sys.venv_prefix" in that way, even if your >code doesn't know or care whether its actually in a venv (and in general >we should be encouraging code that doesn't know or care). (The attribute >doesn't currently always-exist in the reference implementation, but I'd >like to change that). That makes sense to me. It would be nice to have the location of your site-prefix always available for calculations regardless of your "virtualness". Note that in all likelihood, Debian will have to modify that so that it points to dist-packages instead of site-packages. Will there be an easy hook/config file or some such that we could use to make that change? >I agree that "site" is overloaded, though. Any ideas for a name that >doesn't further overload that term, but still communicates "this >attribute is a standard part of Python that always has the same meaning >whether or not you are currently in a venv"? sys.addons_directory, sys.plugins_directory, sys.external_packages or some such? >No, but in real-world usage, scripts from installed packages in the >virtualenv will be installed there. Putting the Python binary in the >same location as the destination for installed scripts is a pretty >important convenience, as it means you only need to add a single >directory to the beginning of your shell path to effectively "activate" >the venv. +1 >> I'm actually not seeing the rationale for the obfuscated FHS inspired >> layout in the first place - why not dump the binaries adjacent to the >> config file, with a simple "site-packages" directory immediately below >> that? If there are reasons for a more complex default layout, they >> need to be articulated in the PEP. > >The historical reason is that it emulates the layout found under >sys.prefix in a regular Python installation (note that it's not actually >Linux-specific, in virtualenv it matches the appropriate platform; i.e. >on Windows it's "Lib\" rather than "lib\pythonX.X"). This was necessary >for virtualenv because it couldn't make changes in distutils/sysconfig). >I think there may be good reason to continue to follow this approach, >simply because it makes the necessary changes to distutils/sysconfig >less invasive, reducing the need for special-casing of the venv case. >But I do need to look into this a bit more and update the PEP with >further rationale in any case. > >Regardless, I would not be in favor of dumping binaries directly next to >pyvenv.cfg. It feels cleaner to keep scripts and binaries in a directory >specifically named and intended for that purpose, which can be added to >the shell PATH. > >I also think there is some value, all else being roughly equal, in >maintaining consistency with virtualenv's layout. This is not an >overriding concern, but it will make a big difference in how much >existing code that deals with virtual environments has to change. 
All of that makes sense to me, so adding these rationales to the PEP
would be useful.

-Barry

From ncoghlan at gmail.com  Wed Oct 26 02:47:52 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Oct 2011 10:47:52 +1000
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA6F3E8.60203@oddbird.net>
References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net>
Message-ID: 

On Wed, Oct 26, 2011 at 3:37 AM, Carl Meyer wrote:
> Also, in earlier discussions on distutils-sig some people considered it
> a _feature_ to have the virtual environment's Python binary copied,
> making the virtual environment more isolated from system changes.
> Obviously, this is an area where programmers and sysadmins often don't
> see eye to eye ;)

I've been persuaded that copying by default is reasonable - so long as
the option to use symlinks is there, sysadmins can put a policy for
their organisation in place that requires its use.

>> 'site' is *way* too overloaded already, let's not make it worse. I
>> suggest "sys.venv_prefix".
>
> My original thinking here was that sys.site_prefix is an attribute that
> should always exist, and always point to "where stuff should be
> installed to site-packages", whether or not you are in a venv (if you
> are not, it would have the same value as sys.prefix). It's a little odd
> to use an attribute named "sys.venv_prefix" in that way, even if your
> code doesn't know or care whether it's actually in a venv (and in general
> we should be encouraging code that doesn't know or care). (The attribute
> doesn't currently always-exist in the reference implementation, but I'd
> like to change that).
>
> I agree that "site" is overloaded, though. Any ideas for a name that
> doesn't further overload that term, but still communicates "this
> attribute is a standard part of Python that always has the same meaning
> whether or not you are currently in a venv"?

I'd actually prefer that we use the explicit "sys.prefix" and
"sys.venv_prefix" naming (with the latter set to None when not in a
virtual env) and possibly offer a convenience API somewhere that hides
the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix" dance.

There's already a site.getsitepackages() API that gets the full list of
site package directories (it's not a given that there's only one).

>> I'm actually not seeing the rationale for the obfuscated FHS inspired
>> layout in the first place - why not dump the binaries adjacent to the
>> config file, with a simple "site-packages" directory immediately below
>> that? If there are reasons for a more complex default layout, they
>> need to be articulated in the PEP.
> > The historical reason is that it emulates the layout found under > sys.prefix in a regular Python installation (note that it's not actually > Linux-specific, in virtualenv it matches the appropriate platform; i.e. > on Windows it's "Lib\" rather than "lib\pythonX.X"). This was necessary > for virtualenv because it couldn't make changes in distutils/sysconfig). > I think there may be good reason to continue to follow this approach, > simply because it makes the necessary changes to distutils/sysconfig > less invasive, reducing the need for special-casing of the venv case. > But I do need to look into this a bit more and update the PEP with > further rationale in any case. > > Regardless, I would not be in favor of dumping binaries directly next to > pyvenv.cfg. It feels cleaner to keep scripts and binaries in a directory > specifically named and intended for that purpose, which can be added to > the shell PATH. > > I also think there is some value, all else being roughly equal, in > maintaining consistency with virtualenv's layout. This is not an > overriding concern, but it will make a big difference in how much > existing code that deals with virtual environments has to change. > >> If the problem is wanting to allow cross platform computation of >> things like the site-packages directory location and other paths, then >> the answer to that seems to lie in better helper methods (whether in >> sysconfig, site, venv or elsewhere) rather than Linux specific layouts >> inside language level virtual environments. As Barry said, the rationale sounds reasonable, but the PEP needs to explain it in terms those of us not fully versed in the details of cross-platform virtual environments can understand (examples of both *nix and Windows layouts with a simple example package installed would probably be useful on that front) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From vinay_sajip at yahoo.co.uk Wed Oct 26 03:15:39 2011 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Wed, 26 Oct 2011 01:15:39 +0000 (UTC) Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: Nick Coghlan writes: > > I'd actually prefer that we use the explicit "sys.prefix" and > "sys.venv_prefix" naming (with the latter set to None when not in a > virtual env) and possibly offer a convenience API somewhere that hides > the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix" > dance. > But why is that better than a site.venv_prefix which points to a venv if you're in one, and == sys.prefix if you're not? Regards, Vinay Sajip From carl at oddbird.net Wed Oct 26 03:16:22 2011 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Oct 2011 19:16:22 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: <4EA75F66.8060903@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Nick, On 10/25/2011 06:47 PM, Nick Coghlan wrote: > I've been persauded that copying by default is reasonable - so long as > the option to use symlinks is there, sysadmins can put a policy for > their organisation in place that requires its use. Sure - the symlink option is now there in both PEP and impl. Vinay actually flipped it on by default for platforms where it's reliable, but if you and Barry are both now leaning towards leaving it off by default everywhere, that's equally fine by me. 
We'll flip it off by default. >>> 'site' is *way* too overloaded already, let's not make it worse. I >>> suggest "sys.venv_prefix". >> >> My original thinking here was that sys.site_prefix is an attribute that >> should always exist, and always point to "where stuff should be >> installed to site-packages", whether or not you are in a venv (if you >> are not, it would have the same value as sys.prefix). It's a little odd >> to use an attribute named "sys.venv_prefix" in that way, even if your >> code doesn't know or care whether its actually in a venv (and in general >> we should be encouraging code that doesn't know or care). (The attribute >> doesn't currently always-exist in the reference implementation, but I'd >> like to change that). >> >> I agree that "site" is overloaded, though. Any ideas for a name that >> doesn't further overload that term, but still communicates "this >> attribute is a standard part of Python that always has the same meaning >> whether or not you are currently in a venv"? > > I'd actually prefer that we use the explicit "sys.prefix" and > "sys.venv_prefix" naming (with the latter set to None when not in a > virtual env) and possibly offer a convenience API somewhere that hides > the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix" > dance. > > There's already a site.getsitepackage() API that gets the full list of > site package directories (it's not a given that there's only one). This is true. I guess it mostly boils down to making the code in sysconfig less ugly; any external tool should be using sysconfig/site APIs anyway, not doing custom stuff with sys.*_prefix attributes to find site-packages. But a helper function can take care of that ugliness in sysconfig, so I'm ok with your approach to the sys attributes. (Though we still have to pick a name to refer to "sys.venv_prefix if sys.venv_prefix else sys.prefix", it just pushes the naming question onto the helper function :P). >>> I'm actually not seeing the rationale for the obfuscated FHS inspired >>> layout in the first place - why not dump the binaries adjacent to the >>> config file, with a simple "site-packages" directory immediately below >>> that? If there are reasons for a more complex default layout, they >>> need to be articulated in the PEP. >> >> The historical reason is that it emulates the layout found under >> sys.prefix in a regular Python installation (note that it's not actually >> Linux-specific, in virtualenv it matches the appropriate platform; i.e. >> on Windows it's "Lib\" rather than "lib\pythonX.X"). This was necessary >> for virtualenv because it couldn't make changes in distutils/sysconfig). >> I think there may be good reason to continue to follow this approach, >> simply because it makes the necessary changes to distutils/sysconfig >> less invasive, reducing the need for special-casing of the venv case. >> But I do need to look into this a bit more and update the PEP with >> further rationale in any case. >> >> Regardless, I would not be in favor of dumping binaries directly next to >> pyvenv.cfg. It feels cleaner to keep scripts and binaries in a directory >> specifically named and intended for that purpose, which can be added to >> the shell PATH. >> >> I also think there is some value, all else being roughly equal, in >> maintaining consistency with virtualenv's layout. This is not an >> overriding concern, but it will make a big difference in how much >> existing code that deals with virtual environments has to change. 
>> >>> If the problem is wanting to allow cross platform computation of >>> things like the site-packages directory location and other paths, then >>> the answer to that seems to lie in better helper methods (whether in >>> sysconfig, site, venv or elsewhere) rather than Linux specific layouts >>> inside language level virtual environments. > > As Barry said, the rationale sounds reasonable, but the PEP needs to > explain it in terms those of us not fully versed in the details of > cross-platform virtual environments can understand (examples of both > *nix and Windows layouts with a simple example package installed would > probably be useful on that front) Yep, adding more Windows examples and generally a fuller rationale for the layout to the PEP are next on my list. Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6nX2YACgkQ8W4rlRKtE2dCWgCePm8irdm52Mz2klfBJBY/DwCt 0bEAn32tOMwjYQmrwZiDEtv2YSkE82Ff =WKEK -----END PGP SIGNATURE----- From carl at oddbird.net Wed Oct 26 03:21:53 2011 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Oct 2011 19:21:53 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <20111025163431.7559b8ec@resist.wooz.org> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <20111025163431.7559b8ec@resist.wooz.org> Message-ID: <4EA760B1.30900@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Barry, Thanks for the review and feedback. On 10/25/2011 02:34 PM, Barry Warsaw wrote: >> My original thinking here was that sys.site_prefix is an attribute that >> should always exist, and always point to "where stuff should be >> installed to site-packages", whether or not you are in a venv (if you >> are not, it would have the same value as sys.prefix). It's a little odd >> to use an attribute named "sys.venv_prefix" in that way, even if your >> code doesn't know or care whether its actually in a venv (and in general >> we should be encouraging code that doesn't know or care). (The attribute >> doesn't currently always-exist in the reference implementation, but I'd >> like to change that). > > That makes sense to me. It would be nice to have the location of your > site-prefix always available for calculations regardless of your > "virtualness". Note that in all likelihood, Debian will have to modify that > so that it points to dist-packages instead of site-packages. Will there be an > easy hook/config file or some such that we could use to make that change? As it currently stands, this is really just the/a _prefix_ under which site-packages is found (parallel to how sys.prefix is currently used). Which means "site-packages" vs "dist-packages" is still determined in site.py, which tacks platform-specific paths onto the end of the prefix in order to find the actual site-packages directory. So in the Debian case, you'd still have to modify your distributed site.py (and take a bit of care, assisted by the tests, that you don't break venv.py in the process); I don't think this PEP really changes that much. 
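Roughly, that prefix-to-site-packages derivation looks like this (a
simplified sketch of site.py's logic, not its exact code; a distro like
Debian would patch the directory name in one such spot):

    import os
    import sys

    def site_packages_from(prefix):
        # POSIX layout embeds the Python version; Windows does not.
        if os.sep == '/':
            return os.path.join(prefix, 'lib',
                                'python%d.%d' % sys.version_info[:2],
                                'site-packages')   # Debian: 'dist-packages'
        else:
            return os.path.join(prefix, 'Lib', 'site-packages')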
(Though the fact that not breaking virtual environments when you make
distro-specific modifications to site.py becomes your responsibility as
the distro Python maintainer rather than mine as virtualenv maintainer
is, in my humble opinion, one of the greatest advantages of this PEP.)

Carl

From tjreedy at udel.edu  Wed Oct 26 03:26:07 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 25 Oct 2011 21:26:07 -0400
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
	<20111025101157.7a02df7f@pitrou.net>
Message-ID: 

On 10/25/2011 4:28 AM, Nick Coghlan wrote:

> Yeah, I realised I don't actually mind if things get copied around on
> Windows - it's the POSIX systems where implicit copying would bother
> me, and that goes to the heart of a longstanding difference in
> packaging philosophy between the two platforms :)

I have a different issue with copying. I have a new Win7 system with a
solid state disk for programs (that usually sit unchanged months or
years at a time). It is nice and fast, but has a finite write-cycle
lifetime. So unnecessary copies are not nice. I presume venv could be
told to copy onto the data disk, but then everything runs slower.

--
Terry Jan Reedy

From barry at python.org  Wed Oct 26 03:48:41 2011
From: barry at python.org (Barry Warsaw)
Date: Tue, 25 Oct 2011 21:48:41 -0400
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net>
	<20111025163431.7559b8ec@resist.wooz.org> <4EA760B1.30900@oddbird.net>
Message-ID: <20111025214841.377699dc@resist.wooz.org>

On Oct 25, 2011, at 07:21 PM, Carl Meyer wrote:

>As it currently stands, this is really just the/a _prefix_ under which
>site-packages is found (parallel to how sys.prefix is currently used).
>Which means "site-packages" vs "dist-packages" is still determined in
>site.py, which tacks platform-specific paths onto the end of the prefix
>in order to find the actual site-packages directory. So in the Debian
>case, you'd still have to modify your distributed site.py (and take a
>bit of care, assisted by the tests, that you don't break venv.py in the
>process); I don't think this PEP really changes that much.
>
>(Though the fact that not breaking virtual environments when you make
>distro-specific modifications to site.py becomes your responsibility as
>the distro Python maintainer rather than mine as virtualenv maintainer
>is, in my humble opinion, one of the greatest advantages of this PEP.)

Yes, fair enough!
:)

-Barry

From barry at python.org  Wed Oct 26 03:51:39 2011
From: barry at python.org (Barry Warsaw)
Date: Tue, 25 Oct 2011 21:51:39 -0400
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net>
Message-ID: <20111025215139.4906418a@resist.wooz.org>

On Oct 26, 2011, at 01:15 AM, Vinay Sajip wrote:

>Nick Coghlan writes:
>>
>> I'd actually prefer that we use the explicit "sys.prefix" and
>> "sys.venv_prefix" naming (with the latter set to None when not in a
>> virtual env) and possibly offer a convenience API somewhere that hides
>> the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix"
>> dance.
>
>But why is that better than a site.venv_prefix which points to a venv if
>you're in one, and == sys.prefix if you're not?

I'm not sure why either, but I prefer the original suggestion, as Vinay
restates it here.

-Barry

From ncoghlan at gmail.com  Wed Oct 26 04:05:31 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 26 Oct 2011 12:05:31 +1000
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <20111025215139.4906418a@resist.wooz.org>
References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net>
	<20111025215139.4906418a@resist.wooz.org>
Message-ID: 

On Wed, Oct 26, 2011 at 11:51 AM, Barry Warsaw wrote:
> On Oct 26, 2011, at 01:15 AM, Vinay Sajip wrote:
>>Nick Coghlan writes:
>>>> I'd actually prefer that we use the explicit "sys.prefix" and
>>>> "sys.venv_prefix" naming (with the latter set to None when not in a
>>>> virtual env) and possibly offer a convenience API somewhere that hides
>>>> the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix"
>>>> dance.
>>
>>But why is that better than a site.venv_prefix which points to a venv if
>>you're in one, and == sys.prefix if you're not?
>
> I'm not sure why either, but I prefer the original suggestion, as Vinay
> restates it here.

Yeah, having venv_prefix == prefix in the "not in a virtual env" case
is fine by me as well. I think Carl's right that it reads a little
oddly sometimes, but it's a better option than:
- further overloading "site" (when more than just site-packages may be
  located using the venv prefix)
- requiring people to fall back to sys.prefix explicitly

The "am I in a virtual env?" check can then just be "if sys.prefix !=
sys.venv_prefix".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From carl at oddbird.net Wed Oct 26 04:14:11 2011 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Oct 2011 20:14:11 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <20111025215139.4906418a@resist.wooz.org> Message-ID: <4EA76CF3.1090900@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/25/2011 08:05 PM, Nick Coghlan wrote: > On Wed, Oct 26, 2011 at 11:51 AM, Barry Warsaw wrote: >> On Oct 26, 2011, at 01:15 AM, Vinay Sajip wrote: >>> Nick Coghlan writes: >>>> I'd actually prefer that we use the explicit "sys.prefix" and >>>> "sys.venv_prefix" naming (with the latter set to None when not in a >>>> virtual env) and possibly offer a convenience API somewhere that hides >>>> the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix" >>>> dance. >>> >>> But why is that better than a site.venv_prefix which points to a venv if >>> you're in one, and == sys.prefix if you're not? >> >> I'm not sure why either , but I prefer the original suggestion, as Vinay >> restates it here. > > Yeah, having venv_prefix == prefix in the "not in a virtual env case" > is fine by me as well. I think Carl's right that it reads a little > oddly sometimes, but it's a better option than: > - further overloading "site" (when more than site-package may be > located using the venv prefix) > - requiring people to fall back to sys.prefix explicitly What about "sys.local_prefix"? Doesn't overload site, but also doesn't imply that it always points to a venv. I think "local" carries the right connotation here - this is the prefix for the user's local environment (as possibly opposed to the "global" system environment). Seems like this might make code using the attribute unconditionally read a bit less oddly? > The "am I in a virtual env?" check can then just be "if sys.prefix == > sys.venv_prefix". Right; though in general the goal is that by simply switching to use the new prefix in the right places in sysconfig, such an explicit "am in a venv" check should never be necessary, even in the stdlib. I think the reference implementation largely achieves this goal, I'd have to check again to see if that's entirely true. Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6nbPMACgkQ8W4rlRKtE2dbrQCg3iwUkXZUBHerzyBFq+jOWS91 pgkAn2hYy7pREwdFBQQayR2OLP9x62e9 =tx9A -----END PGP SIGNATURE----- From ncoghlan at gmail.com Wed Oct 26 04:26:00 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Oct 2011 12:26:00 +1000 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA76CF3.1090900@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <20111025215139.4906418a@resist.wooz.org> <4EA76CF3.1090900@oddbird.net> Message-ID: On Wed, Oct 26, 2011 at 12:14 PM, Carl Meyer wrote: > What about "sys.local_prefix"? Doesn't overload site, but also doesn't > imply that it always points to a venv. I think "local" carries the right > connotation here - this is the prefix for the user's local environment > (as possibly opposed to the "global" system environment). Seems like > this might make code using the attribute unconditionally read a bit less > oddly? 
I think explaining to people "sys.venv_prefix is still valid when
you're not in a virtual env, it just points to the same place as
sys.prefix" is easier than explaining a *new* name with "<new name>
points to the virtual env directory when you're in a virtual
environment and sys.prefix otherwise".

So while I agree your concern is valid, I think just living with the
quirkiness is a reasonable approach.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From wuwei23 at gmail.com  Wed Oct 26 05:24:51 2011
From: wuwei23 at gmail.com (alex23)
Date: Tue, 25 Oct 2011 20:24:51 -0700 (PDT)
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
	<20111025101157.7a02df7f@pitrou.net>
Message-ID: 

On Oct 26, 11:26 am, Terry Reedy wrote:
> I have a different issue with copying. I have a new Win7 system with a
> solid state disk for programs (that usually sit unchanged months or
> years at a time). It is nice and fast, but has a finite write-cycle
> lifetime.

As you said, though, you're using it for primarily static programs. If
you're concerned with its lifetime, would you really use it for
development?
The onus that SSD aficionados place on others to not use drives as
writeable data stores seems absolutely crazy to me.

From tjreedy at udel.edu  Wed Oct 26 06:50:03 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 26 Oct 2011 00:50:03 -0400
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
	<20111025101157.7a02df7f@pitrou.net>
Message-ID: 

On 10/25/2011 11:24 PM, alex23 wrote:
> On Oct 26, 11:26 am, Terry Reedy wrote:
>> I have a different issue with copying. I have a new Win7 system with a
>> solid state disk for programs (that usually sit unchanged months or
>> years at a time). It is nice and fast, but has a finite write-cycle
>> lifetime.
>
> As you said, though, you're using it for primarily static programs. If
> you're concerned with its lifetime, would you really use it for
> development?

I don't. I put a (static) .pth file in site-packages that points to my
development directory on the hard disk, where I happily edit and test
sometimes several times an hour.

> The onus that SSD aficionados place on others to not use drives as
> writeable data stores seems absolutely crazy to me.

I said a) I would prefer to not have an unnecessary duplication but b)
if a copy is to be made, I would like to have a choice of which drive to
use. That does not strike me as 'absolutely crazy'.

--
Terry Jan Reedy

From timothy.c.delaney at gmail.com  Wed Oct 26 06:54:38 2011
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Wed, 26 Oct 2011 15:54:38 +1100
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
	<20111025101157.7a02df7f@pitrou.net>
Message-ID: 

On 26 October 2011 14:24, alex23 wrote:

> On Oct 26, 11:26 am, Terry Reedy wrote:
> > I have a different issue with copying. I have a new Win7 system with a
> > solid state disk for programs (that usually sit unchanged months or
> > years at a time). It is nice and fast, but has a finite write-cycle
> > lifetime.
>
> As you said, though, you're using it for primarily static programs. If
> you're concerned with its lifetime, would you really use it for
> development?
>
> The onus that SSD aficionados place on others to not use drives as
> writeable data stores seems absolutely crazy to me.

As such an aficionado, I agree that trying not to use it as a writeable
data store seems crazy. All current SSDs have a sufficient number of
write-erase cycles that you would need to be writing to the entire drive
non-stop for years before it became an issue.

If you're worried, get more RAM and configure a RAM disk as your
temporary store. If nothing else, it speeds up compiles significantly ...

Tim Delaney

From carl at oddbird.net  Wed Oct 26 07:26:00 2011
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Oct 2011 23:26:00 -0600
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: 
References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us>
	<20111025101157.7a02df7f@pitrou.net>
Message-ID: <4EA799E8.30101@oddbird.net>

Hi Terry,

On 10/25/2011 10:50 PM, Terry Reedy wrote:
> I said a) I would prefer to not have an unnecessary duplication but b)
> if a copy is to be made, I would like to have a choice of which drive to
> use.

As things stand with the PEP, you can create virtualenvs wherever you
like, and you can optionally use symlinks if you prefer.

Carl

From vinay_sajip at yahoo.co.uk  Wed Oct 26 11:02:52 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Wed, 26 Oct 2011 09:02:52 +0000 (UTC)
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net>
	<20111025215139.4906418a@resist.wooz.org> <4EA76CF3.1090900@oddbird.net>
Message-ID: 

Nick Coghlan writes:

> I think explaining to people "sys.venv_prefix is still valid when
> you're not in a virtual env, it just points to the same place as
> sys.prefix" is easier than explaining a *new* name with "<new name>
> points to the virtual env directory when you're in a virtual
> environment and sys.prefix otherwise".

I thought Carl was saying that we use "local_prefix" as opposed to
"venv_prefix", because the "local_" prefix seems logical and doesn't
override "site". In a venv, local_prefix/local_exec_prefix point to the
venv, otherwise they have the same values as prefix/exec_prefix. The
"venv_XXX" naming does grate a bit, especially as we're trying to achieve
a don't-know-or-care-if-I'm-in-a-venv approach as much as possible.

And thinking more about overriding "site" - what is a "site" anyway? It
seems to be a combination of Python version, standard library and a
specific set of packages (comprising in the general case include files,
shared libs and extension modules) - if I installed 3 different versions
of Python on my system, I would have three different sites. By that
definition, a venv is really a site, albeit a pared-down one which
references shared code and libraries where possible.

So site_prefix/site_exec_prefix don't seem unreasonable to me, or failing
that, local_prefix/local_exec_prefix would be preferable to
venv_prefix/venv_exec_prefix.
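Since the thread keeps circling the "sys.prefix if sys.venv_prefix is None
else sys.venv_prefix" dance, a sketch of the convenience helper being
discussed (the attribute and function names are the thread's proposals,
not a shipped API):

    import sys

    def active_prefix():
        # The environment's prefix when running inside a venv,
        # the regular installation prefix otherwise.
        venv_prefix = getattr(sys, 'venv_prefix', None)
        return venv_prefix if venv_prefix is not None else sys.prefix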
Regards, Vinay Sajip From zart at zartsoft.ru Wed Oct 26 15:36:50 2011 From: zart at zartsoft.ru (Konstantin Zemlyak) Date: Wed, 26 Oct 2011 19:36:50 +0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA5AC93.2020305@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> Message-ID: <4EA80CF2.9040008@zartsoft.ru> Carl Meyer wrote: > Vinay Sajip and I are working on a PEP for making "virtual Python > environments" a la virtualenv [1] a built-in feature of Python 3.3. > This idea was first proposed on python-dev by Ian Bicking in February > 2010 [2]. It was revived at PyCon 2011 and has seen discussion on > distutils-sig [3] and more recently again on python-dev [4] [5]. This will be a very nice feature to have indeed. I've got some questions after reading the draft: 1) There is no mention whatsoever about user site-packages directories from PEP-370. Should they be supported inside venvs? Or should they be considered part of system-wide python? 2) virtualenv keeps using platform-specific layout inside venv. So on POSIX for example this allows to install different python versions and implementations (like cpython and pypy, for example) into the very same venv. OTOH on Windows and in Jython there is only \Lib per venv which makes this sharing impossible. Should venvs support such use case? If so, how shebangs should be handled? What layout to endorse? 3) This might be not relevant to this PEP but I wonder how would implementing this proposal affect other implementations like Jython, PyPy and IronPython. Will they be able to implement this functionality the same way? -- Konstantin Zemlyak From barry at python.org Wed Oct 26 17:32:16 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 26 Oct 2011 11:32:16 -0400 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <20111025215139.4906418a@resist.wooz.org> <4EA76CF3.1090900@oddbird.net> Message-ID: <20111026113216.3c4998ca@resist.wooz.org> On Oct 26, 2011, at 09:02 AM, Vinay Sajip wrote: >I thought Carl was saying that we use "local_prefix" as opposed to >"venv_prefix", because the "local_" prefix seems logical and doesn't override >"site". In a venv, local_prefix/local_exec_prefix point to the venv, otherwise >they have the same values as prefix/exec_prefix. The "venv_XXX" does grate a >bit, especially as we're trying to achieve a don't-know-or-care-if- >I'm-in-a-venv as much as possible. > >And thinking more about overriding "site" - what is a "site" anyway? It seems >to be a combination of Python version, standard library and a specific set of >packages (comprising in the general case include files, shared libs and >extension modules) - if I installed 3 different versions of Python on my >system, I would have three different sites. By that definition, a venv is >really a site, albeit a pared-down one which references shared code and >library where possible. > >So site_prefix/site_exec_prefix don't seem unreasonable to me, or failing >that, local_prefix/local_exec_prefix would be preferable to >venv_prefix/venv_exec_prefix. I agree in general with these observations. What we're currently calling "site" feels more like "system" to me, although we probably can't change that until Python 4000. One other reason I like "local" over "venv" is that "venv" isn't actually a word. While it'll probably make it easier to google, it looks fairly jarring to me. 
In general I try to avoid made up words and abbreviations because they are harder for non-native English speakers to use (so I've been told by some non-native English speakers). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From jimjjewett at gmail.com Wed Oct 26 18:04:02 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 26 Oct 2011 12:04:02 -0400 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: On Tue, Oct 25, 2011 at 9:15 PM, Vinay Sajip wrote: > Nick Coghlan writes: >> I'd actually prefer that we use the explicit "sys.prefix" and >> "sys.venv_prefix" naming (with the latter set to None when not in a >> virtual env) and possibly offer a convenience API somewhere that hides >> the "sys.prefix if sys.venv_prefix is None else sys.venv_prefix" >> dance. > But why is that better than a site.venv_prefix which points to a venv if you're > in one, and == sys.prefix if you're not? Is there a reason we can't just make sys.* be the virtual environment's version, and sys.base.* be the site-wide version? You could still find sys.base.*, and you could still check whether sys.prefix == sys.base.prefix, but programs that don't get updated will use the sandboxed virtual version by default. -jJ From carl at oddbird.net Wed Oct 26 19:06:38 2011 From: carl at oddbird.net (Carl Meyer) Date: Wed, 26 Oct 2011 11:06:38 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> Message-ID: <4EA83E1E.6020902@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Jim, On 10/26/2011 10:04 AM, Jim Jewett wrote: > On Tue, Oct 25, 2011 at 9:15 PM, Vinay Sajip wrote: >> Nick Coghlan writes: > Is there a reason we can't just make sys.* be the virtual > environment's version, and sys.base.* be the site-wide version? You > could still find sys.base.*, and you could still check whether > sys.prefix == sys.base.prefix, but programs that don't get updated > will use the sandboxed virtual version by default. Yes, this is a reasonable alternative that Vinay and I discussed at some length earlier in development. There are reasonable arguments to be made in both directions - you can read my summary of those arguments in the "Open Questions" section of the PEP. Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6oPh4ACgkQ8W4rlRKtE2ftFwCgmmTHdduy6C4f8fihMs2Q0/Nw E18AoMAExZGageDgI+1BzikBsrzQCYd+ =+bJn -----END PGP SIGNATURE----- From carl at oddbird.net Wed Oct 26 19:48:20 2011 From: carl at oddbird.net (Carl Meyer) Date: Wed, 26 Oct 2011 11:48:20 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA80CF2.9040008@zartsoft.ru> References: <4EA5AC93.2020305@oddbird.net> <4EA80CF2.9040008@zartsoft.ru> Message-ID: <4EA847E4.8080802@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Thanks Konstantin, great questions. On 10/26/2011 07:36 AM, Konstantin Zemlyak wrote: > 1) There is no mention whatsoever about user site-packages directories > from PEP-370. > > Should they be supported inside venvs? Or should they be considered part > of system-wide python? 
Currently virtualenv considers user-site-packages part of the global environment; they are not used if you supply --no-site-packages, and are available otherwise. I think this is the right approach: an isolated venv should really be isolated, and it would be odd for venv to provide a mode where user-site-packages is available but global-site-packages are not, since PEP 370 itself has no such provision. I'll add an explicit section on this to the PEP, and verify that the reference implementation handles it correctly. I suppose a case might be made for a mode that makes global-site-packages available but user-site-packages not? My inclination is that this is not necessary, but I will add it as an open question in the PEP, pending further feedback. > 2) virtualenv keeps using platform-specific layout inside venv. So on > POSIX for example this allows to install different python versions and > implementations (like cpython and pypy, for example) into the very same > venv. OTOH on Windows and in Jython there is only \Lib per venv > which makes this sharing impossible. > > Should venvs support such use case? If so, how shebangs should be > handled? What layout to endorse? I don't think venv should go out of its way to support the case of multiple interpreters in a single venv to any greater degree than Python supports multiple interpreters under a single system prefix. In other words, I'm perfectly happy to let this behavior vary by platform, in the same way it already varies at the system level, and not consider it a supported use case. Virtualenv makes no effort to support sharing an env among multiple interpreters, and this has not been a requested feature. Note that "sharing a venv" in the POSIX layout has very few advantages over having a separate venv per interpreter/version, and some significant disadvantages. You don't gain any disk space savings, and you lose a great deal of explicitness, particularly in that installed scripts will generally have a shebang line pointing to whichever interpreter you happened to install them with, possibly overwriting the same script previously installed by a different interpreter. If people on POSIX systems want to do this, that's fine, but I don't think it should be documented or encouraged. > 3) This might be not relevant to this PEP but I wonder how would > implementing this proposal affect other implementations like Jython, > PyPy and IronPython. Will they be able to implement this functionality > the same way? Good question. Virtualenv already supports PyPy and I'm fairly confident PyPy will not have trouble with this PEP. I don't know about IronPython or Jython - as long as they follow the same algorithm for finding sys.prefix based on sys.executable, it should work the same. I will add an "open question" on this topic to the PEP, and hopefully we can get some feedback from the right people when we take this to python-dev. Thanks for the review! 
Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6oR+QACgkQ8W4rlRKtE2ddegCgxfLsZNtNFsFNxA/bFkBdgBVQ DyYAoJ9Grvu7d4g+EOBKhyHE0l8qwq0x =dfqF -----END PGP SIGNATURE----- From carl at oddbird.net Wed Oct 26 19:57:58 2011 From: carl at oddbird.net (Carl Meyer) Date: Wed, 26 Oct 2011 11:57:58 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA5AC93.2020305@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> Message-ID: <4EA84A26.9040006@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/24/2011 12:21 PM, Carl Meyer wrote: > If the ``env.cfg`` file also contains a key ``include-system-site`` > with a value of ``false`` (not case sensitive), the ``site`` module > will omit the system site directories entirely. This allows the > virtual environment to be entirely isolated from system site-packages. I'd like to flip this default and make isolation be the default, with a flag to enable the system site-packages. The non-isolated default I borrowed from virtualenv without a great deal of thought, but we have had requests to flip virtualenv's default, and it seems that many current active virtualenv users (100% of those on the mailing list who have replied) are in favor [1]. It's looking likely that we will make that change in virtualenv; I think venv should do the same. The primary use case for venv is isolation, so why not make it the default? Carl [1] http://groups.google.com/group/python-virtualenv/browse_frm/thread/57e11efac3743149 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6oSiYACgkQ8W4rlRKtE2cR6QCeN0Hkuqgjr9kq1HcBp/hl8i26 bYoAn3RorDvJukjcm8OomdjaThUfzXbo =RGNB -----END PGP SIGNATURE----- From carl at oddbird.net Wed Oct 26 20:12:19 2011 From: carl at oddbird.net (Carl Meyer) Date: Wed, 26 Oct 2011 12:12:19 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <20111025215139.4906418a@resist.wooz.org> <4EA76CF3.1090900@oddbird.net> Message-ID: <4EA84D83.7010903@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi Nick, On 10/25/2011 08:26 PM, Nick Coghlan wrote: > I think explaining to people "sys.venv_prefix is still valid when > you're not in a virtual env, it just points to the same place as > sys.prefix" is easier than explaining a *new* name with " name> points to the virtual env directory when you're in a virtual > environment and sys.prefix otherwise". > > So while I agree your concern is valid, I think just living with the > quirkiness is a reasonable approach. FWIW, I agree with Vinay and Barry here. It's not something I care deeply about (since I think direct use of these sys attributes should be discouraged outside the stdlib anyway), but I don't agree that a misnamed attribute is easier to explain than a new name. 
Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6oTYMACgkQ8W4rlRKtE2dXbgCggLUDFr1cqRrOTlvPH9PoC0y5 Y7QAoMyQtRh22P2ly7TmrS3/SlQncFFd =swWE -----END PGP SIGNATURE----- From jimjjewett at gmail.com Wed Oct 26 20:28:10 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 26 Oct 2011 14:28:10 -0400 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: <4EA83E1E.6020902@oddbird.net> References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <4EA83E1E.6020902@oddbird.net> Message-ID: On Wed, Oct 26, 2011 at 1:06 PM, Carl Meyer wrote: > On 10/26/2011 10:04 AM, Jim Jewett wrote: >> Is there a reason we can't just make sys.* be the virtual >> environment's version, and sys.base.* be the site-wide version? ?You >> could still find sys.base.*, and you could still check whether >> sys.prefix == sys.base.prefix, but programs that don't get updated >> will use the sandboxed virtual version by default. > Yes, this is a reasonable alternative that Vinay and I discussed at some > length earlier in development. There are reasonable arguments to be made > in both directions - you can read my summary of those arguments in the > "Open Questions" section of the PEP. Even after re-reading, I'm not sure I fully understand. As best I can tell, the question is whether sys.prefix should point to the base system (because of header files?) or the virtual sandbox (because of user site packages?) Will header files and such be explicitly hidden from applications running in the virtual environment? It sounded like the answer was "depends on the flag value". But it seems to me that: (a) If the base system's data (such as headers) are explicitly hidden, then there isn't much good reason to then say "oh, by the way, the secret is hidden over here". (b) If they are not explicitly hidden, then they (or more-correct overrides) should also be available from the virtual environment (whether by copying or by symlink, or even by symlink to the whole directory). So I'm still not seeing any situation where a normal program should care about the base value, but I am seeing plenty of situations where it might erroneously ask for that value and test (or install) something outside the virtual environment as a result. -jJ From carl at oddbird.net Wed Oct 26 20:56:21 2011 From: carl at oddbird.net (Carl Meyer) Date: Wed, 26 Oct 2011 12:56:21 -0600 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib In-Reply-To: References: <4EA5AC93.2020305@oddbird.net> <4EA6F3E8.60203@oddbird.net> <4EA83E1E.6020902@oddbird.net> Message-ID: <4EA857D5.4090401@oddbird.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/26/2011 12:28 PM, Jim Jewett wrote: > On Wed, Oct 26, 2011 at 1:06 PM, Carl Meyer wrote: >> On 10/26/2011 10:04 AM, Jim Jewett wrote: > As best I can tell, the question is whether sys.prefix should point to > the base system (because of header files?) or the virtual sandbox > (because of user site packages?) The standard library is also located always in the base system only. > Will header files and such be explicitly hidden from applications > running in the virtual environment? It sounded like the answer was > "depends on the flag value". But it seems to me that: > > (a) If the base system's data (such as headers) are explicitly hidden, > then there isn't much good reason to then say "oh, by the way, the > secret is hidden over here". 
> > (b) If they are not explicitly hidden, then they (or more-correct > overrides) > should also be available from the virtual environment (whether by copying > or by symlink, or even by symlink to the whole directory). This dichotomy assumes an even more comprehensive form of isolation (basically that which is provided already by PYTHONHOME). That level of isolation is problematic precisely because it requires either copying or symlinking the entire standard library into every environment (essentially duplicating the base Python installation for every environment, perhaps using symlinks to gain some efficiency). This is an issue because: 1) Not all supported platforms have good support for symlinks, and copying the entire standard library is quite heavyweight if you'll be creating a lot of environments. 2) site-packages is contained within the standard library directory, and you presumably want to override site-packages, so you can't just make a single "lib" symlink to get the stdlib, you have to individually symlink every single module. If you aren't bothered by these caveats, then you can just use PYTHONHOME today, and you have no need for this PEP. The goal of this PEP is to provide site-packages isolation without standard-library isolation (and its attendant downsides), by allowing knowledge of the base system environment (enough to use its standard library) even when Python is running within the virtual environment. The Google code search linked in the PEP demonstrates that there is code out there using sys.prefix to find the standard library (for instance, to exclude it from tracing). This code will no longer function if sys.prefix no longer points to the location where the standard library is actually found. It's possible that there is more code out there using sys.prefix to find site-packages. However, the documented definition of sys.prefix never mentions site-packages, whereas it does mention the standard library and header files. So if we have to choose which code to break, I see a strong argument for adhering to the documented definition rather than breaking it. (I could fairly easily be convinced, however, that practicality beats purity in this case, especially since changing sys.prefix errs in the direction of greater isolation.) (It's worth noting that really no code outside the stdlib should be using any of these sys prefix attributes directly, it should instead use the appropriate APIs in site/sysconfig/distutils). > So I'm still not seeing any situation where a normal program should > care about the base value, but I am seeing plenty of situations where > it might erroneously ask for that value and test (or install) > something outside the virtual environment as a result. The case you are missing is "any code which cares where the standard library or header files are located and is using sys.prefix to figure this out," because it is a design goal of this PEP to not have to provide indirection for those things in the virtual environment layout. 
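The pattern in question looks roughly like this - a sketch only, assuming a typical installed layout; real tools should prefer sysconfig over hand-built paths:

    import os
    import sysconfig

    # Tracing/coverage tools often skip standard-library frames by path
    # prefix. sysconfig derives this location from sys.prefix, so the
    # check only keeps working if sys.prefix still points at the
    # installation that actually contains the stdlib.
    STDLIB_DIR = sysconfig.get_path('stdlib')

    def in_stdlib(filename):
        return os.path.abspath(filename).startswith(STDLIB_DIR)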
Carl -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk6oV9UACgkQ8W4rlRKtE2eybQCgtEoLppBSj37rtnkqJ0aAkbq1 rjcAoJruTYLiw4fk/Pf1a68OpAnKhqzl =MuWe -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Thu Oct 27 04:09:42 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2011 15:09:42 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics Message-ID: <4EA8BD66.6010807@canterbury.ac.nz> Still feeling in a peppish mood after the last round of PEP 335, I decided to revisit my cofunctions proposal. Last time round, I allowed myself to be excessively swayed by popular opinion, and ended up with something that lacked elegance and failed to address the original issues. So, I've gone right back to my initial idea. The current version of the draft PEP can be found here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev5.txt together with updated examples here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/Examples/ and prototype implementation available from: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cofunctions.html In a nutshell, the 'cocall' syntax is gone. Inside a cofunction, if an object being called implements __cocall__, then a cocall (implicit yield-from) is performed automatically, otherwise a normal call is made. -- Greg From ncoghlan at gmail.com Thu Oct 27 04:53:47 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Oct 2011 12:53:47 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA8BD66.6010807@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> Message-ID: On Thu, Oct 27, 2011 at 12:09 PM, Greg Ewing wrote: > So, I've gone right back to my initial idea. The current version > of the draft PEP can be found here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev5.txt ---------- Certain objects that wrap other callable objects, notably bound methods, will be given __cocall__ implementations that delegate to the underlying object. ---------- That idea doesn't work when combined with the current idea in the PEP for implicit cocall support inside cofunction definitions. Simple method calls will see "Oh, this has __cocall__" and try to treat it like a cofunction, when the underlying object is actually an ordinary callable. To make it work, you'll need some kind of explicit conversion API (analogous to __iter__). Then, the expansion of calls in a cofunction would look something like: _cofunc = obj.__cofunction__() if _cofunc is None: result = obj.__call__(*args, **kwds) else: result = yield from _cofunc(*args, **kwds) However, you still potentially have some pretty serious ambiguity problems - what happens inside an *actual* yield from clause? Or a for loop iterator expression? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From greg.ewing at canterbury.ac.nz Thu Oct 27 05:30:41 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2011 16:30:41 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> Message-ID: <4EA8D061.5090508@canterbury.ac.nz> On 27/10/11 15:53, Nick Coghlan wrote: > That idea doesn't work when combined with the current idea in the PEP > for implicit cocall support inside cofunction definitions. 
Simple > method calls will see "Oh, this has __cocall__" and try to treat it > like a cofunction, when the underlying object is actually an ordinary > callable. That's why I've specified that the __cocall__ method can return NotImplemented to signal that the object doesn't support cocalls. When I say "implements __cocall__" I mean "has a __cocall__ method that doesn't return NotImplemented". The prototype implementation makes use of this -- the only difference between a function and a cofunction in that implementation is a flag on the code object. The function object's __cocall__ method returns NotImplemented if that flag is not set. > However, you still potentially have some pretty serious ambiguity > problems - what happens inside an *actual* yield from clause? Or a for > loop iterator expression? There is no ambiguity. If f is a cofunction, then yield from f() is equivalent to yield from (yield from f.__cocall__()) and for x in f(): is equivalent to for x in (yield from f.__cocall__()): Both of these make sense provided that the return value of f() is an iterable. -- Greg From ncoghlan at gmail.com Thu Oct 27 06:13:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Oct 2011 14:13:59 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA8D061.5090508@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> Message-ID: On Thu, Oct 27, 2011 at 1:30 PM, Greg Ewing wrote: > On 27/10/11 15:53, Nick Coghlan wrote: > >> That idea doesn't work when combined with the current idea in the PEP >> for implicit cocall support inside cofunction definitions. Simple >> method calls will see "Oh, this has __cocall__" and try to treat it >> like a cofunction, when the underlying object is actually an ordinary >> callable. > > That's why I've specified that the __cocall__ method can return > NotImplemented to signal that the object doesn't support cocalls. > When I say "implements __cocall__" I mean "has a __cocall__ > method that doesn't return NotImplemented". Ah, I missed that. Could you add a pseudocode expansion, similar to the one in PEP 343? Something like: _make_call = True _cocall = getattr(obj, "__cocall__", None) if _cocall is not None: _cocall_result = _cocall(*args, **kwds) _make_call = _cocall_result is NotImplemented if _make_call: _result = obj(*args, **kwds) else: _result = _cocall_result To expand on the "implicit concurrency" discussion, it's potentially worth explicitly stating a couple more points: - thread preemption can already occur between any two bytecode instructions - exceptions can already be thrown by any expression However, you currently handwave away the question of what constitutes "some suitable form of synchronisation" when it comes to implicit cofunction invocation. With threads, you can using the locking primitives to block other threads. With explicit cocalls, you can manually inspect a block of code to ensure it doesn't contain any yield points. With implicit cocalls, how does one implement an interpreter enforced guarantee that control will not be relinquished to the scheduler within a particular block of code? Perhaps the PEP needs a "with not codef:" construct to revert to normal calling semantics for a section of code within a coroutine? You could still explicitly yield from such a code block, but it would otherwise be the coroutine equivalent of a critical section. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Thu Oct 27 09:44:45 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2011 20:44:45 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> Message-ID: <4EA90BED.4000505@canterbury.ac.nz> Nick Coghlan wrote: > Could you add a pseudocode expansion, similar to > the one in PEP 343? > > Something like: > > _make_call = True > _cocall = getattr(obj, "__cocall__", None) > if _cocall is not None: > _cocall_result = _cocall(*args, **kwds) > _make_call = _cocall_result is NotImplemented > if _make_call: > _result = obj(*args, **kwds) > else: > _result = _cocall_result It's not quite as simple as that, because whether a cocall was done determines whether the return value is subjected do a yield-from or used directly as the result. This could be expressed by saying that result = obj(*args, **kwds) expands into _done = False _cocall = getattr(obj, "__cocall__", None) if _cocall is not None: _iter = _cocall(*args, **kwds) if _iter is not NotImplemented: result = yield from _iter _done = True if not _done: result = obj.__call__(*args, **kwds) > Perhaps the PEP needs a "with not codef:" construct to revert to > normal calling semantics for a section of code within a coroutine? You > could still explicitly yield from such a code block, but it would > otherwise be the coroutine equivalent of a critical section. This is perhaps worth thinking about, but I'm not sure it's worth introducing special syntax for it. If you're working from the mindset of making as few assumptions as possible about suspension points, then your critical sections should be small and confined to the implementations of a few operations on your shared data structures. Now if you have a class such as class Queue: def add_item(self, x): ... def remove_item(): ... then it's already obvious from the fact that these methods are defined with 'def' and not 'codef' that they can't cause a suspension, either directly or indirectly, and will therefore be executed atomically from the coroutine perspective. In other words, wherever you might feel inclined to enclose a set of statements in a 'with not codef' block, you can get the same effect by factoring them out into an ordinary function, the atomicity of which is easy to verify by inspecting the function header. So the motivation for introducing more special syntax here appears weak. -- Greg From greg.ewing at canterbury.ac.nz Thu Oct 27 10:39:41 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Oct 2011 21:39:41 +1300 Subject: [Python-ideas] Cofunctions - Rev 6 In-Reply-To: <4EA90BED.4000505@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> <4EA90BED.4000505@canterbury.ac.nz> Message-ID: <4EA918CD.1050505@canterbury.ac.nz> Another cofunctions revision: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev6.txt I've added a formal expansion for cocalls, and sharpened up the argument about critical sections to hopefully make it less handwavey and more persuasive. 
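For experimentation, the expansion can also be emulated without new syntax once PEP 380's yield-from is available; a sketch of a helper generator, invoked as result = yield from cocall(f, x):

    def cocall(obj, *args, **kwds):
        # Mirrors the formal expansion: delegate to __cocall__ when the
        # object opts in, otherwise fall back to an ordinary call.
        impl = getattr(obj, '__cocall__', None)
        if impl is not None:
            it = impl(*args, **kwds)
            if it is not NotImplemented:
                return (yield from it)
        return obj(*args, **kwds)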
-- Greg From ncoghlan at gmail.com Thu Oct 27 13:31:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Oct 2011 21:31:13 +1000 Subject: [Python-ideas] Cofunctions - Rev 6 In-Reply-To: <4EA918CD.1050505@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> <4EA90BED.4000505@canterbury.ac.nz> <4EA918CD.1050505@canterbury.ac.nz> Message-ID: On Thu, Oct 27, 2011 at 6:39 PM, Greg Ewing wrote: > Another cofunctions revision: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev6.txt > > I've added a formal expansion for cocalls I think using _iter as the name for the result of the __cocall__ invocation helps make it clear what is going on - it's probably worth doing the same for the costart() definition as well. > and sharpened > up the argument about critical sections to hopefully make > it less handwavey and more persuasive. Definitely less handwavey, but I'm not entirely sold on the "more persuasive" just yet. Consider that Java has a synchronized statement, and then synchronized methods are just a shorthand for wrapping the entire body in a synchronized statement. Consider that explicit locking (either manually or via a with statement) is more common with Python code than hiding the locks inside a decorator. The PEP currently proposes that, for coroutines, it's OK to *only* have the function level locking and no tool for statement level locking. To see why this is potentially an issue, consider the case where you have a function that you've decided should really be a cofunction and you want to update your entire code base accordingly. The interpreter can help in this task, since you'll get errors wherever it is called from an ordinary function, but silent changes of behaviour in cofunctions. The latter is mostly a good thing in terms of making coroutine programming easier (and is pretty much the entire point of the PEP), but it is *not* acceptable if manual auditing is the proposed solution to the 'critical section for implicit cooperative concurrency' problem. After all, one of the big things that makes cooperative threading easier than preemptive threading is that you have well defined points where you may be interrupted. The cofunction PEP in its current form gives up (some of) that advantage by making every explicit function call a potential yield point, but without providing any new tools to help manage that loss of control. That means "self.x += y" is safe, but "self.x = y(self.x)" is not (since y may yield control and self.x may be overwritten based on a call that started with a now out of date value for self.x), and there's no way to make explicit (either to the interpreter or to the reader) that 'y' is guaranteed to be an ordinary function. I honestly believe the PEP would be improved if it offered a way to guarantee ordinary call semantics for a suite inside a cofunction so that if someone unexpectedly turns one of the functions called by that code into a cofunction, it will throw an exception rather than silently violating the assumption that that section of code won't yield control to the coroutine scheduler. Obviously, libraries should never make such a change without substantial warning (since it breaks backwards compatibility), but it would still allow a programmer to make an interpreter-enforced declaration that a particular section of code will never implicitly yield control. 
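The hazard itself is easy to reproduce with nothing but today's generators; in this self-contained sketch the bare yield stands in for a call that silently became a cofunction:

    class Counter:
        def __init__(self):
            self.x = 0

    def slow_increment(counter):
        old = counter.x        # read
        yield                  # hidden suspension point inside the "call"
        counter.x = old + 1    # write: clobbers any concurrent update

    def run_round_robin(tasks):
        # Minimal cooperative scheduler: requeue each task until done.
        tasks = list(tasks)
        while tasks:
            task = tasks.pop(0)
            try:
                next(task)
            except StopIteration:
                continue
            tasks.append(task)

    c = Counter()
    run_round_robin([slow_increment(c), slow_increment(c)])
    print(c.x)   # 1, not 2: the classic lost update, with no threads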
I think having such a construct will help qualm many of the fears people had about the original version of the implicit invocation proposal - just as try/except blocks help manage exceptions and explicit locks help manage thread preemption, being able to force ordinary call semantics for a suite would allow people to effectively manage implicit coroutine suspension in cases where they felt it mattered. "with not codef:" was the first spelling I thought of to say "switch off coroutine call semantics for this suite". Another alternative would be "with def:" to say "switch ordinary call semantics back on for this suite", while a third would be to add explicit expressions for *both* "cocall f(*args, **kwds)" *and* "defcall f(*args, **kwds)", such that the choice of "def vs codef" merely changed the default behaviour of call expressions and you could explicitly invoke the other semantics whenever you wanted (although an explicit cocall would automatically turn something into a coroutine, just as yield turns one into a generator). I'm sure there are other colours that bike shed could be painted - my main point is that I think this particular bike shed needs to exist. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mark at hotpy.org Thu Oct 27 13:39:48 2011 From: mark at hotpy.org (Mark Shannon) Date: Thu, 27 Oct 2011 12:39:48 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94055.3080207@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> Message-ID: <4EA94304.1010909@hotpy.org> Greg Ewing wrote: > Mark Shannon wrote: > >> Why not have proper co-routines, instead of hacked-up generators? > > What do you mean by a "proper coroutine"? > A parallel, non-concurrent, thread of execution. It should be able to transfer control from arbitrary places in execution, not within generators. Stackless provides coroutines. Greenlets are also coroutines (I think). Lua has them, and is implemented in ANSI C, so it can be done portably. See: http://www.jucs.org/jucs_10_7/coroutines_in_lua/de_moura_a_l.pdf (One of the examples in the paper uses coroutines to implement generators, which is obviously not required in Python :) ) Cheers, Mark. From arnodel at gmail.com Thu Oct 27 13:45:30 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Thu, 27 Oct 2011 12:45:30 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA8BD66.6010807@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> Message-ID: On 27 October 2011 03:09, Greg Ewing wrote: > Still feeling in a peppish mood after the last round of PEP 335, > I decided to revisit my cofunctions proposal. Last time round, I > allowed myself to be excessively swayed by popular opinion, and > ended up with something that lacked elegance and failed to address > the original issues. > > So, I've gone right back to my initial idea. The current version > of the draft PEP can be found here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev5.txt > > together with updated examples here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/Examples/ > > and prototype implementation available from: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cofunctions.html > > In a nutshell, the 'cocall' syntax is gone. 
Inside a cofunction, if an object being called implements __cocall__, then a cocall (implicit yield-from) is performed automatically, otherwise a normal call is made. Hi, I've taken the liberty of translating your examples using a small utility that I wrote some time ago and modified to mimic your proposal (imperfectly, of course, since this runs on unpatched Python). FWIW, you can see and download the resulting files here (.html files for viewing, .py files for downloading): http://www.marooned.org.uk/~arno/cofunctions/ These don't need ``yield from`` and will work both in Python 2 and 3. The functionality is implemented with three functions defined in cofunctions.py: cofunction, costart, coreturn. The only changes I made to your examples are the following: 1. Add the line ``from cofunctions import *`` at the start of the file 2. Change codef f(x, y): ... to @cofunction def f(cocall, x, y): ... 3. Change calls to cofunctions ('cocalls') from within cofunctions as follows: f(x, y) becomes: yield cocall(f, x, y) 4. Change return statements within cofunctions as follows: return 42 becomes coreturn(42) 5. The 'value' attribute of StopIteration exceptions raised by returning cofunctions becomes a 'covalue' attribute Note that #1, #2 and #4 are trivial changes. Performing #3 is actually simple because trying to call a cofunction raises an exception pointing out the incorrect call, so running the examples shows which calls to change to cocalls. I could have avoided #5 but somehow I decided to stick with 'covalue' :) Here is, for example, the 'parser.py' example translated: ---------------------------- parser.py -------------------------- from cofunctions import * import re, sys pat = re.compile(r"(\S+)|(<[^>]*>)") text = "<foo> This is a <b> foo file </b> you know. </foo>" def run(): parser = costart(parse_items) next(parser) try: for m in pat.finditer(text): token = m.group(0) print("Feeding:", repr(token)) parser.send(token) parser.send(None) # to signal EOF except StopIteration as e: tree = e.covalue print(tree) @cofunction def parse_elem(cocall, opening_tag): name = opening_tag[1:-1] closing_tag = "</%s>" % name items = yield cocall(parse_items, closing_tag) coreturn((name, items)) @cofunction def parse_items(cocall, closing_tag = None): elems = [] while 1: token = yield if not token: break # EOF if is_opening_tag(token): elems.append((yield cocall(parse_elem, token))) elif token == closing_tag: break else: elems.append(token) coreturn(elems) def is_opening_tag(token): return token.startswith("<") and not token.startswith("</") From steve at pearwood.info Thu Oct 27 14:19:31 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 27 Oct 2011 23:19:31 +1100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA8BD66.6010807@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> Message-ID: <4EA94C53.2060209@pearwood.info> Greg Ewing wrote: > Still feeling in a peppish mood after the last round of PEP 335, > I decided to revisit my cofunctions proposal. Last time round, I > allowed myself to be excessively swayed by popular opinion, and > ended up with something that lacked elegance and failed to address > the original issues. > > So, I've gone right back to my initial idea. The current version > of the draft PEP can be found here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev5.txt As per your later email, there is a newer revision here: http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/cofunction-pep-rev6.txt > together with updated examples here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/generators/cd_current/Examples/ As a PEP goes, this makes way too many assumptions about the reader's knowledge of a small corner of Python.
Coroutines themselves are relatively new, and hardly in widespread use; yield from hasn't even made it into a production version of Python yet. But adding new syntax and built-ins will affect *all* Python programmers, not just the few who use coroutines. The rationale section is utterly unpersuasive to me. You talk about cofunctions being a "streamlined way of writing generator-based coroutines", but you give no reason other than your say-so that it is more streamlined. How about comparing the same coroutine written with, and without, a cofunction? You also claim that this will allow "the early detection of certain kinds of error that are easily made when writing such code", but again no evidence is given. You should demonstrate such an error, and the "hard-to-diagnose symptoms" it causes, and some indication whether these errors happen in practice or only in theory. The Cofunction Definitions section does not help me understand the concept at all: (1) You state that it is a "special kind of generator", but don't give any clue as to how it is special. Does it do something other generators can't do? A cofunction is spelled 'codef' instead of 'def', but what difference does that make to the behaviour of the generator you get? How does it behave differently to other generators? (2) Cofunctions, apparently, "may" contain yield or yield from. Presumably that means that yield is optional, otherwise it would be "must" rather than "may". So what sort of generator do you get without a yield? The PEP doesn't give me any clue. If I look at the external examples (the PEP doesn't link to them), I see plenty of examples of syntax, but nothing about semantics. E.g.: codef parse_elem(opening_tag): name = opening_tag[1:-1] closing_tag = "</%s>" % name items = parse_items(closing_tag) return (name, items) What does this actually do? (3) Something utterly perplexing to me: "From the outside, the distinguishing feature of a cofunction is that it cannot be called directly from within the body of an ordinary function. An exception is raised if such a call to a cofunction is attempted." Many things can't be called as functions: ints, strings, lists, etc. It simply isn't clear to me how cofunctions are different from any other non-callable object. The PEP should show some examples of what does and doesn't work. In particular, the above as stated implies the following: def spam(): x = cofunction() # fails, since directly inside a function x = cofunction() # succeeds, since not inside a function Surely that isn't what you mean to imply, is it? Is there any prior art in other languages? I have googled on "cofunction", and I get many, many hits to the concept from mathematics (e.g. sine and cosine) but virtually nothing in programming circles. Apart from this PEP, I don't see any sign of prior art, or that cofunction is anything but your own term. If there is prior art, you should describe it, and if not, you should say so. Based on the PEP, it looks to me that the whole premise behind "cofunction" is that it allows the user to write a generator without using the word yield. That is *literally* the only thing it does: replace a generator like this: def coroutine(): yield from f() with codef coroutine(): f() with some syntactic sugar to allow Python to automagically tell the difference between ordinary function calls and cofunctions inside a cofunction: "Do What I Mean" functionality. If cofunctions actually are more than merely a way to avoid writing yield, you should explain how and why they are more.
You should also explain why the form yield from f() # function call is so special that it deserves a new keyword and a new built-in, while this otherwise semantically identical call: x = f() yield from x # here's one we prepared earlier does not. In the Motivation and Rationale section, you state: If one forgets to use ``yield from`` when it should have been used, or uses it when it shouldn't have, the symptoms that result can be extremely obscure and confusing. I'm not sympathetic to the concept that "remembering which syntax to use is hard, so let's invent even more syntax with non-obvious semantics". I don't believe that remembering to write ``codef`` instead of ``def`` is any easier than remembering to write ``yield from`` instead of ``yield`` or ``return``. -- Steven From ncoghlan at gmail.com Thu Oct 27 14:25:04 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 27 Oct 2011 22:25:04 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94304.1010909@hotpy.org> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> Message-ID: On Thu, Oct 27, 2011 at 9:39 PM, Mark Shannon wrote: > Greg Ewing wrote: >> >> Mark Shannon wrote: >> >>> Why not have proper co-routines, instead of hacked-up generators? >> >> What do you mean by a "proper coroutine"? >> > > A parallel, non-concurrent, thread of execution. > It should be able to transfer control from arbitrary places in execution, > not within generators. > > Stackless provides coroutines. Greenlets are also coroutines (I think). > > Lua has them, and is implemented in ANSI C, so it can be done portably. > See: http://www.jucs.org/jucs_10_7/coroutines_in_lua/de_moura_a_l.pdf > > (One of the examples in the paper uses coroutines to implement generators, > which is obviously not required in Python :) ) (It's kinda late here and I'm writing this without looking up details in the source, so don't be shocked if some of the technical points don't quite align with reality. The gist should still hold) It's definitely an interesting point to think about. The thing that makes generators fall short of being true coroutines is that they don't really have the separate stacks that coroutines need. Instead, they borrow the stack of whatever thread invoked next(), send() or throw(). This means that, whenever a generator yields, that stack needs to be unwound, suspending each affected generator in turn, strung together by references between the generator objects rather than remaining a true frame stack. This means that *every* frame in the stack must be a generator frame for the suspension to reach the outermost generator frame - ordinary function frames can't be suspended like that. So, let's suppose that instead of trying to change the way calls work (to create generator frames all the way down), the coroutine PEP instead proposed a new form of *yield* expression: coyield The purpose of 'codef' would then be to declare that a function maintains its *own* stack frame, independent of that of the calling thread. Unlike 'yield' (which only suspends the current frame), 'coyield' would instead suspend the entire coroutine, returning control to the frame that invoked next(), send() or throw() on the coroutine. Notably, *coyield* would *not* have any special effect on the specific function that used it - instead, it would just be a runtime error if coyield was encountered and there was no coroutine frame on the stack.
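For comparison, the third-party greenlet library already provides what this hypothetical coyield describes: an ordinary function, with no generator machinery in sight, can suspend the whole coroutine it is running in. A sketch, assuming greenlet is installed:

    from greenlet import greenlet

    def blocking_helper():
        # A plain function call, yet it suspends the entire coroutine
        # stack by switching back to the main greenlet.
        return main.switch('suspended inside an ordinary function')

    def coroutine():
        reply = blocking_helper()
        return 'resumed with %r' % (reply,)

    main = greenlet.getcurrent()
    co = greenlet(coroutine)
    print(co.switch())          # -> suspended inside an ordinary function
    print(co.switch('hello'))   # -> resumed with 'hello'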
That actually sounds reasonably feasible to me (and I like it better than the "generator frames all the way down" approach). There are likely to be some nasty implementation problems teasing out the thread local state from the interpreter core though (and it poses interesting problems for other things like the decimal module that also use thread local state). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From p.f.moore at gmail.com Thu Oct 27 15:18:08 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 27 Oct 2011 14:18:08 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94C53.2060209@pearwood.info> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> Message-ID: On 27 October 2011 13:19, Steven D'Aprano wrote: > As a PEP goes, this makes way too many assumptions about the reader's > knowledge of a small corner of Python. Coroutines themselves are relatively > new, and hardly in wide-spread use; yield from hasn't even made it into a > production version of Python yet. But adding new syntax and built-ins will > effect *all* Python programmers, not just the few who use coroutines. I have only really skimmed the PEP, although I've been following this thread so far. And I agree heartily with this comment (and indeed with most of Steven's points). The whole proposal seems to me to be adding a lot of language machinery (at least one new keyword, a new builtin and C API, plus some fairly complex semantic changes) to address a problem that I can't even really understand. To be honest, I think that if a solution this heavyweight is justified, it really should be possible to demonstrate a few compelling examples of the problem, which can be understood without a deep understanding of coroutines. The examples may be relatively shallow (my understanding of why yield from is good was nothing more than 'having to write "for x in gen(): yield x" all the time is a pain') but they should be comprehensible to the average user. It also seems premature to build this PEP on the as yet unreleased yield from statement, before we have any real world experience of how well (or badly) the issues the PEP alludes to can be handled in current Python. I'd love to ask for examples of working code (and preferably real world applications), plus a demonstration of how the PEP simplifies it - but that's not possible in any current version of Python... Paul. PS On the other hand, this is python-ideas, so I guess it's the right place for blue-sky theorising. If that's all this thread is, maybe I should simply ignore it for 18 months or so... :-) From ron3200 at gmail.com Thu Oct 27 17:19:54 2011 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 27 Oct 2011 10:19:54 -0500 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> Message-ID: <1319728794.22527.53.camel@Gutsy> On Thu, 2011-10-27 at 22:25 +1000, Nick Coghlan wrote: > On Thu, Oct 27, 2011 at 9:39 PM, Mark Shannon wrote: > > Greg Ewing wrote: > >> > >> Mark Shannon wrote: > >> > >>> Why not have proper co-routines, instead of hacked-up generators? > >> > >> What do you mean by a "proper coroutine"? > >> > > > > A parallel, non-concurrent, thread of execution. > > It should be able to transfer control from arbitrary places in execution, > > not within generators. > > > > Stackless provides coroutines. 
Greenlets are also coroutines (I think). > > > > Lua has them, and is implemented in ANSI C, so it can be done portably. > > See: http://www.jucs.org/jucs_10_7/coroutines_in_lua/de_moura_a_l.pdf > > > > (One of the examples in the paper uses coroutines to implement generators, > > which is obviously not required in Python :) ) > > (It's kinda late here and I'm writing this without looking up details > in the source, so don't be shocked if some of the technical points > don't quite align with reality. The gist should still hold) > > It's definitely an interesting point to think about. The thing that > makes generators fall short of being true coroutines is that they > don't really have the separate stacks that coroutines need. Instead, > they borrow the stack of whatever thread invoked next(), send() or > throw(). This means that, whenever a generator yields, that stack > needs to be unwound, suspending each affected generator in turn, > strung together by references between the generator objects rather > than remaining a true frame stack. > > This means that *every* frame in the stack must be a generator frame > for the suspension to reach the outermost generator frame - ordinary > function frames can't be suspended like that. > > So, let's suppose that instead of trying to change the way calls work > (to create generator frames all the way down), the coroutine PEP > instead proposed a new form of *yield* expression: coyield > > The purpose of 'codef' would then be to declare that a function > maintains it's *own* stack frame, independent of that of the calling > thread. Unlike 'yield' (which only suspends the current frame), > 'coyield' would instead suspend the entire coroutine, returning > control to the frame that invoked next(), send() or throw() on the > coroutine. Notably, *coyield* would *not* have any special effect on > the specific function that used it - instead, it would just be a > runtime error if coyield was encountered and there was no coroutine > frame on the stack. > > That actually sounds reasonably feasible to me (and I like it better > than the "generator frames all the way down" approach). There are > likely to be some nasty implementation problems teasing out the thread > local state from the interpreter core though (and it poses interesting > problems for other things like the decimal module that also use thread > local state). Currently the only way to suspend a generator is at a 'yield' statement. (as far as I know) Would it be possible to allow a "raise SuspendIteration" exception to suspend a generator that could then be continued by calling a __resume__ method on the caught SuspendIteration exception? (It might use the traceback info to restart the generator in that case.) Maybe the scheduler could do something like... for gen in generators: try: # Continue a suspended generator gen._stored_suspenditeration.__resume__() except SuspendIteration as exc: # Continued generator stopped, so save the exception. gen._stored_suspenditeration = exc except StopIteration: generators.remove(gen) This might use the existing stack frame to do the work. I'm thinking if the scheduler could stay out of the way of the yield data path, it may make things easier. 
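Stripped of the hypothetical SuspendIteration machinery, the driving loop above can be tried out today with plain generators; note that it resumes a snapshot (list(generators)) so that removing a finished generator doesn't skip its neighbour mid-iteration:

    def drive(generators):
        # One scheduling pass: resume each generator to its next yield,
        # discarding the ones that have finished.
        for gen in list(generators):
            try:
                next(gen)
            except StopIteration:
                generators.remove(gen)

    def ticker(name, count):
        for i in range(count):
            print(name, i)
            yield

    gens = [ticker('a', 2), ticker('b', 3)]
    while gens:
        drive(gens)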
Cheers, Ron From sven at marnach.net Thu Oct 27 20:32:08 2011 From: sven at marnach.net (Sven Marnach) Date: Thu, 27 Oct 2011 19:32:08 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94C53.2060209@pearwood.info> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> Message-ID: <20111027183208.GH20970@pantoffel-wg.de> Steven D'Aprano wrote on Thu, 27 Oct 2011 at 23:19:31 +1100: > Is there any prior art in other languages? I have googled on > "cofunction", and I get many, many hits to the concept from > mathematics (e.g. sine and cosine) but virtually nothing in > programming circles. Apart from this PEP, I don't see any sign of > prior art, or that cofunction is anything but your own term. If > there is prior art, you should describe it, and if not, you should > say so. The usual term in programming is "coroutine" rather than "cofunction": http://en.wikipedia.org/wiki/Coroutine Cheers, Sven From ethan at stoneleaf.us Thu Oct 27 21:03:31 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 27 Oct 2011 12:03:31 -0700 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <20111027183208.GH20970@pantoffel-wg.de> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> Message-ID: <4EA9AB03.8070302@stoneleaf.us> Sven Marnach wrote: > The usual term in programming is "coroutine" rather than "cofunction": > > http://en.wikipedia.org/wiki/Coroutine That article states that Python has coroutines as of 2.5 -- that's incorrect, isn't it? ~Ethan~ From arnodel at gmail.com Thu Oct 27 21:15:53 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Thu, 27 Oct 2011 20:15:53 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA9AB03.8070302@stoneleaf.us> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> Message-ID: On 27 October 2011 20:03, Ethan Furman wrote: > Sven Marnach wrote: >> >> The usual term in programming is "coroutine" rather than "cofunction": >> >> http://en.wikipedia.org/wiki/Coroutine > > That article states that Python has coroutines as of 2.5 -- that's > incorrect, isn't it? Generator functions + trampoline = coroutines -- Arnaud From p.f.moore at gmail.com Thu Oct 27 21:33:06 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 27 Oct 2011 20:33:06 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> Message-ID: On 27 October 2011 20:15, Arnaud Delobelle wrote: >> That article states that Python has coroutines as of 2.5 -- that's >> incorrect, isn't it? > > Generator functions + trampoline = coroutines If I understand, the cofunctions of this thread aren't coroutines as such; they are something intended to make writing coroutines easier in some way. My problem is that it's not at all obvious how they help. That's partly because the generator+trampoline idiom, although possible, is not at all common, so there's little in the way of examples, and even less in the way of common understanding, of how the idiom works and what problems there are putting it into practice. Paul.
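Since worked examples of the idiom are scarce, here is a minimal runnable trampoline of the kind being discussed. It treats a yielded generator as a subroutine call and uses PEP 380's StopIteration.value for returns, so it assumes a 3.3-era interpreter:

    import types

    def trampoline(task):
        # Drive a stack of generators so that sub-generators can be
        # "called" without growing the interpreter's own call stack.
        stack, value = [], None
        while True:
            try:
                result = task.send(value)
            except StopIteration as e:
                if not stack:
                    return e.value
                task, value = stack.pop(), e.value   # return to caller
                continue
            if isinstance(result, types.GeneratorType):
                stack.append(task)                   # call into callee
                task, value = result, None
            else:
                value = None                         # plain yield

    def add_one(x):
        yield            # a stand-in for suspending on real I/O
        return x + 1

    def main():
        result = yield add_one(41)
        return result

    print(trampoline(main()))   # -> 42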
From solipsis at pitrou.net Fri Oct 28 01:26:53 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 28 Oct 2011 01:26:53 +0200 Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib References: <4EA5AC93.2020305@oddbird.net> <4EA5E74B.2060906@stoneleaf.us> <20111025101157.7a02df7f@pitrou.net> Message-ID: <20111028012653.0e0e6a15@pitrou.net> On Tue, 25 Oct 2011 21:26:07 -0400 Terry Reedy wrote: > On 10/25/2011 4:28 AM, Nick Coghlan wrote: > > > Yeah, I realised I don't actually mind if things get copied around on > > Windows - it's the POSIX systems where implicit copying would bother > > me, and that goes to the heart of a longstanding difference in > > packaging philosophy between the two platforms :) > > I have a different issue with copying. I have a new Win7 system with a > solid state disk for programs (that usually sit unchanged months or > years at a time). It is nice and fast, but has a finite write-cycle > lifetime. So unnecessary copies are not nice. Uh, please do not throw around arguments like that without backing them up with concrete numbers. In practice, your SSD will have amply enough write-cycles for venv to work fine. And if it doesn't, you've simply been swindled. Regards Antoine. From ncoghlan at gmail.com Fri Oct 28 01:56:39 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 28 Oct 2011 09:56:39 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> Message-ID: On Fri, Oct 28, 2011 at 5:33 AM, Paul Moore wrote: > On 27 October 2011 20:15, Arnaud Delobelle wrote: > >>> That article states that Python has coroutines as of 2.5 -- that's >>> incorrect, isn't it? >> >> Generator functions + trampoline = coroutines > > If I understand, the cofunctions of this thread aren't coroutines as > such, they are something intended to make writing coroutines easier in > some way. My problem is that it's not at all obvious how they help. > That's partly because the generator+trampoline idiom, although > possible, is not at all common so that there's little in the way of > examples, and even less in the way of common understanding, of how the > idiom works and what problems there are putting it into practice. I highly recommend reading the article Mark Shannon linked earlier in the thread. I confess I didn't finish the whole thing, but even the first half of it made me realise *why* coroutine programming in Python (sans Stackless or greenlet) is such a pain: *every* frame on the coroutine stack has to be a generator frame in order to support suspending the generator. When a generator calls into an ordinary function, suspension is not possible until control has returned to the main generator frame. What this means is that, for example, if you want to use generators to implement asynchronous I/O, every layer between your top level generator and the asynchronous I/O request *also* has to be a generator, or the suspension doesn't work properly (control will be returned to the innermost function frame, when you really want it to get all the way back to the scheduler). PEP 380 (i.e. "yield from") makes it easier to *create* those stacks of generator frames, but it doesn't make the need for them to go away. 
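The constraint is easy to see in miniature (using the forthcoming yield from): suspension reaches the caller only while every intervening frame is a generator frame, so the ordinary function opaque() below has no choice but to consume inner() itself:

    def inner():
        yield 'suspended at the innermost frame'

    def outer():
        # Generator frames all the way down: inner's suspension is
        # visible to whoever is driving outer.
        yield from inner()

    def opaque():
        # An ordinary frame in the middle cannot suspend, so it must
        # run inner() to completion right here.
        return list(inner())

    def broken_outer():
        yield opaque()   # inner's yields never escape opaque()

    print(next(outer()))          # -> suspended at the innermost frame
    print(next(broken_outer()))   # -> ['suspended at the innermost frame']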
Proceeding further down that path (as PEP 3152 does) would essentially partition Python programming into two distinct subsets: 'ordinary' programming (where you can freely mix generators and ordinary function frames) and 'generator coroutine' programming (where it *has* to be generators all the way down to get suspension to work).

In some ways, this is the situation we have now, where people using Twisted and other explicitly asynchronous libraries have to be continuously aware of the risk of inadvertently calling functions that might block. There's no way to write I/O functions that internally say "if I'm in a coroutine, suspend with an asynchronous I/O request, otherwise perform the I/O request synchronously".

Frankly, now that I understand the problem more clearly, attempting to attack it by making it easier to create stacks consisting entirely of generator frames strikes me as a terrible idea. Far better to find a way to have a dynamic "non-local" yield construct that yields from the *outermost* generator frame in the stack, regardless of whether the current frame is a generator or not.

Ideally, we would like to make it possible to write code like this:

    def coroutine_friendly_io(*args, **kwds):
        if in_coroutine():
            return coyield AsychronousRequest(async_op, *args, **kwds)
        else:
            return sync_op(*args, **kwds)

If you look at the greenlet docs (http://packages.python.org/greenlet/) after reading the article on Lua's coroutines, you'll realise that greenlet implements *symmetric* coroutines - you have to explicitly state which greenlet you are switching to. You can then implement asymmetric coroutines on top of that by always switching to a specific scheduler thread.

A more likely model for Python code would be Lua's *asymmetric* coroutines. With these, you couldn't switch to another arbitrary coroutine. Instead, the only thing you could do is suspend yourself and return control to the frame that originally invoked the coroutine.

Some hypothetical stacks may help make this clearer. Suspending nested generators with 'yield from':

    outer_func()
    --> result = outer_gen.next()
        --> yield from inner_gen()
            --> yield 42

After the yield is executed, our stack looks like this:

    outer_func()

The generator stack is gone - it was borrowing the thread's main stack, so suspending unwound it. Instead, each generator *object* is holding a reference to a suspended frame, so they're all kept alive by the following reference chain:

    outer_func's frame (running)
      -> outer_gen -> outer_gen's frame (suspended)
      -> inner_gen -> inner_gen's frame (suspended)

There's no object involved that has the ability to keep a *stack* of frames alive, so as soon as you get an ordinary function on the stack inside the generator, you can't suspend any more.

Now, suppose we had greenlet style symmetric coroutines.
You might do something like this:

    scheduler_greenlet()  # (Could use the implicit main greenlet,
                          # but I think this makes the example clearer)
    --> result = outer_greenlet.switch()  # The scheduler_greenlet.switch()
                                          # call below returns here

    outer_greenlet()  # Separate stack, not using the thread's main frame stack
    --> inner_func()  # Can have ordinary function frames on the stack
        --> scheduler_greenlet.switch(42)  # Suspends this greenlet

Finally, Lua style asymmetric coroutines might look like this:

    outer_func()
    --> result = outer_codef.next()  # The "coyield" below comes back here

    outer_codef()  # Separate stack, not using the thread's main frame stack
    --> inner_func()  # Just an ordinary function call
        --> coyield 42  # Actually suspends the whole coroutine

To achieve 'natural' coroutine programming, a Lua style asymmetric coroutine approach looks the most promising to me.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri Oct 28 02:00:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Oct 2011 10:00:58 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EA94C53.2060209@pearwood.info>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
Message-ID: 

On Thu, Oct 27, 2011 at 10:19 PM, Steven D'Aprano wrote:
> The Cofunction Definitions section does not help me understand the concept
> at all:
>
> (1) You state that it is a "special kind of generator", but don't give any
> clue as to how it is special. Does it do something other generators can't
> do? A cofunction is spelled 'codef' instead of 'def', but what difference
> does that make to the behaviour of the generator you get? How does it behave
> differently to other generators?
>
> (2) Cofunctions, apparently, "may" contain yield or yield from. Presumably
> that means that yield is optional, otherwise it would be "must" rather than
> "may". So what sort of generator do you get without a yield? The PEP doesn't
> give me any clue.

Yeah, there need to be more references to big motivating use cases/examples of what people are already doing in this space (i.e. Twisted, greenlet, gevent) and how the PEP helps us move towards making asynchronous I/O with an event loop as easy to use as blocking I/O (with or without threads).

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info Fri Oct 28 03:01:07 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 28 Oct 2011 12:01:07 +1100
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
Message-ID: <4EA9FED3.6050505@pearwood.info>

Nick Coghlan wrote:
> On Fri, Oct 28, 2011 at 5:33 AM, Paul Moore wrote:
>> On 27 October 2011 20:15, Arnaud Delobelle wrote:
>>
>>>> That article states that Python has coroutines as of 2.5 -- that's
>>>> incorrect, isn't it?
>>> Generator functions + trampoline = coroutines
>> If I understand, the cofunctions of this thread aren't coroutines as
>> such, they are something intended to make writing coroutines easier in
>> some way. My problem is that it's not at all obvious how they help.
>> That's partly because the generator+trampoline idiom, although
>> possible, is not at all common, so there's little in the way of
>> examples, and even less in the way of common understanding, of how the
>> idiom works and what problems there are putting it into practice.
>
> I highly recommend reading the article Mark Shannon linked earlier in

If you're talking about this:

    http://www.jucs.org/jucs_10_7/coroutines_in_lua/de_moura_a_l.pdf

I have read it, and while all very interesting, I don't see how it answers the big questions about motivating use-cases for cofunctions as described in this PEP.

One specific thing I took out of this is that only the main body of a Python generator can yield. That is, if I write this:

    def gen():
        def inner():
            yield 1
            yield 2
        yield 0
        inner()  # yield 1 and 2
        yield 3

it does not behave as I would like. Instead, I have to write this:

    def gen():
        def inner():
            yield 1
            yield 2
        yield 0
        # In Python 3.3, the next 2 lines become "yield from inner()"
        for x in inner():
            yield x
        yield 3

I can see how that would be a difficulty, particularly when you move away from simple generators yielding values to coroutines that accept values, but isn't that solved by the "yield from" syntax?

> the thread. I confess I didn't finish the whole thing, but even the
> first half of it made me realise *why* coroutine programming in Python
> (sans Stackless or greenlet) is such a pain: *every* frame on the
> coroutine stack has to be a generator frame in order to support
> suspending the generator.

I understand that, or at least I *think* I understand that, but I don't understand what that implies in practice when writing Python code.

If all you are saying is that you can't suspend an arbitrary function at an arbitrary point, well, true, but that's a good thing, surely? The idea of a function is that it has one entry point, it does its thing, and then it returns. If you want different behaviour, don't use a function.

Or do you mean something else? Actual working Python code (or not working, as the case may be) would probably help.

> Ideally, we would like to make it possible to write code like this:
>
>     def coroutine_friendly_io(*args, **kwds):
>         if in_coroutine():
>             return coyield AsychronousRequest(async_op, *args, **kwds)
>         else:
>             return sync_op(*args, **kwds)

Why would you want one function to do double duty as both blocking and non-blocking? Particularly when the two branches don't appear to share any code (at least not in the example as given). To me, this seems like "Do What I Mean" magic which would be better written as a pair of functions:

    def coroutine_friendly_io(*args, **kwds):
        yield from AsychronousRequest(async_op, *args, **kwds)

    def blocking_io(*args, **kwargs):
        return sync_op(*args, **kwds)

-- Steven

From ron3200 at gmail.com Fri Oct 28 04:23:51 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Thu, 27 Oct 2011 21:23:51 -0500
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
Message-ID: <1319768631.3605.50.camel@Gutsy>

On Fri, 2011-10-28 at 09:56 +1000, Nick Coghlan wrote:
> Frankly, now that I understand the problem more clearly,
> attempting to attack it by making it easier to create stacks
> consisting entirely of generator frames strikes me as a
> terrible idea.

It seems to me, we can do better. The 'yield' and 'yield from' statements are great for a single generator. And that's fine.
Once we start running multiple co-routines and start to switch between them, things start to get very complex. A trampoline works by flattening out the stack so any sub-generator is always just below the trampoline runner. The trade off is that the trampoline runner then has to sort out how to handle what is yielded to it. That becomes additional overhead that then requires the sub-generators to also pass out signals to the trampoline runner or scheduler. It all goes through the same yield data paths, so it becomes additional overhead to sort that out. In the tests I've done, once you get to that point, they start to run at about the same speed as using classes with a .run() method. No generators required to do that, as the method call and return points correspond to one loop in a generator.

To avoid the additional overhead, a second suspend point that can yield out to a scheduler would be nice. It would avoid a lot of 'if-else's as the data yield path isn't mixed with trampoline and scheduler messages. I think there have been requests for "channels", but I'm not sure if that would solve these issues directly.

> Far better to find a
> way to have a dynamic "non-local" yield construct that yields from the
> *outermost* generator frame in the stack, regardless of whether the
> current frame is a generator or not.

Named yields won't work of course, and you are probably not referring to the non_local keyword, but exceptions can have names, so non_local could be used to get an exception defined in an outer scope. It's not unreasonable for exceptions to affect the control flow, as StopIteration already does that. It's also not unreasonable for an Exception to only work in a few contexts.

> Ideally, we would like to make it possible to write code like this:
>
>     def coroutine_friendly_io(*args, **kwds):
>         if in_coroutine():
>             return coyield AsychronousRequest(async_op, *args, **kwds)
>         else:
>             return sync_op(*args, **kwds)

It's hard to tell just how this would work without seeing its context. As I said above, once you start getting into more complex designs, it starts to require additional signals and checks of some sort, as it appears you have done in this example.

Another issue with this is, the routines and the framework become tied together. The routines need to know the proper additional protocol to work with that framework, and they also can't be used with any other framework. That probably isn't avoidable entirely. But if we take that into account at the start, we can probably make things much easier later.

Cheers, Ron

From ncoghlan at gmail.com Fri Oct 28 05:48:42 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Oct 2011 13:48:42 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EA9FED3.6050505@pearwood.info>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info>
Message-ID: 

On Fri, Oct 28, 2011 at 11:01 AM, Steven D'Aprano wrote:
> If all you are saying is that you can't suspend an arbitrary function at an
> arbitrary point, well, true, but that's a good thing, surely? The idea of a
> function is that it has one entry point, it does its thing, and then it
> returns. If you want different behaviour, don't use a function.
>
> Or do you mean something else? Actual working Python code (or not working,
> as the case may be) would probably help.
The number 1 use case for coroutines is doing asynchronous I/O (and other event loop based programming) elegantly. That's why libraries/frameworks like Twisted and gevent exist. The *only* reason to add support to the core is if we can do it in a way that makes programs that use coroutines as straightforward to write as threaded programs.

Coroutines are essentially useful for the same reason any form of cooperative multitasking is useful - it's a way to structure a program as concurrently executed sequences of instructions without incurring the overhead of an OS level thread (or process!) for each operation stream.

>> Ideally, we would like to make it possible to write code like this:
>>
>>     def coroutine_friendly_io(*args, **kwds):
>>         if in_coroutine():
>>             return coyield AsychronousRequest(async_op, *args, **kwds)
>>         else:
>>             return sync_op(*args, **kwds)
>
> Why would you want one function to do double duty as both blocking and
> non-blocking? Particularly when the two branches don't appear to share any
> code (at least not in the example as given). To me, this seems like "Do What
> I Mean" magic which would be better written as a pair of functions:
>
>     def coroutine_friendly_io(*args, **kwds):
>         yield from AsychronousRequest(async_op, *args, **kwds)
>
>     def blocking_io(*args, **kwargs):
>         return sync_op(*args, **kwds)

If you can't merge the synchronous and asynchronous versions of your I/O routines, it means you end up having to write everything in the entire stack twice - once in an "event loop friendly" way (that promises never to block - generator based coroutines are just a way to help with writing code like that) and once in the normal procedural way. That's why Twisted has its own version of virtually all the I/O libraries in the standard library - the impedance mismatch between the standard library's blocking I/O model and the needs of event loop based programming is usually just too big to overcome at the library/framework level without significant duplication of functionality.

If the "block or suspend?" decision can instead be made at the lowest layers (as is currently possible with greenlets), then the intervening layers *don't need to care* what actually happens under the hood and the two worlds can start to move closer together.

Compare the current situation with cooperative multitasking to the comparative ease of programming with threaded code, where, with the GIL in place, there's a mixture of pre-emptive multi-tasking (the interpreter can switch threads between any two bytecodes) and cooperative multi-tasking (voluntarily relinquishing the GIL), all of which is handled at the *lowest* layer in the stack, and the upper layers generally couldn't care in the least which thread they're running in. You still have the inherent complexity of coping with shared data access in a concurrent environment to deal with, of course, but at least the code you're running in the individual threads is just ordinary Python code.

The cofunction PEP as it stands does nothing to help with that bifurcation problem, so I now see it as basically pointless. By contrast, the greenlets module (and, before that, Stackless itself) helps bring cooperative multitasking up to the same level as threaded programming - within a greenlet, you're just writing ordinary Python code. It just so happens that some of the functions you call may suspend your frame stack and start running a different one for a while.
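For example, with greenlets something along these lines works today (a sketch only - the event_loop object and its registration API are placeholders I'm inventing for the example):

    from greenlet import greenlet

    def read_data(sock):
        # An ordinary function, no yield in sight: it suspends the whole
        # coroutine by switching back to the event loop's greenlet, and
        # is resumed via a switch() when the socket is ready.
        event_loop.wait_readable(sock, greenlet.getcurrent())  # hypothetical API
        event_loop.main_greenlet.switch()
        return sock.recv(1024)

    def handler(sock):
        while True:
            data = read_data(sock)  # a plain call; suspension happens inside
            if not data:
                break
            print(data)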
The fact that greenlets exists (and works) demonstrates that it is possible to correctly manage and execute multiple Python stacks within a single OS thread. It's probably more worthwhile to focus on providing either symmetric (greenlets style, "switch to specific coroutine") or asymmetric (Lua style, "suspend and pass control back to the frame that invoked the coroutine") coroutines rather than assuming that the current "generators all the way down" restriction is an inviolable constraint.

As far as a concrete example goes, let's consider a simple screen echo utility that accepts an iterable and prints whatever comes out:

    def echo_all(iterable):
        for line in iterable:
            print(line)

So far so good. I can run that in my main OS thread, or in a subthread and it will all work happily. I can even pass in an operation that reads lines from a file or a socket.

OK, the socket case sounds potentially interesting, and it's obvious how it works with blocking IO, but what about asynchronous IO and an event loop? In current Python, the answer is that it *doesn't* work, *unless* you use greenlets (or Stackless). If you want to use generator based coroutines instead, you have to write a *new* version of echo_all and it's a mess, because you have requests to the event loop (e.g. "let me know when this socket has data ready") and the actual data to be printed intermingled in the same communications channel. And that all has to be manually passed back up to the event loop, because there's no standardised way for the socket read operation to say "look, just suspend this entire call stack until the socket has some data for me".

Having something like a codef/coyield pair that allows the entire coroutine stack to be suspended means a coroutine can use the given "echo_all" function as is. All that needs to happen is that:

1. echo_all() gets invoked (directly or indirectly) from a coroutine defined with codef
2. the passed in iterable uses coyield whenever it wants to pass control back to the event loop that called the coroutine in the first place

The intervening layer in the body of echo_all() can then remain blissfully unaware that anything unusual is going on, just as it currently has no idea when a socket read blocks, allowing other OS threads to execute while waiting for data to arrive.

Yes, the nonblocking I/O operations would need to have a protocol that they use to communicate with the event loop, but again, one of the virtues of true coroutines is that the details of that event loop implementation specific protocol can be hidden behind a standardised API.

Under such a scheme, 'return x' and 'yield x' and function calls would all retain their usual meanings. Instead of directly invoking 'coyield x', nonblocking I/O operations might write:

    event_loop.resume_when_ready(handle)  # Ask event loop to let us know when data is available

The need for trampoline scheduling largely goes away, since you only need to go back to the event loop when you're going to do something that may block. It's even possible that new keywords wouldn't be needed (as greenlets demonstrates).

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From mark at hotpy.org Fri Oct 28 11:33:36 2011
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 28 Oct 2011 10:33:36 +0100
Subject: [Python-ideas] Implementing Coroutines (was Cofunctions - Back to Basics)
In-Reply-To: <4EA94304.1010909@hotpy.org>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
Message-ID: <4EAA76F0.6080105@hotpy.org>

Hi,

Full coroutines have a number of advantages over the proposed cofunctions. For one, coroutines require *no* language changes.

Here is a simple coroutine class interface (feel free to substitute/add your own names/signatures):

    class Coroutine:

        def __init__(self, callable):
            ...

        def resume(value):
            '''Passes value to coroutine, either as initial parameter to
               callable or as return value of yield'''

        def co_yield(value):
            'Yields (returns) value back to caller of resume() method.'

        def stop():
            'Stops the coroutine'

In order to implement this, it must be possible to transfer control *sideways* from one frame to another, without unwinding either stack. In order to do this, Python-to-Python calls within the interpreter must not make calls at the C (OS/hardware) level. This is quite simple to implement in practice.

Currently CPython does something like this (for Python to Python calls):

    In interpreter:
        call PyFunction.__call__(parameters)
    In PyFunction.__call__():
        Create new frame with parameters.
        Call interpreter with new frame.

By changing the call sequence to something like:

    In interpreter:
        frame = PyFunction.make_frame(parameters)
        Push frame to frame-stack
        jump to start of new function.

we have a 'stackless' implementation which can support coroutines.

The implementation of Coroutines in a portable fashion is explained on pages 7 and 8 of this paper: http://www.lua.org/doc/jucs05.pdf (Sorry to keep quoting Lua papers, but it is a cleaner VM than CPython. Python is a better language, though :) )

Stackless uses a slightly different approach, not for any fundamental reason, but because their goal is to minimise the diff with CPython.

Adding coroutines may create problems for the other implementations. PyPy already has a stackless implementation, but Jython and IronPython might find it harder to implement. Any Jython or IronPython folks care to comment?

In summary, the cofunction proposal is a work-around for a limitation in the VM. By fixing the VM we can have proper coroutines. Surely, it is better to make the VM support the features we want/need rather than bend those features to fit the VM?

Cheers,
Mark.

From p.f.moore at gmail.com Fri Oct 28 12:10:29 2011
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 28 Oct 2011 11:10:29 +0100
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
Message-ID: 

On 28 October 2011 00:56, Nick Coghlan wrote:
> I highly recommend reading the article Mark Shannon linked earlier in
> the thread. I confess I didn't finish the whole thing, but even the
> first half of it made me realise *why* coroutine programming in Python
> (sans Stackless or greenlet) is such a pain: *every* frame on the
> coroutine stack has to be a generator frame in order to support
> suspending the generator.

Thanks for the link, and the detailed explanation.
I will read the link, but as it's about Lua coroutines I am probably at least partially aware of the details, as I used to program a little in Lua, and understood coroutines from there (and other places, the idea is familiar to me).

I'm not against coroutines in principle (they certainly do have uses) but I wonder whether this PEP (and indeed PEP 342, where the basics of coroutines in Python came from) actually offers the sort of programming interface that people will actually use.

To borrow further from Lua, they provide coroutines via a standard library module:

    coroutine.create(fn)     -- convert a function to a coroutine
    co.resume(arg...)        -- resume a coroutine, with arguments
    coroutine.running()      -- the current coroutine
    co.status()              -- running, suspended, normal or dead
    coroutine.wrap(f)        -- helper combining create and resume
    coroutine.yield(arg...)  -- yield back to whoever called resume

There's obviously language runtime support behind the scenes, but conceptually, this is the sort of model I think people can make real use of. It's precisely the sort of asymmetric model you mention later, and I agree that it's more user-friendly (albeit less powerful) than fully symmetric coroutines.

So in my view, any coroutine PEP should be measured (at least in part) against how well it maps to these sorts of core concepts, at least if it's aimed at a non-specialist userbase.

> PEP 380 (i.e. "yield from") makes it easier to *create* those stacks
> of generator frames, but it doesn't make the need for them to go away.
> Proceeding further down that path (as PEP 3152 does) would essentially
> partition Python programming into two distinct subsets: 'ordinary'
> programming (where you can freely mix generators and ordinary function
> frames) and 'generator coroutine' programming (where it *has* to be
> generators all the way down to get suspension to work).

This comment begs the question, is it the right thing to do to split Python programming into two subsets, as you suggest?

> Frankly, now that I understand the problem more clearly, attempting to
> attack it by making it easier to create stacks consisting entirely of
> generator frames strikes me as a terrible idea. Far better to find a
> way to have a dynamic "non-local" yield construct that yields from the
> *outermost* generator frame in the stack, regardless of whether the
> current frame is a generator or not.

Hmm, OK. Sort of... I think the effect is correct, but it makes my head hurt having to think about it in terms of the internals of stacks and frames rather than in terms of my code... (I spent a few minutes thinking that the non-local yield should go *to* the generator frame, rather than yield *from* it. And doesn't this make generator loops within generator loops impossible? I don't know if they are needed, but I'd hate to bake in a limitation like that without considering if it's OK.)

> Ideally, we would like to make it possible to write code like this:
>
>     def coroutine_friendly_io(*args, **kwds):
>         if in_coroutine():
>             return coyield AsychronousRequest(async_op, *args, **kwds)
>         else:
>             return sync_op(*args, **kwds)
>
> If you look at the greenlet docs
> (http://packages.python.org/greenlet/) after reading the article on
> Lua's coroutines, you'll realise that greenlet implements *symmetric*
> coroutines - you have to explicitly state which greenlet you are
> switching to. You can then implement asymmetric coroutines on top of
> that by always switching to a specific scheduler thread.
Given that the greenlet library exists (and is used by other projects, according to its PyPI page) why all the focus on core support? Seriously, how does the greenlet library fail to provide whatever users need? (Other than "it's not in the core", which could be fixed simply by adding it to the stdlib.) I can't see any notes in the greenlet documentation which imply there are limitations/issues that might be a problem.

> To achieve 'natural' coroutine programming, a Lua style asymmetric
> coroutine approach looks the most promising to me.

+1. Although I'm not sure whether this needs to be in the core (i.e. with language and/or syntax support), or in the stdlib, or just as a wrapper round the greenlet library.

Paul

From mark at hotpy.org Fri Oct 28 12:16:16 2011
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 28 Oct 2011 11:16:16 +0100
Subject: [Python-ideas] Implementing Coroutines (was Cofunctions - Back to Basics)
In-Reply-To: <4EAA76F0.6080105@hotpy.org>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAA76F0.6080105@hotpy.org>
Message-ID: <4EAA80F0.7010104@hotpy.org>

Errata to previous email.

>     def co_yield(value):
>         'Yields (returns) value back to caller of resume() method.'

Should have been

    @staticmethod
    def co_yield(value):
        'Yields (returns) value back to caller of resume() method.'

Cheers,
Mark.

From ubershmekel at gmail.com Fri Oct 28 12:58:19 2011
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Fri, 28 Oct 2011 12:58:19 +0200
Subject: [Python-ideas] Implementing Coroutines (was Cofunctions - Back to Basics)
In-Reply-To: <4EAA80F0.7010104@hotpy.org>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAA76F0.6080105@hotpy.org> <4EAA80F0.7010104@hotpy.org>
Message-ID: 

I'm sorry, I still don't understand what the problem here is. I didn't have any trouble making a python implementation for the wikipedia coroutine example:

    http://codepad.org/rEgg4GzW

On Fri, Oct 28, 2011 at 12:16 PM, Mark Shannon wrote:
> Errata to previous email.
>
>     def co_yield(value):
>         'Yields (returns) value back to caller of resume() method.'
>
> Should have been
>
>     @staticmethod
>     def co_yield(value):
>         'Yields (returns) value back to caller of resume() method.'
>
> Cheers,
> Mark.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mark at hotpy.org Fri Oct 28 14:09:03 2011
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 28 Oct 2011 13:09:03 +0100
Subject: [Python-ideas] Implementing Coroutines (was Cofunctions - Back to Basics)
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAA76F0.6080105@hotpy.org> <4EAA80F0.7010104@hotpy.org>
Message-ID: <4EAA9B5F.2050804@hotpy.org>

Yuval Greenfield wrote:
> I'm sorry, I still don't understand what the problem here is. I didn't
> have any trouble making a python implementation for the wikipedia
> coroutine example:
>
> http://codepad.org/rEgg4GzW

Where is the yield method? You need to be able to call Coroutine.co_yield() from *anywhere*, not just in a generator.
Suppose we have a Tree class with a walk method which takes a callback function:

    class Tree:

        class Node:
            def visit(self, callback):
                if self.left:
                    self.left.visit(callback)
                if self.right:
                    self.right.visit(callback)
                callback(self.value)

        def walk(self, callback):
            '''Recursively walks the tree calling callback(node) for each node'''
            self.root.visit(callback)

We can then use Coroutine to create a tree iterator, with something like:

    def tree_callback(node):
        Coroutine.co_yield(node)

    # Make an iterator from a coroutine
    def tree_iterator(tree):
        co = Coroutine(tree.walk)
        yield co.resume(tree_callback)
        while True:
            yield co.resume(None)

(I am glossing over questions like: Should a stopping coroutine raise an exception from the resume method or just return a value. Should a stopped coroutine that is resumed raise a StopIteration exception, a GeneratorExit exception or some new exception, etc, etc...)

The important point here is that Tree.walk() is recursive and knows nothing of generators or coroutines, yet can be made to drive a generator.

> On Fri, Oct 28, 2011 at 12:16 PM, Mark Shannon wrote:
>
>     Errata to previous email.
>
>     def co_yield(value):
>         'Yields (returns) value back to caller of resume() method.'
>
>     Should have been
>
>     @staticmethod
>     def co_yield(value):
>         'Yields (returns) value back to caller of resume() method.'
>
>     Cheers,
>     Mark.
>     _______________________________________________
>     Python-ideas mailing list
>     Python-ideas at python.org
>     http://mail.python.org/mailman/listinfo/python-ideas

From ncoghlan at gmail.com Fri Oct 28 14:11:26 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Oct 2011 22:11:26 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
Message-ID: 

On Fri, Oct 28, 2011 at 8:10 PM, Paul Moore wrote:
> On 28 October 2011 00:56, Nick Coghlan wrote:
>> PEP 380 (i.e. "yield from") makes it easier to *create* those stacks
>> of generator frames, but it doesn't make the need for them to go away.
>> Proceeding further down that path (as PEP 3152 does) would essentially
>> partition Python programming into two distinct subsets: 'ordinary'
>> programming (where you can freely mix generators and ordinary function
>> frames) and 'generator coroutine' programming (where it *has* to be
>> generators all the way down to get suspension to work).
>
> This comment begs the question, is it the right thing to do to split
> Python programming into two subsets, as you suggest?

That's actually the status quo - you can right code normally, or you can write it inside out Twisted style.

>> Frankly, now that I understand the problem more clearly, attempting to
>> attack it by making it easier to create stacks consisting entirely of
>> generator frames strikes me as a terrible idea. Far better to find a
>> way to have a dynamic "non-local" yield construct that yields from the
>> *outermost* generator frame in the stack, regardless of whether the
>> current frame is a generator or not.
>
> Hmm, OK. Sort of... I think the effect is correct, but it makes my
> head hurt having to think about it in terms of the internals of stacks
> and frames rather than in terms of my code... (I spent a few minutes
> thinking that the non-local yield should go *to* the generator frame,
> rather than yield *from* it. And doesn't this make generator loops
> within generator loops impossible?
> I don't know if they are needed, but
> I'd hate to bake in a limitation like that without considering if it's
> OK.)

Nah, the real beauty of a new mechanism is that it would leave the existing generator channels untouched, so we wouldn't be running *any* risk of breaking it. Generators would still use an ordinary yield and borrow the stack frame of the caller - it's only the new coroutines that would be creating a truly independent frame stack.

>> If you look at the greenlet docs
>> (http://packages.python.org/greenlet/) after reading the article on
>> Lua's coroutines, you'll realise that greenlet implements *symmetric*
>> coroutines - you have to explicitly state which greenlet you are
>> switching to. You can then implement asymmetric coroutines on top of
>> that by always switching to a specific scheduler thread.
>
> Given that the greenlet library exists (and is used by other projects,
> according to its PyPI page) why all the focus on core support?
> Seriously, how does the greenlet library fail to provide whatever
> users need? (Other than "it's not in the core" which could be fixed
> simply by adding it to the stdlib). I can't see any notes in the
> greenlet documentation which imply there are limitations/issues that
> might be a problem.

I believe greenlets supports switching even when there's a C function on the stack, and they use hand written assembly code to do it. So the answer to the question "Why not just use greenlets?" is the following file (and its friends in that directory):

    https://bitbucket.org/ambroff/greenlet/src/77363116e78d/platform/switch_amd64_unix.h

That said, *without* that kind of assembly code, we'd probably face the same problem Lua does (i.e. coroutines can't suspend when there's a C function on the frame stack), which could potentially limit the feature's usefulness (e.g. you migrate a function to C or Cython and suddenly your coroutines break).

>> To achieve 'natural' coroutine programming, a Lua style asymmetric
>> coroutine approach looks the most promising to me.
>
> +1. Although I'm not sure whether this needs to be in the core (i.e.
> with language and/or syntax support), or in the stdlib, or just as a
> wrapper round the greenlet library.

I'm not sure either - I suspect that without the low level assembly code that greenlets relies on, a coroutine library for Python would need support in the core to do the frame stack switching.

Years ago, I believe Guido was amenable to the idea of merging changes back from Stackless, maybe he'd be open to the idea of supporting a bit of assembly code in order to get full fledged coroutines in the standard library. It has the usual activation barrier though (i.e. someone seeking the blessing of the greenlets authors, writing a PEP and championing it against all comers).

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stephen at xemacs.org Fri Oct 28 16:16:21 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 28 Oct 2011 23:16:21 +0900
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
Message-ID: <87pqhh2ocq.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:

> That's actually the status quo - you can right code normally, or you
> can write it inside out Twisted style.

Is that to say that once Twisted-style code falls down, it can't get back up again? Sounds like the Irish side of my family.
Some-typos-aren't-caught-by-automatic-spellcheckers-ly y'rs,

From carl at oddbird.net Fri Oct 28 17:48:50 2011
From: carl at oddbird.net (Carl Meyer)
Date: Fri, 28 Oct 2011 09:48:50 -0600
Subject: [Python-ideas] Draft PEP for virtualenv in the stdlib
In-Reply-To: <4EA5AC93.2020305@oddbird.net>
References: <4EA5AC93.2020305@oddbird.net>
Message-ID: <4EAACEE2.5040607@oddbird.net>

Thanks again to everyone who offered feedback, questions, and suggestions on the draft PEP. I've made substantial revisions and additions to the PEP based on discussion here.

A couple changes to call out specifically that I haven't already mentioned:

* I've attempted to add more clarity and detail to the discussion of ``sys.prefix`` vs ``sys.site_prefix``, based on Jim's questions.

* After discussion with Vinay, we're sticking with ``sys.site_prefix`` naming for now in the PEP and reference implementation. I've added an "open questions" section summarizing the discussion here, the concerns Nick raised about that name, and both the ``sys.venv_prefix`` and ``sys.local_prefix`` alternative suggestions. We'll see what additional feedback there is from python-dev on that question.

Sending the draft on to the PEP editors and python-dev now, as outlined in PEP 1.

Thanks!

Carl

From merwok at netwok.org Fri Oct 28 18:15:22 2011
From: merwok at netwok.org (Éric Araujo)
Date: Fri, 28 Oct 2011 18:15:22 +0200
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EA32507.7010900@pearwood.info>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info>
Message-ID: <4EAAD51A.9030608@netwok.org>

Hi,

On 22/10/2011 22:18, Steven D'Aprano wrote:
> I'm just going to repeat what I've said before: explicit is better than
> implicit. If you want the name of an object (be it a class, a module, a
> function, or something else), you should explicitly ask for the name,
> and not rely on its str().
>
> [...]
>
> But for the sake of the argument, I'll grant you that we're free to
> change str(cls) to return the class name, as requested by the OP, or the
> fully qualified module.class dotted name as suggested by you. So let's
> suppose that, after a long and bitter debate over which colour to paint
> this bikeshed, you win the debate.

Hm. Sometimes we want the class name, sometimes module.class, so even with the change we won't always be able to use str(cls).

> But this doesn't help you at all, because you can't rely on it. It seems
> to me that the exact format of str(cls) is an implementation detail. You
> can't rely on other Pythons to do the same thing, nor can you expect a
> guarantee that str(cls) won't change again in the future. So if you care
> about the exact string that gets generated, you still have to explicitly
> use cls.__name__ just as you do now.

This is a very good point.

The output of repr and str is not (TTBOMK) exactly defined or guaranteed; nonetheless, I expect that many people (including me) rely on some conversions (like the fact that repr('somestr') includes quotes).
So we can change str(cls) and say that *now* it has defined output, or leave it alone to avoid breaking code that does depend on the output, which can be seen as a wrong thing or a pragmatic thing ('I need it and it works').

Regards

From guido at python.org Fri Oct 28 19:51:47 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Oct 2011 10:51:47 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: <4EAAD51A.9030608@netwok.org>
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org>
Message-ID: 

On Fri, Oct 28, 2011 at 9:15 AM, Éric Araujo wrote:
> On 22/10/2011 22:18, Steven D'Aprano wrote:
>> I'm just going to repeat what I've said before: explicit is better than
>> implicit. If you want the name of an object (be it a class, a module, a
>> function, or something else), you should explicitly ask for the name,
>> and not rely on its str().
>>
>> [...]
>>
>> But for the sake of the argument, I'll grant you that we're free to
>> change str(cls) to return the class name, as requested by the OP, or the
>> fully qualified module.class dotted name as suggested by you. So let's
>> suppose that, after a long and bitter debate over which colour to paint
>> this bikeshed, you win the debate.
>
> Hm. Sometimes we want the class name, sometimes module.class, so even
> with the change we won't always be able to use str(cls).

It is a well-known fact of humanity that you can't please anyone. There's not that much data on how often the full name is better; my hunch however is that most of the time the class name is sufficiently unique within the universe of classes that could be printed, and showing the module name just feels pedantic. Apps that know just the name is sub-optimal should stick to rendering using cls.__module__ and cls.__name__.

>> But this doesn't help you at all, because you can't rely on it. It seems
>> to me that the exact format of str(cls) is an implementation detail. You
>> can't rely on other Pythons to do the same thing, nor can you expect a
>> guarantee that str(cls) won't change again in the future. So if you care
>> about the exact string that gets generated, you still have to explicitly
>> use cls.__name__ just as you do now.
>
> This is a very good point.
>
> The output of repr and str is not (TTBOMK) exactly defined or
> guaranteed; nonetheless, I expect that many people (including me) rely
> on some conversions (like the fact that repr('somestr') includes
> quotes). So we can change str(cls) and say that *now* it has defined
> output, or leave it alone to avoid breaking code that does depend on the
> output, which can be seen as a wrong thing or a pragmatic thing ('I need
> it and it works').

In my view, str() and repr() are both for human consumption (though in somewhat different contexts). If tweaking them helps humans understand the output better then let's tweak them. If you as a developer feel particularly anal about how you want your object printed, you should avoid repr() or str() and write your own formatting function.

If as a programmer you feel the urge to go parse the output of repr() or str(), you should always *know* that a future version of Python can break your code, and you should file a feature request to have an API added to the class so you won't have to parse the repr() or str().
--
--Guido van Rossum (python.org/~guido)

From van.lindberg at gmail.com Fri Oct 28 23:53:10 2011
From: van.lindberg at gmail.com (VanL)
Date: Fri, 28 Oct 2011 16:53:10 -0500
Subject: [Python-ideas] Draft PEP for the regularization of Python install layouts
Message-ID: 

In part inspired by the virtualenv-in-the-stdlib PEP, I figured that it might be a good time to draft up a PEP to fix one of my regular annoyances: the ever-so-slightly different layouts for Python between platforms. For someone who develops on Windows and a Mac and deploys on Linux, this is a major pain in the rear - so much so that I always change my system Python install to have matching environments.

I harbor no illusions that this is necessarily a thorn in anyone else's side, or that this will go anywhere. However, I have written it up and I would love any feedback.

Thanks,
Van

=====================================

PEP: XXX
Title: Consistent Python Environment Layout Across Platforms
Version: $Revision$
Last-Modified: $Date$
Author: Van Lindberg
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-Oct-2011
Python-Version: 3.3
Post-History: 28-Oct-2011

Abstract
========

Python currently uses different environment layouts based upon the underlying operating system and the interpreter implementation language. This PEP proposes to regularize the directory layout for Python environments.

Motivation
==========

One of Python's strengths is its cross-platform appeal. Carefully-written Python programs are frequently portable between operating systems and Python implementations with very few changes. Over the years, substantial effort has been put into maintaining platform parity and providing consistent interfaces to available functionality, even when different underlying implementations are necessary (such as with ntpath and posixpath).

One place where Python is unnecessarily different, however, is in the layout and organization of the Python environment. This is most visible in the name of the directory for binaries on the Windows platform ("Scripts") versus the name of the directory for binaries on every other platform ("bin"), but a full listing of the layouts shows substantial differences in layout and capitalization across platforms. Sometimes the include is capitalized ("Include"), and sometimes not; and the Python version may or may not be included in the path to the standard library.

The differences between platforms become particularly noticeable when attempting to do cross-platform development in an isolated environment like a `virtualenv`_ or the proposed Python Virtual Environments. Differences between platforms get hard-coded in or worked around so that the same code will just work without having to artificially take the platform into account.

This information can also be made available to Python tools (like virtualenv, `distribute`_, or `pip`_) so that environment directories can be more easily found and changed.

Regularizing the Python environment layout across platforms will lower the differences between platforms, making it more consistent for all users, and enforce a unified API for virtual environments and installers rather than ad-hoc detection or hard-coded values.

.. _virtualenv: http://www.virtualenv.org
.. _distribute: http://packages.python.org/distribute/
.. _pip: http://www.pip-installer.org/

Specification
=============

When the Python binary is executed, it imports both sysconfig and distutils.command.install.
These two modules contain slightly different versions of the environment layout. They are:

From sysconfig::

    _INSTALL_SCHEMES = {
        'posix_prefix': {
            'stdlib': '{base}/lib/python{py_version_short}',
            'platstdlib': '{platbase}/lib/python{py_version_short}',
            'purelib': '{base}/lib/python{py_version_short}/site-packages',
            'platlib': '{platbase}/lib/python{py_version_short}/site-packages',
            'include': '{base}/include/python{py_version_short}',
            'platinclude': '{platbase}/include/python{py_version_short}',
            'scripts': '{base}/bin',
            'data': '{base}',
            },
        'posix_home': {
            'stdlib': '{base}/lib/python',
            'platstdlib': '{base}/lib/python',
            'purelib': '{base}/lib/python',
            'platlib': '{base}/lib/python',
            'include': '{base}/include/python',
            'platinclude': '{base}/include/python',
            'scripts': '{base}/bin',
            'data': '{base}',
            },
        'nt': {
            'stdlib': '{base}/Lib',
            'platstdlib': '{base}/Lib',
            'purelib': '{base}/Lib/site-packages',
            'platlib': '{base}/Lib/site-packages',
            'include': '{base}/Include',
            'platinclude': '{base}/Include',
            'scripts': '{base}/Scripts',
            'data': '{base}',
            },
        'os2': {
            'stdlib': '{base}/Lib',
            'platstdlib': '{base}/Lib',
            'purelib': '{base}/Lib/site-packages',
            'platlib': '{base}/Lib/site-packages',
            'include': '{base}/Include',
            'platinclude': '{base}/Include',
            'scripts': '{base}/Scripts',
            'data': '{base}',
            },
        'os2_home': {
            'stdlib': '{userbase}/lib/python{py_version_short}',
            'platstdlib': '{userbase}/lib/python{py_version_short}',
            'purelib': '{userbase}/lib/python{py_version_short}/site-packages',
            'platlib': '{userbase}/lib/python{py_version_short}/site-packages',
            'include': '{userbase}/include/python{py_version_short}',
            'scripts': '{userbase}/bin',
            'data': '{userbase}',
            },
        'nt_user': {
            'stdlib': '{userbase}/Python{py_version_nodot}',
            'platstdlib': '{userbase}/Python{py_version_nodot}',
            'purelib': '{userbase}/Python{py_version_nodot}/site-packages',
            'platlib': '{userbase}/Python{py_version_nodot}/site-packages',
            'include': '{userbase}/Python{py_version_nodot}/Include',
            'scripts': '{userbase}/Scripts',
            'data': '{userbase}',
            },
        'posix_user': {
            'stdlib': '{userbase}/lib/python{py_version_short}',
            'platstdlib': '{userbase}/lib/python{py_version_short}',
            'purelib': '{userbase}/lib/python{py_version_short}/site-packages',
            'platlib': '{userbase}/lib/python{py_version_short}/site-packages',
            'include': '{userbase}/include/python{py_version_short}',
            'scripts': '{userbase}/bin',
            'data': '{userbase}',
            },
        'osx_framework_user': {
            'stdlib': '{userbase}/lib/python',
            'platstdlib': '{userbase}/lib/python',
            'purelib': '{userbase}/lib/python/site-packages',
            'platlib': '{userbase}/lib/python/site-packages',
            'include': '{userbase}/include',
            'scripts': '{userbase}/bin',
            'data': '{userbase}',
            },
        }

From distutils.command.install::

    if sys.version < "2.2":
        WINDOWS_SCHEME = {
            'purelib': '$base',
            'platlib': '$base',
            'headers': '$base/Include/$dist_name',
            'scripts': '$base/Scripts',
            'data'   : '$base',
        }
    else:
        WINDOWS_SCHEME = {
            'purelib': '$base/Lib/site-packages',
            'platlib': '$base/Lib/site-packages',
            'headers': '$base/Include/$dist_name',
            'scripts': '$base/Scripts',
            'data'   : '$base',
        }

    INSTALL_SCHEMES = {
        'unix_prefix': {
            'purelib': '$base/lib/python$py_version_short/site-packages',
            'platlib': '$platbase/lib/python$py_version_short/site-packages',
            'headers': '$base/include/python$py_version_short/$dist_name',
            'scripts': '$base/bin',
            'data'   : '$base',
            },
        'unix_home': {
            'purelib': '$base/lib/python',
            'platlib': '$base/lib/python',
            'headers': '$base/include/python/$dist_name',
            'scripts': '$base/bin',
            'data'   : '$base',
            },
        'unix_user': {
            'purelib': '$usersite',
            'platlib': '$usersite',
            'headers': '$userbase/include/python$py_version_short/$dist_name',
            'scripts': '$userbase/bin',
            'data'   : '$userbase',
            },
        'nt': WINDOWS_SCHEME,
        'nt_user': {
            'purelib': '$usersite',
            'platlib': '$usersite',
            'headers': '$userbase/Python$py_version_nodot/Include/$dist_name',
            'scripts': '$userbase/Scripts',
            'data'   : '$userbase',
            },
        'os2': {
            'purelib': '$base/Lib/site-packages',
            'platlib': '$base/Lib/site-packages',
            'headers': '$base/Include/$dist_name',
            'scripts': '$base/Scripts',
            'data'   : '$base',
            },
        'os2_home': {
            'purelib': '$usersite',
            'platlib': '$usersite',
            'headers': '$userbase/include/python$py_version_short/$dist_name',
            'scripts': '$userbase/bin',
            'data'   : '$userbase',
            },
        }

There is an API call (sysconfig.get_config_var) that is used to resolve these various variables, but there is no way to set them other than directly clobbering the _INSTALL_SCHEMES/INSTALL_SCHEMES dicts holding the information (in both sysconfig and in distutils.command.install).

This PEP proposes to change the default installation layout as follows:

- Define a new config var, py_binaries, and a corresponding module-level variable, _PY_BINARIES, with the default value 'bin'

- Change sysconfig's _INSTALL_SCHEMES to INSTALL_SCHEMES, and make it part of the public API. A default scheme equivalent to the current posix_prefix would be added and made to be the default.

- To support virtual environments, a special key "current" would be added to INSTALL_SCHEMES. The value associated with "current" would be the name of the currently active scheme. During activation, a virtual environment could place a custom layout in the INSTALL_SCHEMES dict and set "current" to the custom scheme. This could also be used to support legacy or custom layouts. This would be done as follows::

    from collections import defaultdict

    _DEFAULT_SCHEME = {
        'stdlib': '{userbase}/lib/python{py_version_short}',
        'platstdlib': '{userbase}/lib/python{py_version_short}',
        'purelib': '{userbase}/lib/python{py_version_short}/site-packages',
        'platlib': '{userbase}/lib/python{py_version_short}/site-packages',
        'include': '{userbase}/include/python{py_version_short}',
        'headers': '{userbase}/include/python{py_version_short}',
        'scripts': '{userbase}/{py_binaries}',
        'data'   : '{userbase}',
        }

    INSTALL_SCHEMES = defaultdict(lambda: _DEFAULT_SCHEME.copy())
    INSTALL_SCHEMES['install'] = _DEFAULT_SCHEME
    INSTALL_SCHEMES['current'] = 'install'

- Change distutils.command.install to use information from INSTALL_SCHEMES instead of embedding its own copy of the dict.

By using a defaultdict, prior users of the API could transparently query the dict using the platform value and receive back a reasonable response. The current virtual environment would be retrievable by getting INSTALL_SCHEMES[INSTALL_SCHEMES['current']], and the installed scheme retrievable by getting INSTALL_SCHEMES['install'].

Open Issues
===========

By default, OS X Framework installs use a format that is different than the one specified above. Possible responses would be to change the layout just for OS X, but special-casing as little as possible.

The double indirection to get the current install scheme is a little ugly, but I am not sure what would be better.

This explicitly removes the compatibility code in distutils for 2.x install layouts (particularly layouts pre-2.3). This is not considered a major issue, as 2.2 code will largely not be compatible with 3.3, and an individual user could add a custom layout if needed.
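It may also be worth spelling out the intended activation workflow in the PEP itself. Purely as an illustration of the API proposed above (none of this exists yet, so names and details are only sketched), a virtualenv-style tool might do something like::

    import sysconfig

    # On activation: register a custom layout and make it current.
    venv_scheme = dict(sysconfig.INSTALL_SCHEMES['install'])
    venv_scheme['scripts'] = '{base}/{py_binaries}'
    sysconfig.INSTALL_SCHEMES['my_venv'] = venv_scheme
    sysconfig.INSTALL_SCHEMES['current'] = 'my_venv'

    # Installers then resolve paths through the current scheme.
    current = sysconfig.INSTALL_SCHEMES[sysconfig.INSTALL_SCHEMES['current']]
    scripts_dir = current['scripts'].format(
        base=sysconfig.get_config_var('base'),
        py_binaries=sysconfig.get_config_var('py_binaries'))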
Based on actual use, some of the distinctions that appeared necessary when distutils and sysconfig were first written (platlib, purelib, include, headers) seem to be duplicative. As written above, the distinctions are maintained for ease of moving to the standardized layout, but it may be worth choosing just one name and sticking with it.

From python at zesty.ca Sat Oct 29 00:00:57 2011
From: python at zesty.ca (Ka-Ping Yee)
Date: Fri, 28 Oct 2011 15:00:57 -0700 (PDT)
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: 
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org>
Message-ID: 

Hi there,

I get that repr() is supposed to be the precise representation and str() is intended more to be friendly than precise. My concern with the proposal is just that this:

    >>> print x
    foo

...doesn't actually feel that friendly to me. I want to know that it's *probably* a function or *probably* a class, the same way that today, when I see:

    >>> print x
    biscuit
    >>> print y
    [1, 2, 3]

I can guess that x is *probably* a string and y is *probably* a list (e.g. because I know I'm not working with any custom objects whose __str__ returns those things). It would create a slightly higher mental burden (or slightly higher probability of human error) if, when I see:

    >>> print x
    Splat

...I have to remember that x might be a string or a function or a class.

I'd just like some kind of visual hint as to what it is. Like:

    >>> print x
    foo()

or:

    >>> print x
    function foo

or:

    >>> print x
    function foo(a, b)

or:

    >>> print x
    class Bar

In fact "function foo(a, b)" would actually be rather useful in a lot of situations, and I would argue, friendlier than "foo".

--Ping

From greg.ewing at canterbury.ac.nz Sat Oct 29 00:10:52 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 11:10:52 +1300
Subject: [Python-ideas] Cofunctions - Rev 6
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> <4EA90BED.4000505@canterbury.ac.nz> <4EA918CD.1050505@canterbury.ac.nz>
Message-ID: <4EAB286C.7000406@canterbury.ac.nz>

Nick Coghlan wrote:

> I think having such a
> construct will help qualm many of the fears people had about the
> original version of the implicit invocation proposal - just as
> try/except blocks help manage exceptions and explicit locks help
> manage thread preemption, being able to force ordinary call semantics
> for a suite would allow people to effectively manage implicit
> coroutine suspension in cases where they felt it mattered.

Maybe, but I'm still not convinced that simply factoring out the critical section into a 'def' function isn't sufficient to achieve the same ends of auditability and protection from unexpected changes in library semantics.

Also, elsewhere you're arguing that the ideal situation would be for there to be no distinction at all between normal code and coroutine code, and any piece of code would be able to carry out a coroutine suspension. Would you still want a critical section construct in such a world, and if so, how exactly would it work?
-- Greg From greg.ewing at canterbury.ac.nz Sat Oct 29 00:15:48 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Oct 2011 11:15:48 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94304.1010909@hotpy.org> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> Message-ID: <4EAB2994.6010103@canterbury.ac.nz> Mark Shannon wrote: > Stackless provides coroutines. Greenlets are also coroutines (I think). > > Lua has them, and is implemented in ANSI C, so it can be done portably. These all have drawbacks. Greenlets are based on non-portable (and, I believe, slightly dangerous) C hackery, and I'm given to understand that Lua coroutines can't be suspended from within a C function. My proposal has limitations, but it has the advantage of being based on fully portable and well-understood techniques. -- Greg From ethan at stoneleaf.us Sat Oct 29 00:20:23 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 28 Oct 2011 15:20:23 -0700 Subject: [Python-ideas] Changing str(someclass) to return only the class name In-Reply-To: References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org> Message-ID: <4EAB2AA7.10406@stoneleaf.us> Ka-Ping Yee wrote: > I'd just like some kind of visual hint as to what it is. Like: > > >>> print x > foo() > > or: > > >>> print x > function foo > > or: > > >>> print x > function foo(a, b) > > or: > > >>> print x > class Bar > > In fact "function foo(a, b)" would actually be rather useful > in a lot of situations, and I would argue, friendlier than "foo". +1 If we're gonna make a change, let's make it a great one. :) ~Ethan~ From greg.ewing at canterbury.ac.nz Sat Oct 29 00:27:28 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Oct 2011 11:27:28 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> Message-ID: <4EAB2C50.2090300@canterbury.ac.nz> Arnaud Delobelle wrote: > Hi, I've taken the liberty to translate your examples using a small > utility that I wrote some time ago and modified to mimic your proposal > (imperfectly, of course, since this runs on unpatched python). FWIW, > you can see and download the resulting files here (.html files for > viewing, .py files for downloading): > > http://www.marooned.org.uk/~arno/cofunctions/ Thanks for your efforts, although I'm not sure how much it helps to see them this way -- the whole point is to enable writing such code *without* going through any great contortions. I suppose they serve as an example of what you would be saved from writing. -- Greg From ethan at stoneleaf.us Sat Oct 29 00:40:02 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 28 Oct 2011 15:40:02 -0700 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAB2994.6010103@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> Message-ID: <4EAB2F42.2020704@stoneleaf.us> Greg Ewing wrote: > Mark Shannon wrote: > >> Stackless provides coroutines. Greenlets are also coroutines (I think). >> >> Lua has them, and is implemented in ANSI C, so it can be done portably. > > These all have drawbacks. 
Greenlets are based on non-portable > (and, I believe, slightly dangerous) C hackery, and I'm given > to understand that Lua coroutines can't be suspended from > within a C function. > > My proposal has limitations, but it has the advantage of > being based on fully portable and well-understood techniques. If Stackless has them, could we use that code? ~Ethan~ From tjreedy at udel.edu Sat Oct 29 00:58:34 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 28 Oct 2011 18:58:34 -0400 Subject: [Python-ideas] Changing str(someclass) to return only the class name In-Reply-To: References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org> Message-ID: On 10/28/2011 1:51 PM, Guido van Rossum wrote: > In my view, str() and repr() are both for human consumption (though in > somewhat different contexts). If tweaking them helps humans understand > the output better then let's tweak them. With that explanation, I am fine with whatever you decide. Without expensive study, 'human friendliness' is ultimately a judgment call. > If you as a developer feel > particularly anal about how you want your object printed, you should > avoid repr() or str() and write your own formatting function. The only guarantee in the str doc is that str(astring) is astring. (The repr doc says much more, but you are not currently proposing to change that.) With that guarantee, users have complete control of output. When comparing actual to expected strings, I should and will continue using .__name__ and .format(). > If as a programmer you feel the urge to go parse the output of repr() > or str(), you should always *know* that a future version of Python can > break your code, doctest is known to be fragile and has gimmicks already to avoid breakage. That tail should not wag the dog. > and you should file a feature request to have an API > added to the class so you won't have to parse the repr() or str(). I agree that classes should make all vital info available as Python objects that one can compute with. A similar issue came on the tracker with the suggestion that the dis module should gain a computation friendly interface to return the objects that currently get formatted into each line. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Sat Oct 29 01:37:22 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Oct 2011 12:37:22 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EA94C53.2060209@pearwood.info> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> Message-ID: <4EAB3CB2.4030805@canterbury.ac.nz> Steven D'Aprano wrote: > (1) You state that it is a "special kind of generator", but don't give > any clue as to how it is special. You're supposed to read the rest of the specification to find that out. > (2) Cofunctions, apparently, "may" contain yield or yield from. > Presumably that means that yield is optional, otherwise it would be > "must" rather than "may". So what sort of generator do you get without a > yield? The PEP doesn't give me any clue. A cofunction needs to be able to 'yield', because that's the way you suspend a cofunction-based coroutine. It needs to still be a generator even if it doesn't contain a 'yield', because it may call another cofunction that does yield. So the fact that it's defined with 'codef' makes it a generator, regardless of whether it directly contains any yields. 
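A rough illustration of this point using plain generators and Python
3.3's "yield from" (PEP 380 semantics assumed; this is a hand-written
sketch, not code from the thread): step() below contains no bare
'yield' of its own, yet must still be a generator so that the
suspension in pause() can pass through it:

    def pause():
        # The only bare 'yield': a pure suspension point.
        yield

    def step():
        # No bare 'yield' here, but this must still be a generator,
        # because the suspension in pause() has to pass through it.
        yield from pause()
        return 42

    def task():
        result = yield from step()
        print("step returned", result)

    t = task()
    next(t)           # runs until the suspension point inside pause()
    try:
        t.send(None)  # resume; task() prints "step returned 42"
    except StopIteration:
        pass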
> "From the outside, the distinguishing feature of a cofunction is that it > cannot be called directly from within the body of an ordinary function. > An exception is raised if such a call to a cofunction is attempted." > > Many things can't be called as functions: ints, strings, lists, etc. It > simply isn't clear to me how cofunctions are different from any other > non-callable object. I mean it's what distinguishes cofunctions from functions defined with 'def'. The comparison is between cofunctions and normal functions, not between cofunctions and any non-callable object. > the above as stated implies the following: > > def spam(): > x = cofunction() # fails, since directly inside a function > > x = cofunction() # succeeds, since not inside a function > > Surely that isn't what you mean to imply, is it? You're right, I didn't mean to imply that. A better way to phrase it would be "A cofunction can only be called directly from the body of another cofunction. An exception is raised if an attempt is made to call it in any other context." > Is there any prior art in other languages? I have googled on > "cofunction", and I get many, many hits to the concept from mathematics > (e.g. sine and cosine) but virtually nothing in programming circles. It's my own term, not based on any prior art that I'm aware of. It seemed like a natural way to combine the notion of "coroutine" with "function" in the Python sense of something defined with 'def'. (I have seen the word used once in a paper relating to functional programming. The author drew a distinction between "functions" operating on finite data, and "cofunctions" operating on "cofinite" "codata". But apart from the idea of dividing functions into two disjoint classes, it was unrelated to what I'm talking about.) > In the Motivation and Rationale section, you state: > > If one forgets to use ``yield from`` when it should have > been used, or uses it when it shouldn't have, the symptoms > that result can be extremely obscure and confusing. > > I don't believe that remembering to write ``codef`` instead of ``def`` > is any easier than remembering to write ``yield from`` instead of > ``yield`` or ``return``. It's easier because if you forget, you get told about it loudly and clearly. Whereas if you forget to write "yield from" where you should have, nothing goes wrong immediately -- the call succeeds and returns something. The trouble is, it's an iterator rather than the function return value you were expecting. If you're lucky, this will trip up something not too much further down. If you're unlucky, the incorrect value will get returned to a higher level or stored away somewhere, to cause problems much later when it's far from obvious where it came from. The reason this is an issue is that it's much easier to make this kind of mistake when using generators as coroutines rather than iterators. Normally when you call a generator, the purpose you have in mind is "produce a stream of things", in which case it's obvious that you need to do something more than just a call, such as iterating over the result or using "yield from" to pass them on to your caller. But when using yield-from to call a subfunction in a coroutine, the purpose you have in mind is not "produce a stream of things" but simply "do something" or "calculate a value". And this is the same as the purpose you have in mind for all the ordinary calls that *don't* require -- and in fact must *not* have -- yield-from in front of them. So there is great room for confusion! 
What's more, when using generators in the usual way, the thing you're calling is designed from the outset to return an iterator as part of its contract, and that is not likely to change. However, it's very likely that parts of your coroutine that initially were just ordinary functions will later need to be able to suspend themselves. When that happens, you need to track down all the places you call it from and add "yield from" in front of them -- all the time wrestling with the kinds of less-than-obvious symptoms that I described above. With cofunctions, on the other hand, this process is straightforward and almost automatic. The interpreter will tell you exactly which functions have a problem, and you fix them by changing their definitions from 'def' to 'codef'. -- Greg From barry at python.org Sat Oct 29 01:46:22 2011 From: barry at python.org (Barry Warsaw) Date: Fri, 28 Oct 2011 19:46:22 -0400 Subject: [Python-ideas] Changing str(someclass) to return only the class name References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org> Message-ID: <20111028194622.2b7fbb09@resist.wooz.org> On Oct 28, 2011, at 03:00 PM, Ka-Ping Yee wrote: > >>> print x > foo > >...doesn't actually feel that friendly to me. I want to know >that it's *probably* a function or *probably* a class >>> print x None What is x? :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From steve at pearwood.info Sat Oct 29 03:10:00 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 29 Oct 2011 12:10:00 +1100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAB3CB2.4030805@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <4EAB3CB2.4030805@canterbury.ac.nz> Message-ID: <4EAB5268.1010803@pearwood.info> Greg Ewing wrote: > Steven D'Aprano wrote: > >> (1) You state that it is a "special kind of generator", but don't give >> any clue as to how it is special. > > You're supposed to read the rest of the specification to find that > out. I did. Twice. It means little to me. The PEP makes too many assumptions about the reader's understanding of the issues involved. If this was a proposal for a new library, I'd say "It doesn't effect me, I can ignore it, I just won't use the library". But you're talking about adding syntax and changing the execution model of Python. This becomes part of the language. I'm not hostile to the idea. But the PEP needs to reference concrete examples that people can run to see that there is a problem to be solved. Otherwise, it just seems that you're proposing adding more complication to the Python, at least one new keyword and one more builtin, for what gain? 
-- Steven From ncoghlan at gmail.com Sat Oct 29 03:59:53 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Oct 2011 11:59:53 +1000 Subject: [Python-ideas] Cofunctions - Rev 6 In-Reply-To: <4EAB286C.7000406@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA8D061.5090508@canterbury.ac.nz> <4EA90BED.4000505@canterbury.ac.nz> <4EA918CD.1050505@canterbury.ac.nz> <4EAB286C.7000406@canterbury.ac.nz> Message-ID: On Sat, Oct 29, 2011 at 8:10 AM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> I think having such a >> construct will help qualm many of the fears people had about the >> original version of the implicit invocation proposal - just as >> try/except blocks help manage exceptions and explicit locks help >> manage thread preemption, being able to force ordinary call semantics >> for a suite would allow people to effectively manage implicit >> coroutine suspension in cases where they felt it mattered. > > Maybe, but I'm still not convinced that simply factoring out > the critical section into a 'def' function isn't sufficient to > achieve the same ends of auditability and protection from > unexpected changes in library semantics. > > Also, elsewhere you're arguing that the ideal situation would > be for there to be no distinction at all between normal code > and coroutine code, and any piece of code would be able to > carry out a coroutine suspension. Would you still want a > critical section construct in such a world, and if so, how > exactly would it work? It would trigger a runtime error if any of the called functions attempted to suspend the coroutine. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sat Oct 29 04:09:51 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Oct 2011 12:09:51 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAB5268.1010803@pearwood.info> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <4EAB3CB2.4030805@canterbury.ac.nz> <4EAB5268.1010803@pearwood.info> Message-ID: On Sat, Oct 29, 2011 at 11:10 AM, Steven D'Aprano wrote: > I'm not hostile to the idea. But the PEP needs to reference concrete > examples that people can run to see that there is a problem to be solved. > Otherwise, it just seems that you're proposing adding more complication to > the Python, at least one new keyword and one more builtin, for what gain? Indeed, this is why I think the PEP needs to spend more time talking about Twisted and gevent. They both attack the problem of coroutines in Python, one by writing within the limitations of what already exists (which is why it's really a programming paradigm unto itself), the other by lifting some core components out of Stackless (i.e. the greenlets module) in order to get true coroutines. The PEP is basically about making the Twisted-style approach less problematic. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia

From ncoghlan at gmail.com  Sat Oct 29 04:10:43 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Oct 2011 12:10:43 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EAB2F42.2020704@stoneleaf.us>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org>
	<4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
	<4EAB2994.6010103@canterbury.ac.nz> <4EAB2F42.2020704@stoneleaf.us>
Message-ID: 

On Sat, Oct 29, 2011 at 8:40 AM, Ethan Furman wrote:
> Greg Ewing wrote:
>>
>> Mark Shannon wrote:
>>
>>> Stackless provides coroutines. Greenlets are also coroutines (I think).
>>>
>>> Lua has them, and is implemented in ANSI C, so it can be done portably.
>>
>> These all have drawbacks. Greenlets are based on non-portable
>> (and, I believe, slightly dangerous) C hackery, and I'm given
>> to understand that Lua coroutines can't be suspended from
>> within a C function.
>>
>> My proposal has limitations, but it has the advantage of
>> being based on fully portable and well-understood techniques.
>
> If Stackless has them, could we use that code?

That's what the greenlets module *is* - the coroutine code from
Stackless, lifted out and provided as an extension module instead of a
forked version of the runtime.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Sat Oct 29 04:13:13 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Oct 2011 12:13:13 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EAB2994.6010103@canterbury.ac.nz>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org>
	<4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
	<4EAB2994.6010103@canterbury.ac.nz>
Message-ID: 

On Sat, Oct 29, 2011 at 8:15 AM, Greg Ewing wrote:
> Mark Shannon wrote:
>
>> Stackless provides coroutines. Greenlets are also coroutines (I think).
>>
>> Lua has them, and is implemented in ANSI C, so it can be done portably.
>
> These all have drawbacks. Greenlets are based on non-portable
> (and, I believe, slightly dangerous) C hackery, and I'm given
> to understand that Lua coroutines can't be suspended from
> within a C function.
>
> My proposal has limitations, but it has the advantage of
> being based on fully portable and well-understood techniques.

The limitation of Lua style coroutines is that they can't be suspended
from inside a function implemented in C. Without greenlets/Stackless
style assembly code, coroutines in Python would likely have the same
limitation.

PEP 3152 (and all generator based coroutines) have the limitation that
they can't suspend if there's a *Python* function on the stack. Can you
see why I now consider this approach categorically worse than one that
pursued the Lua approach?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz  Sat Oct 29 08:23:31 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 19:23:31 +1300
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
Message-ID: <4EAB9BE3.8050809@canterbury.ac.nz>

Paul Moore wrote:

> PS On the other hand, this is python-ideas, so I guess it's the right
> place for blue-sky theorising. If that's all this thread is, maybe I
> should simply ignore it for 18 months or so...
:-)

Yes, the intended audience for the PEP is currently the rather small
set of people who have been following the yield-from discussions and
understand what it's about. I fully expect that yield-from will need
quite some time to bed in before any decisions about something further
can be made. I'm thinking a long way ahead here.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Oct 29 08:40:38 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 19:40:38 +1300
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EA9FED3.6050505@pearwood.info>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
	<20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
	<4EA9FED3.6050505@pearwood.info>
Message-ID: <4EAB9FE6.6000607@canterbury.ac.nz>

Steven D'Aprano wrote:

> One specific thing I took out of this is that only the main body of a
> Python generator can yield.
>
> I can see how that would be a difficulty, particularly when you move
> away from simple generators yielding values to coroutines that accept
> values, but isn't that solved by the "yield from" syntax?

Only for limited values of "solved" -- to my mind, it's still a rather
unnatural way to write coroutine code.

Perhaps I should point out that the way I envisage using generators as
lightweight threads, the yields will typically be neither sending nor
receiving values, but simply serving as suspension points.

If a generator is actually producing values, then the phrase "yield
from" has meaning, but if it's not -- if you're just using it to
calculate a return value or cause a side effect -- it reads like
nonsense. Maybe I'm more sensitive to such considerations than other
people, but it bothers me.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Oct 29 08:48:41 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 19:48:41 +1300
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <1319768631.3605.50.camel@Gutsy>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
	<20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
	<1319768631.3605.50.camel@Gutsy>
Message-ID: <4EABA1C9.90306@canterbury.ac.nz>

Ron Adam wrote:

> Another issue with this is, the routines and the framework become tied
> together. The routines need to know the proper additional protocol to
> work with that framework, and they also can't be used with any other
> framework.

An interesting thing about yield-from is that once you have it, the
interface between the generators and the framework reduces to the
iterator protocol, and there's really only *one* obvious way to do it.

This means it may become feasible to have a single coroutine scheduler
in the standard library, with hooks for adding various kinds of event
sources, so that different libraries handling asynchronous events can
work together, instead of each one wanting to be in charge and run its
own event loop.
--
Greg

From steve at pearwood.info  Sat Oct 29 08:52:44 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 29 Oct 2011 17:52:44 +1100
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org>
	<4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
	<4EAB2994.6010103@canterbury.ac.nz>
Message-ID: <4EABA2BC.9090101@pearwood.info>

Nick Coghlan wrote:

> PEP 3152 (and all generator based coroutines) have the limitation that
> they can't suspend if there's a *Python* function on the stack. Can
> you see why I now consider this approach categorically worse than one
> that pursued the Lua approach?

Can you give a concrete example of this problem? Even a toy example
will do. The only thing I can think of is something like this:

    def function(co):
        value = 1000
        for i in (1, 2, 3):
            value -= co.send(i)
        return value

    def coroutine():
        value = (yield 1)
        while True:
            value += 1
            print("suspending...")
            value += (yield value)
            print("waking...")

    >>> co = coroutine()
    >>> co.send(None)
    1
    >>> co.send(3)
    suspending...
    4
    >>> function(co)
    waking...
    suspending...
    waking...
    suspending...
    waking...
    suspending...
    972
    >>> co.send(10)
    waking...
    suspending...
    24

But as far as I understand it, that seems to show the coroutine
suspending even though a function is on the stack. Can you explain what
you mean and/or how I have misunderstood?

--
Steven

From greg.ewing at canterbury.ac.nz  Sat Oct 29 09:16:59 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 20:16:59 +1300
Subject: [Python-ideas] Implementing Coroutines (was Cofunctions - Back to Basics)
In-Reply-To: <4EAA76F0.6080105@hotpy.org>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org>
	<4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
	<4EAA76F0.6080105@hotpy.org>
Message-ID: <4EABA86B.7040504@canterbury.ac.nz>

Mark Shannon wrote:

> In summary, the cofunction proposal is a work-around for a limitation
> in the VM. By fixing the VM we can have proper coroutines.
> Surely, it is better to make the VM support the features we want/need
> rather than bend those features to fit the VM?

The original version of Stackless worked by "flattening" the
interpreter the way you suggest, but it was considered far too big an
upheaval to consider incorporating into CPython, especially since a lot
of extension code would also have to have been rewritten to make it
coroutine-friendly.

--
Greg

From greg.ewing at canterbury.ac.nz  Sat Oct 29 09:21:36 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 29 Oct 2011 20:21:36 +1300
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
	<20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
	<4EA9FED3.6050505@pearwood.info>
Message-ID: <4EABA980.6020201@canterbury.ac.nz>

Nick Coghlan wrote:

> If you can't merge the synchronous and asynchronous version of your
> I/O routines, it means you end up having to write everything in the
> entire stack twice - once in an "event loop friendly" way ... and
> once in the normal procedural way.

Okay, I understand what you're concerned about now. Yes, that's a
nuisance, and it would be very nice to be able to avoid it.
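To illustrate the duplication being quoted above (a hand-written
sketch; read_async() is an invented method standing in for whatever
asynchronous read API a framework provides):

    # Ordinary blocking version:
    def read_record(f):
        header = f.read(4)
        size = int.from_bytes(header, 'big')
        return f.read(size)

    # Coroutine-friendly version of the *same* logic; every caller up
    # the stack must itself become a generator and use 'yield from':
    def read_record_async(f):
        header = yield from f.read_async(4)
        size = int.from_bytes(header, 'big')
        return (yield from f.read_async(size))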
Unfortunately, I don't know how to implement your suggestion in a fully general way without either resorting to dubious C-level hackery as greenlets do, or turning the entire architecture of CPython inside out as the original version of Stackless did. What I'm trying to do is see how close I can get to the ideal, without hackery, by building on already-established mechanisms in CPython and making as few changes as possible. -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 29 09:37:36 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Oct 2011 20:37:36 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> Message-ID: <4EABAD40.1010002@canterbury.ac.nz> Nick Coghlan wrote: > The limitation of Lua style coroutines is that they can't be suspended > from inside a function implemented in C. Without greenlets/Stackless > style assembly code, coroutines in Python would likely have the same > limitation. > > PEP 3152 (and all generator based coroutines) have the limitation that > they can't suspend if there's a *Python* function on the stack. Can > you see why I know consider this approach categorically worse than one > that pursued the Lua approach? Ouch, yes, point taken. Fortunately, I think I may have an answer to this... Now that the cocall syntax is gone, the bytecode generated for a cofunction is actually identical to that of an ordinary function. The only difference is a flag in the code object. If the flag were moved into the stack frame instead, it would be possible to run any function in either "normal" or "coroutine" mode, depending on whether it was invoked via __call__ or __cocall__. So there would no longer be two kinds of function, no need for 'codef', and any pure-Python code could be used either way. This wouldn't automatically handle the problem of C code -- existing C functions would run in "normal" mode and therefore wouldn't be able to yield. However, there is at least a clear way for C-implemented objects to participate, by providing a __cocall__ method that returns an iterator. -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 29 08:16:53 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 29 Oct 2011 19:16:53 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> Message-ID: <4EAB9A55.4070909@canterbury.ac.nz> Nick Coghlan wrote: > This means that, whenever a generator yields, that stack > needs to be unwound, suspending each affected generator in turn, > strung together by references between the generator objects rather > than remaining a true frame stack. The chain of generator frames that results from nested yield-from calls *is* a stack of frames -- it's just linked in the opposite order from the way frames are usually arranged. The "top" of the stack is at the tail, making it slightly less convenient to get to, but that's just an issue of implementation efficiency. 
--
Greg

From ncoghlan at gmail.com  Sat Oct 29 12:22:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 29 Oct 2011 20:22:21 +1000
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: <4EABAD40.1010002@canterbury.ac.nz>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org>
	<4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org>
	<4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz>
Message-ID: 

On Sat, Oct 29, 2011 at 5:37 PM, Greg Ewing wrote:
> If the flag were moved into the stack frame instead, it would
> be possible to run any function in either "normal" or "coroutine"
> mode, depending on whether it was invoked via __call__ or
> __cocall__.
>
> So there would no longer be two kinds of function, no need for
> 'codef', and any pure-Python code could be used either way.
>
> This wouldn't automatically handle the problem of C code --
> existing C functions would run in "normal" mode and therefore
> wouldn't be able to yield. However, there is at least a clear
> way for C-implemented objects to participate, by providing
> a __cocall__ method that returns an iterator.

Ah, now we're getting somewhere :)

OK, so in this approach, *any* Python function could potentially be a
coroutine - it would depend on how it was invoked rather than any
inherent property of the function definition. An ordinary call would be
unchanged, while a cocall would set a flag on the new stack frame to
say that this is a coroutine. Yes, I think that's a good step forward.

The behaviour of call() syntax in the eval loop would then depend on
whether or not the flag was set on the frame. This would add a tiny
amount of overhead to all function calls (to check the new flag), but
potentially *does* solve the language bifurcation problem (with some
additional machinery).

However, I think we still potentially have a problem due to the
overloading of a single communications channel (i.e. using 'yield' both
to suspend the entire coroutine, but also to return values to the next
layer out). To illustrate that, I'll repeat the toy example I posted
earlier in the thread:

    # The intervening function that we want to "just work" as part
    # of a coroutine
    def print_all(iterable):
        for item in iterable:
            print(item)

    # That means we need "iterable" to be able to:
    # - return items to 'print_all' to be displayed
    # - suspend the entire coroutine in order to request more data

    # Now, consider if our 'iterable' was an instance of the
    # following generator
    def data_iterable(get_data, sentinel=None):
        while 1:
            x = get_data()
            if x is sentinel:
                break
            yield x

In coroutine mode, the for loop would implicitly invoke
iterable.__next__.__cocall__(). The __cocall__ implementations on
generator object methods are going to need a way to tell the difference
between requests from the generator body to yield a value (which means
the __cocall__ should halt, and the value be returned) and requests to
suspend the coroutine emanating from the "get_data()" call (which means
the __cocall__ should yield the value provided by the frame).

That's where I think 'coyield' could come in - it would tell the
generator __cocall__ functionality it should pass the suspension
request up the chain instead of returning the value to the caller of
next/send/throw. This could also be done via a function and a special
object type that the generator __cocall__ implementation recognised.
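One way to picture the "special object type" variant with today's
generators (Suspend and run() are invented names for this sketch, not
part of the proposal's actual API):

    class Suspend:
        """Marker telling the scheduler to suspend the whole coroutine."""
        def __init__(self, request):
            self.request = request

    def run(coroutine):
        # Toy scheduler: values wrapped in Suspend suspend the whole
        # coroutine; anything else is an ordinary yielded value.
        for item in coroutine:
            if isinstance(item, Suspend):
                print("suspend requested:", item.request)
                # ... park the coroutine and resume it later ...
            else:
                print("ordinary value:", item)

    def worker():
        yield 1                     # a value for the next layer out
        yield Suspend("need data")  # suspend the entire coroutine
        yield 2

    run(worker())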
The limitation would then just be that any operation invoked via __call__ (typically a C function with no __cocall__ method) would prevent suspension of the coroutine. The end result would actually look a lot like Lua coroutines, but with an additional calling protocol that allowed C extensions to participate if they wanted to. I think eventually the PEP should move towards a more explanatory model: - "coroutines are a tool for lightweight cooperative multitasking" - "coroutines are cool for a variety of reasons (aka people didn't create Twisted, Stackless, greenlets and gevent just for fun)" - "here's how this PEP proposes this functionality should look (aka Lua's coroutines look pretty nice)" - "rather than saving and switching entire stack frames as a unit, Python's generator model supports coroutines by requiring that every frame on the stack know how to suspend itself for later resumption" - "doing this explicitly can be quite clumsy, so this PEP proposes a new protocol for invoking arbitrary functions in a way that sets them up for cooperative multitasking rather than assuming they will run to completion immediately" - "this approach allows C extensions to play nicely with coroutines (by implementing __cocall__ appropriately), but doesn't require low level assembly code the way Stackless/greenlets do" Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From p.f.moore at gmail.com Sat Oct 29 12:51:55 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 29 Oct 2011 11:51:55 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> Message-ID: On 29 October 2011 11:22, Nick Coghlan wrote: > I think eventually the PEP should move towards a more explanatory model: > - "coroutines are a tool for lightweight cooperative multitasking" > - "coroutines are cool for a variety of reasons (aka people didn't > create Twisted, Stackless, greenlets and gevent just for fun)" > - "here's how this PEP proposes this functionality should look (aka > Lua's coroutines look pretty nice)" > - "rather than saving and switching entire stack frames as a unit, > Python's generator model supports coroutines by requiring that every > frame on the stack know how to suspend itself for later resumption" > - "doing this explicitly can be quite clumsy, so this PEP proposes a > new protocol for invoking arbitrary functions in a way that sets them > up for cooperative multitasking rather than assuming they will run to > completion immediately" > - "this approach allows C extensions to play nicely with coroutines > (by implementing __cocall__ appropriately), but doesn't require low > level assembly code the way Stackless/greenlets do" That looks reasonable. There's one further piece of the puzzle I'd like to see covered, though - with the language support as defined in the PEP, can the functionality be implemented library-style (like Lua) or does it need new syntax? The PEP should discuss this, and if syntax is needed, should propose the appropriate syntax. I think that if the runtime support can be built in a way that allows a Lua-style function/method approach, then that should be the initial design, as it's easier to tweak a functional API than to change syntax. If experience shows that code would benefit from syntax support, add that later. Paul. 
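For comparison, this is roughly what a Lua-style library API looks like
when emulated on top of plain generators (a limited sketch assuming
PEP 380's StopIteration.value; unlike real Lua coroutines, the
suspension point must appear lexically inside the wrapped generator
itself):

    class Coroutine:
        # Rough analogue of Lua's coroutine.resume(); a bare 'yield'
        # in the wrapped generator plays the role of coroutine.yield().
        def __init__(self, gen):
            self._gen = gen
            self._started = False

        def resume(self, value=None):
            try:
                if not self._started:
                    self._started = True
                    return True, self._gen.send(None)
                return True, self._gen.send(value)
            except StopIteration as exc:
                return False, exc.value

    def producer():
        total = 0
        for i in range(3):
            got = yield i   # like coroutine.yield(i) in Lua
            total += got or 0
        return total

    co = Coroutine(producer())
    print(co.resume())    # (True, 0)
    print(co.resume(10))  # (True, 1)
    print(co.resume(10))  # (True, 2)
    print(co.resume(10))  # (False, 30)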
From ncoghlan at gmail.com Sat Oct 29 16:48:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 00:48:13 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> Message-ID: On Sat, Oct 29, 2011 at 8:51 PM, Paul Moore wrote: > I think that if the runtime support can be built in a way that allows > a Lua-style function/method approach, then that should be the initial > design, as it's easier to tweak a functional API than to change > syntax. If experience shows that code would benefit from syntax > support, add that later. I have one specific reason I think the new yield variant should get a new keyword: it's a new kind of flow control, and Python has a history of trying to keep flow control explicit (cf. the PEP 343 with statement discussions). At the *call* level, it wouldn't need a keyword, since cocall(x) could just be a wrapper around x.__cocall__(). This would be similar to the situation with generators - yielding from a generator has dedicated syntax (in the form of 'yield'), but invoking one just uses the standard call and iterator syntax. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From p.f.moore at gmail.com Sat Oct 29 18:22:35 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 29 Oct 2011 17:22:35 +0100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> Message-ID: On 29 October 2011 15:48, Nick Coghlan wrote: > On Sat, Oct 29, 2011 at 8:51 PM, Paul Moore wrote: >> I think that if the runtime support can be built in a way that allows >> a Lua-style function/method approach, then that should be the initial >> design, as it's easier to tweak a functional API than to change >> syntax. If experience shows that code would benefit from syntax >> support, add that later. > > I have one specific reason I think the new yield variant should get a > new keyword: it's a new kind of flow control, and Python has a history > of trying to keep flow control explicit (cf. the PEP 343 with > statement discussions). That's a reasonable point, and could be explicitly noted in the PEP. Paul. From Nikolaus at rath.org Sat Oct 29 20:24:20 2011 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sat, 29 Oct 2011 14:24:20 -0400 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAB5268.1010803@pearwood.info> (Steven D'Aprano's message of "Sat, 29 Oct 2011 12:10:00 +1100") References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <4EAB3CB2.4030805@canterbury.ac.nz> <4EAB5268.1010803@pearwood.info> Message-ID: <87y5w364h7.fsf@vostro.rath.org> Steven D'Aprano writes: > I'm not hostile to the idea. But the PEP needs to reference concrete > examples that people can run to see that there is a problem to be > solved. Otherwise, it just seems that you're proposing adding more > complication to the Python, at least one new keyword and one more > builtin, for what gain? For me the first example in the greenlet documentation was very helpful: http://packages.python.org/greenlet/ Best, -Nikolaus -- ?Time flies like an arrow, fruit flies like a Banana.? 
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From solipsis at pitrou.net  Sun Oct 30 00:18:01 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 30 Oct 2011 00:18:01 +0200
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
Message-ID: <20111030001801.2f52ceb2@pitrou.net>

Hello,

I would like to propose the following PEP for discussion and, if
possible, acceptance. I think the proposal shouldn't be too
controversial (I find it quite simple and straightforward myself :-)).
I also have a draft implementation that's quite simple
(http://hg.python.org/features/pep-3155).

Regards

Antoine.

PEP: 3155
Title: Qualified name for classes and functions
Version: $Revision$
Last-Modified: $Date$
Author: Antoine Pitrou
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2011-10-29
Python-Version: 3.3
Post-History:
Resolution: TBD

Rationale
=========

Python's introspection facilities have long had poor support for nested
classes. Given a class object, it is impossible to know whether it was
defined inside another class or at module top-level; and, if the
former, it is also impossible to know in which class it was defined.
While use of nested classes is often considered poor style, the only
reason for them to have second class introspection support is a lousy
pun.

Python 3 adds insult to injury by dropping what was formerly known as
unbound methods. In Python 2, given the following definition::

    class C:
        def f():
            pass

you can then walk up from the ``C.f`` object to its defining class::

    >>> C.f.im_class
    <class '__main__.C'>

This possibility is gone in Python 3::

    >>> C.f.im_class
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'function' object has no attribute 'im_class'
    >>> dir(C.f)
    ['__annotations__', '__call__', '__class__', '__closure__',
     '__code__', '__defaults__', '__delattr__', '__dict__', '__dir__',
     '__doc__', '__eq__', '__format__', '__ge__', '__get__',
     '__getattribute__', '__globals__', '__gt__', '__hash__',
     '__init__', '__kwdefaults__', '__le__', '__lt__', '__module__',
     '__name__', '__ne__', '__new__', '__reduce__', '__reduce_ex__',
     '__repr__', '__setattr__', '__sizeof__', '__str__',
     '__subclasshook__']

This again limits the introspection capabilities available to the user.
It can produce actual issues when porting software to Python 3, for
example in Twisted Core, where the issue of introspecting method
objects came up several times. It also limits pickling support [1]_.

Proposal
========

This PEP proposes the addition of a ``__qname__`` attribute to
functions and classes. For top-level functions and classes, the
``__qname__`` attribute is equal to the ``__name__`` attribute. For
nested classes, methods, and nested functions, the ``__qname__``
attribute contains a dotted path leading to the object from the module
top-level.

The repr() and str() of functions and classes is modified to use
``__qname__`` rather than ``__name__``.

Example with nested classes
---------------------------

>>> class C:
...   def f(): pass
...   class D:
...     def g(): pass
...
>>> C.__qname__
'C'
>>> C.f.__qname__
'C.f'
>>> C.D.__qname__
'C.D'
>>> C.D.g.__qname__
'C.D.g'

Example with nested functions
-----------------------------

>>> def f():
...   def g(): pass
...   return g
...
>>> f.__qname__
'f'
>>> f().__qname__
'f.g'

Limitations
===========

With nested functions (and classes defined inside functions), the
dotted path will not be walkable programmatically as a function's
namespace is not available from the outside.
It will still be more helpful to the human reader than the bare
``__name__``.

Like the ``__name__`` attribute, the ``__qname__`` attribute is
computed statically and it will not automatically follow rebinding.

References
==========

.. [1] "pickle should support methods":
   http://bugs.python.org/issue9276

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From jxo6948 at rit.edu  Sun Oct 30 00:47:13 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Sat, 29 Oct 2011 18:47:13 -0400
Subject: [Python-ideas] Allow iterable argument to os.walk()
Message-ID: 

Given the push towards iterators in 3.0, is anyone in support of
allowing an iterable for the "top" argument in os.walk? It seems like
it would be common to look in more than one directory at once.

- John

From python at mrabarnett.plus.com  Sun Oct 30 00:47:38 2011
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 29 Oct 2011 23:47:38 +0100
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: <20111030001801.2f52ceb2@pitrou.net>
References: <20111030001801.2f52ceb2@pitrou.net>
Message-ID: <4EAC828A.5070705@mrabarnett.plus.com>

On 29/10/2011 23:18, Antoine Pitrou wrote:
> Proposal
> ========
>
> This PEP proposes the addition of a ``__qname__`` attribute to
> functions and classes. For top-level functions and classes, the
> ``__qname__`` attribute is equal to the ``__name__`` attribute. For
> nested classes, methods, and nested functions, the ``__qname__``
> attribute contains a dotted path leading to the object from the
> module top-level.
>
> The repr() and str() of functions and classes is modified to use
> ``__qname__`` rather than ``__name__``.
>
[snip]

The only criticism I have is that I think I'd prefer it if it were
called something more like "__qualname__".

From ncoghlan at gmail.com  Sun Oct 30 01:32:18 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 30 Oct 2011 09:32:18 +1000
Subject: [Python-ideas] Allow iterable argument to os.walk()
In-Reply-To: 
References: 
Message-ID: 

On Sun, Oct 30, 2011 at 8:47 AM, John O'Connor wrote:
> Given the push towards iterators in 3.0, is anyone in support of
> allowing an iterable for the "top" argument in os.walk? It seems like
> it would be common to look in more than one directory at once.

No, because it means you end up having to special case strings so
they're treated atomically. We already do that in a few string-specific
APIs (e.g. startswith()/endswith()), but it's still not a particularly
nice pattern.

If people want to walk multiple directories, 3.3 will make that pretty
easy:

    def walk_dirs(dirs):
        for dir in dirs:
            yield from os.walk(dir)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Sun Oct 30 01:52:15 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 30 Oct 2011 09:52:15 +1000
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: <20111030001801.2f52ceb2@pitrou.net>
References: <20111030001801.2f52ceb2@pitrou.net>
Message-ID: 

On Sun, Oct 30, 2011 at 8:18 AM, Antoine Pitrou wrote:
>
> Hello,
>
> I would like to propose the following PEP for discussion and, if
> possible, acceptance. I think the proposal shouldn't be too
> controversial (I find it quite simple and straightforward myself :-)).
Indeed (and I believe this should make the next version of the module aliasing PEP substantially shorter!). > Proposal > ======== > > This PEP proposes the addition of a ``__qname__`` attribute to functions > and classes. ?For top-level functions and classes, the ``__qname__`` > attribute is equal to the ``__name__`` attribute. ?For nested classed, > methods, and nested functions, the ``__qname__`` attribute contains a > dotted path leading to the object from the module top-level. I like '__qname__'. While I'm sympathetic to the suggestion of the more explicit '__qualname__', I actually prefer the idea of adding "qname" as an official shorthand for "qualified name" in the glossary. > Example with nested classes > --------------------------- > >>>> class C: > ... ? def f(): pass > ... ? class D: > ... ? ? def g(): pass > ... >>>> C.__qname__ > 'C' >>>> C.f.__qname__ > 'C.f' >>>> C.D.__qname__ > 'C.D' >>>> C.D.g.__qname__ > 'C.D.g' > > Example with nested functions > ----------------------------- > >>>> def f(): > ... ? def g(): pass > ... ? return g > ... >>>> f.__qname__ > 'f' >>>> f().__qname__ > 'f.g' For nested functions, I suggest adding something to the qname to directly indicate that the scope is hidden. Adding parentheses to the name of the outer function would probably work: f().g Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From greg.ewing at canterbury.ac.nz Sun Oct 30 02:39:33 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Oct 2011 13:39:33 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> Message-ID: <4EAC9CC5.5010101@canterbury.ac.nz> Paul Moore wrote: > There's one further piece of the puzzle I'd > like to see covered, though - with the language support as defined in > the PEP, can the functionality be implemented library-style (like Lua) > or does it need new syntax? While it might not need new syntax, it does hinge fundamentally on changing the way the interpreter loop works, so I don't think a library implementation would be possible. -- Greg From greg.ewing at canterbury.ac.nz Sun Oct 30 07:02:36 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 30 Oct 2011 19:02:36 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> Message-ID: <4EACE87C.6060207@canterbury.ac.nz> Nick Coghlan wrote: > However, I think we still potentially have a > problem due to the overloading of a single communications channel > (i.e. using 'yield' both to suspend the entire coroutine, but also to > return values to the next layer out). Bugger, you're right. With sufficient cleverness, I think it could be made to work, but it occurs to me that this is really a special case of a more general problem. For example, an __add__ method implemented in Python wouldn't be able to suspend a coroutine. So we would be over-promising somewhat if we claimed that *any* pure-python code could be suspended with 'coyield'. It would be more like "any pure-python code, as long as no special method call is involved anywhere along the call chain". 
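The special-method case is easy to demonstrate with ordinary generators
today (a hand-rolled example, not code from the thread):

    class Weird:
        # Writing __add__ as a generator in the hope of suspending
        # does not work: the binary-operator machinery simply calls
        # it and gets a generator object back instead of a result.
        def __add__(self, other):
            yield "please suspend me"
            return 42

    x = Weird() + 1
    print(x)  # a generator object, not 42: the '+' type slot has no
              # way to propagate a suspension request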
Fixing that would require making some rather extensive changes. Either every type slot would need to get a cocall counterpart, or their signatures would need an argument to indicate that a cocall was being made, or something like that. -- Greg From ncoghlan at gmail.com Sun Oct 30 07:07:14 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 16:07:14 +1000 Subject: [Python-ideas] PEP 395 - Module Aliasing Message-ID: I've updated the module aliasing PEP to be based on the terminology in Antoine's qualified names PEP. The full text is included below, or you can read it on python.org: http://www.python.org/dev/peps/pep-0395/ Cheers, Nick. ================================ PEP: 395 Title: Module Aliasing Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 4-Mar-2011 Python-Version: 3.3 Post-History: 5-Mar-2011, 30-Oct-2011 Abstract ======== This PEP proposes new mechanisms that eliminate some longstanding traps for the unwary when dealing with Python's import system, the pickle module and introspection interfaces. It builds on the "Qualified Name" concept defined in PEP 3155. What's in a ``__name__``? ========================= Over time, a module's ``__name__`` attribute has come to be used to handle a number of different tasks. The key use cases identified for this module attribute are: 1. Flagging the main module in a program, using the ``if __name__ == "__main__":`` convention. 2. As the starting point for relative imports 3. To identify the location of function and class definitions within the running application 4. To identify the location of classes for serialisation into pickle objects which may be shared with other interpreter instances Traps for the Unwary ==================== The overloading of the semantics of ``__name__`` have resulted in several traps for the unwary. These traps can be quite annoying in practice, as they are highly unobvious and can cause quite confusing behaviour. A lot of the time, you won't even notice them, which just makes them all the more surprising when they do come up. Importing the main module twice ------------------------------- The most venerable of these traps is the issue of (effectively) importing ``__main__`` twice. This occurs when the main module is also imported under its real name, effectively creating two instances of the same module under different names. This problem used to be significantly worse due to implicit relative imports from the main module, but the switch to allowing only absolute imports and explicit relative imports means this issue is now restricted to affecting the main module itself. Why are my relative imports broken? ----------------------------------- PEP 366 defines a mechanism that allows relative imports to work correctly when a module inside a package is executed via the ``-m`` switch. Unfortunately, many users still attempt to directly execute scripts inside packages. While this no longer silently does the wrong thing by creating duplicate copies of peer modules due to implicit relative imports, it now fails noisily at the first explicit relative import, even though the interpreter actually has sufficient information available on the filesystem to make it work properly. In a bit of a pickle -------------------- Something many users may not realise is that the ``pickle`` module serialises objects based on the ``__name__`` of the containing module. 
So objects defined in ``__main__`` are pickled that way, and won't be unpickled correctly by another python instance that only imported that module instead of running it directly. This behaviour is the underlying reason for the advice from many Python veterans to do as little as possible in the ``__main__`` module in any application that involves any form of object serialisation and persistence. Similarly, when creating a pseudo-module\*, pickles rely on the name of the module where a class is actually defined, rather than the officially documented location for that class in the module hierarchy. While this PEP focuses specifically on ``pickle`` as the principal serialisation scheme in the standard library, this issue may also affect other mechanisms that support serialisation of arbitrary class instances. \*For the purposes of this PEP, a "pseudo-module" is a package designed like the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These packages are documented as if they were single modules, but are in fact internally implemented as a package. This is *supposed* to be an implementation detail that users and other implementations don't need to worry about, but, thanks to ``pickle`` (and serialisation in general), the details are exposed and effectively become part of the public API. Where's the source? ------------------- Some sophisticated users of the pseudo-module technique described above recognise the problem with implementation details leaking out via the ``pickle`` module, and choose to address it by altering ``__name__`` to refer to the public location for the module before defining any functions or classes (or else by modifying the ``__module__`` attributes of those objects after they have been defined). This approach is effective at eliminating the leakage of information via pickling, but comes at the cost of breaking introspection for functions and classes (as their ``__module__`` attribute now points to the wrong place). Forkless Windows ---------------- To get around the lack of ``os.fork`` on Windows, the ``multiprocessing`` module attempts to re-execute Python with the same main module, but skipping over any code guarded by ``if __name__ == "__main__":`` checks. It does the best it can with the information it has, but is forced to make assumptions that simply aren't valid whenever the main module isn't an ordinary directly executed script or top-level module. Packages and non-top-level modules executed via the ``-m`` switch, as well as directly executed zipfiles or directories, are likely to make multiprocessing on Windows do the wrong thing (either quietly or noisily) when spawning a new process. While this issue currently only affects Windows directly, it also impacts any proposals to provide Windows-style "clean process" invocation via the multiprocessing module on other platforms. Proposed Changes ================ The following changes are interrelated and make the most sense when considered together. They collectively either completely eliminate the traps for the unwary noted above, or else provide straightforward mechanisms for dealing with them. A rough draft of some of the concepts presented here was first posted on the python-ideas list [1], but they have evolved considerably since first being discussed in that thread. Fixing dual imports of the main module -------------------------------------- Two simple changes are proposed to fix this problem: 1. 
In ``runpy``, modify the implementation of the ``-m`` switch handling to install the specified module in ``sys.modules`` under both its real name and the name ``__main__``. (Currently it is only installed as the latter) 2. When directly executing a module, install it in ``sys.modules`` under ``os.path.splitext(os.path.basename(__file__))[0]`` as well as under ``__main__``. With the main module also stored under its "real" name, attempts to import it will pick it up from the ``sys.modules`` cache rather than reimporting it under the new name. Fixing direct execution inside packages --------------------------------------- To fix this problem, it is proposed that an additional filesystem check be performed before proceeding with direct execution of a ``PY_SOURCE`` or ``PY_COMPILED`` file that has been named on the command line. This additional check would look for an ``__init__`` file that is a peer to the specified file with a matching extension (either ``.py``, ``.pyc`` or ``.pyo``, depending what was passed on the command line). If this check fails to find anything, direct execution proceeds as usual. If, however, it finds something, execution is handed over to a helper function in the ``runpy`` module that ``runpy.run_path`` also invokes in the same circumstances. That function will walk back up the directory hierarchy from the supplied path, looking for the first directory that doesn't contain an ``__init__`` file. Once that directory is found, it will be set to ``sys.path[0]``, ``sys.argv[0]`` will be set to ``-m`` and ``runpy._run_module_as_main`` will be invoked with the appropriate module name (as calculated based on the original filename and the directories traversed while looking for a directory without an ``__init__`` file). The two current PEPs for namespace packages (PEP 382 and PEP 402) would both affect this part of the proposal. For PEP 382 (with its current suggestion of "*.pyp" package directories, this check would instead just walk up the supplied path, looking for the first non-package directory (this would not require any filesystem stat calls). Since PEP 402 deliberately omits explicit directory markers, it would need an alternative approach, based on checking the supplied path against the contents of ``sys.path``. In both cases, the direct execution behaviour can still be corrected. Fixing pickling without breaking introspection ---------------------------------------------- To fix this problem, it is proposed to add a new optional module level attribute: ``__qname__``. This abbreviation of "qualified name" is taken from PEP 3155, where it is used to store the naming path to a nested class or function definition relative to the top level module. By default, ``__qname__`` will be the same as ``__name__``, which covers the typical case where there is a one-to-one correspondence between the documented API and the actual module implementation. Functions and classes will gain a corresponding ``__qmodule__`` attribute that refers to their module's ``__qname__``. Pseudo-modules that adjust ``__name__`` to point to the public namespace will leave ``__qname__`` untouched, so the implementation location remains readily accessible for introspection. In the main module, ``__qname__`` will automatically be set to the main module's "real" name (as described above under the fix to prevent duplicate imports of the main module) by the interpreter. At the interactive prompt, both ``__name__`` and ``__qname__`` will be set to ``"__main__"``. 
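The intended semantics can be sketched as follows (the package and class names are hypothetical, and the attribute values shown assume both this PEP and PEP 3155 are implemented)::

    # Implementation lives in mypkg/_impl.py, but is documented
    # (and pickled) under the public module name 'mypkg'
    __name__ = "mypkg"        # adjusted by the module author
    # __qname__ remains "mypkg._impl" (set by the interpreter)

    class Example:
        pass

    # Example.__module__  -> "mypkg"        (public name, used by pickle)
    # Example.__qmodule__ -> "mypkg._impl"  (implementation location,
    #                                        still available to inspect/pydoc)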
These changes on their own will fix most pickling and serialisation problems, but one additional change is needed to fix the problem with serialisation of items in ``__main__``: as a slight adjustment to the definition process for functions and classes, in the ``__name__ == "__main__"`` case, the module ``__qname__`` attribute will be used to set ``__module__``. ``pydoc`` and ``inspect`` would also be updated appropriately to: - use ``__qname__`` instead of ``__name__`` and ``__qmodule__`` instead of ``__module__``where appropriate (e.g. ``inspect.getsource()`` would prefer the qualified variants) - report both the public names and the qualified names for affected objects Fixing multiprocessing on Windows --------------------------------- With ``__qname__`` now available to tell ``multiprocessing`` the real name of the main module, it should be able to simply include it in the serialised information passed to the child process, eliminating the need for dubious reverse engineering of the ``__file__`` attribute. Reference Implementation ======================== None as yet. References ========== .. [1] Module aliases and/or "real names" (http://mail.python.org/pipermail/python-ideas/2011-January/008983.html) Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 End: -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sun Oct 30 07:11:58 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 16:11:58 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EACE87C.6060207@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> <4EACE87C.6060207@canterbury.ac.nz> Message-ID: On Sun, Oct 30, 2011 at 4:02 PM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> However, I think we still potentially have a >> problem due to the overloading of a single communications channel >> (i.e. using 'yield' both to suspend the entire coroutine, but also to >> return values to the next layer out). > > Bugger, you're right. > > With sufficient cleverness, I think it could be made > to work, but it occurs to me that this is really a > special case of a more general problem. For example, > an __add__ method implemented in Python wouldn't be > able to suspend a coroutine. > > So we would be over-promising somewhat if we claimed > that *any* pure-python code could be suspended with > 'coyield'. It would be more like "any pure-python > code, as long as no special method call is involved > anywhere along the call chain". > > Fixing that would require making some rather extensive > changes. Either every type slot would need to get a > cocall counterpart, or their signatures would need an > argument to indicate that a cocall was being made, > or something like that. Ouch, I'd missed that completely. Supporting a few assembly hacks in the core isn't looking so bad at this point ;) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ericsnowcurrently at gmail.com Sun Oct 30 07:13:02 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 30 Oct 2011 00:13:02 -0600 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions In-Reply-To: <20111030001801.2f52ceb2@pitrou.net> References: <20111030001801.2f52ceb2@pitrou.net> Message-ID: On Oct 29, 2011 4:22 PM, "Antoine Pitrou" wrote: > > > Hello, > > I would like to propose the following PEP for discussion and, if > possible, acceptance. I think the proposal shouldn't be too > controversial (I find it quite simple and straightforward myself :-)). > > I also have a draft implementation that's quite simple > (http://hg.python.org/features/pep-3155). > > Regards > > Antoine. > > > > PEP: 3155 > Title: Qualified name for classes and functions > Version: $Revision$ > Last-Modified: $Date$ > Author: Antoine Pitrou > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 2011-10-29 > Python-Version: 3.3 > Post-History: > Resolution: TBD > > > Rationale > ========= > > Python's introspection facilities have long had poor support for nested > classes. Given a class object, it is impossible to know whether it was > defined inside another class or at module top-level; and, if the former, > it is also impossible to know in which class it was defined. While > use of nested classes is often considered poor style, the only reason > for them to have second class introspection support is a lousy pun. > > Python 3 adds insult to injury by dropping what was formerly known as > unbound methods. In Python 2, given the following definition:: > > class C: > def f(): > pass > > you can then walk up from the ``C.f`` object to its defining class:: > > >>> C.f.im_class > <class '__main__.C'> > > This possibility is gone in Python 3:: > > >>> C.f.im_class > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > AttributeError: 'function' object has no attribute 'im_class' > >>> dir(C.f) > ['__annotations__', '__call__', '__class__', '__closure__', '__code__', > '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', > '__eq__', '__format__', '__ge__', '__get__', '__getattribute__', > '__globals__', '__gt__', '__hash__', '__init__', '__kwdefaults__', > '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', > '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', > '__str__', '__subclasshook__'] > > This limits again the introspection capabilities available to the user. > It can produce actual issues when porting software to Python 3, for example > Twisted Core where the issue of introspecting method objects came up > several times. It also limits pickling support [1]_. > > > Proposal > ======== > > This PEP proposes the addition of a ``__qname__`` attribute to functions > and classes. For top-level functions and classes, the ``__qname__`` > attribute is equal to the ``__name__`` attribute. For nested classes, > methods, and nested functions, the ``__qname__`` attribute contains a > dotted path leading to the object from the module top-level. > > The repr() and str() of functions and classes are modified to use ``__qname__`` > rather than ``__name__``. > > Example with nested classes > --------------------------- > > >>> class C: > ... def f(): pass > ... class D: > ... def g(): pass > ... > >>> C.__qname__ > 'C' > >>> C.f.__qname__ > 'C.f' > >>> C.D.__qname__ > 'C.D' > >>> C.D.g.__qname__ > 'C.D.g' > > Example with nested functions > ----------------------------- > > >>> def f(): > ... def g(): pass > ... 
return g > ... > >>> f.__qname__ > 'f' > >>> f().__qname__ > 'f.g' > > > Limitations > =========== > > With nested functions (and classes defined inside functions), the dotted > path will not be walkable programmatically as a function's namespace is not > available from the outside. It will still be more helpful to the human > reader than the bare ``__name__``. > If it helps, I have a patch that adds f_func to frame objects [1]. It points to the function that was called, so __qname__ could reference that function. -eric [1] see http://bugs.python.org/issue12857 > As the ``__name__`` attribute, the ``__qname__`` attribute is computed > statically and it will not automatically follow rebinding. > > > References > ========== > > .. [1] "pickle should support methods": > http://bugs.python.org/issue9276 > > Copyright > ========= > > This document has been placed in the public domain. > > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Sun Oct 30 11:29:53 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 30 Oct 2011 03:29:53 -0700 Subject: [Python-ideas] Should this be considered a bug? Message-ID: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> h = [1, 2, 3] d = dict(a=1, b=2) h += d # works h = h + d # exception -- Sent from my Android tablet with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Oct 30 11:55:29 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 20:55:29 +1000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: On Sun, Oct 30, 2011 at 8:29 PM, Mike Meyer wrote: > > h = [1, 2, 3] > d = dict(a=1, b=2) > h += d # works > h = h + d # exception No, the reason the latter fails is because it's unclear what the type of the result should be and the interpreter refuses to guess. In the former case, there's no such ambiguity (since the type of 'h' simply stays the same). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Sun Oct 30 12:04:03 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 12:04:03 +0100 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions References: <20111030001801.2f52ceb2@pitrou.net> Message-ID: <20111030120403.0ca216c3@pitrou.net> On Sun, 30 Oct 2011 00:13:02 -0600 Eric Snow wrote: > > If it helps, I have a patch that adds f_func to frame objects [1]. It > points to the function that was called, so __qname__ could reference that > function. Perhaps, although my current implementation uses a bit of code in the compiler to precompute (most of) the __qname__ attribute. Regards Antoine. From cmjohnson.mailinglist at gmail.com Sun Oct 30 12:08:31 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 30 Oct 2011 01:08:31 -1000 Subject: [Python-ideas] Should this be considered a bug? 
In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: <2F2E6263-3BE9-41EF-AA20-467842B1C8FC@gmail.com> On Oct 30, 2011, at 12:55 AM, Nick Coghlan wrote: > On Sun, Oct 30, 2011 at 8:29 PM, Mike Meyer wrote: >> >> h = [1, 2, 3] >> d = dict(a=1, b=2) >> h += d # works >> h = h + d # exception > > No, the reason the latter fails is because it's unclear what the type > of the result should be and the interpreter refuses to guess. In the > former case, there's no such ambiguity (since the type of 'h' simply > stays the same). What about the fact that dictionaries have no particular order? If you're expecting my_list += my_set to come out in a particular order (a common rookie mistake when dealing with dicts as well), you can be silently bitten. From g.brandl at gmx.net Sun Oct 30 12:10:39 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 30 Oct 2011 12:10:39 +0100 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: On 10/30/2011 11:55 AM, Nick Coghlan wrote: > On Sun, Oct 30, 2011 at 8:29 PM, Mike Meyer wrote: >> >> h = [1, 2, 3] >> d = dict(a=1, b=2) >> h += d # works >> h = h + d # exception > > No, the reason the latter fails is because it's unclear what the type > of the result should be and the interpreter refuses to guess. In the > former case, there's no such ambiguity (since the type of 'h' simply > stays the same). It's still a bit inconsistent though (e.g. it doesn't work with tuple's __iadd__). Also, "type stays the same" is not true in other instances, e.g. i = 1 i += 0.5 Still, I don't think we need to do anything about it. cheers, Georg From solipsis at pitrou.net Sun Oct 30 12:16:38 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 12:16:38 +0100 Subject: [Python-ideas] Should this be considered a bug? References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: <20111030121638.7f079f56@pitrou.net> On Sun, 30 Oct 2011 12:10:39 +0100 Georg Brandl wrote: > On 10/30/2011 11:55 AM, Nick Coghlan wrote: > > On Sun, Oct 30, 2011 at 8:29 PM, Mike Meyer wrote: > >> > >> h = [1, 2, 3] > >> d = dict(a=1, b=2) > >> h += d # works > >> h = h + d # exception > > > > No, the reason the latter fails is because it's unclear what the type > > of the result should be and the interpreter refuses to guess. In the > > former case, there's no such ambiguity (since the type of 'h' simply > > stays the same). > > It's still a bit inconsistent though (e.g. it doesn't work with tuple's > __iadd__). I think list.__iadd__ is simply the same as list.extend, which takes an arbitrary iterable (and therefore accepts dicts, sets, etc.). Regards Antoine. From arnodel at gmail.com Sun Oct 30 12:35:07 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Sun, 30 Oct 2011 11:35:07 +0000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAB2C50.2090300@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EAB2C50.2090300@canterbury.ac.nz> Message-ID: On 28 October 2011 23:27, Greg Ewing wrote: > Arnaud Delobelle wrote: > >> Hi, I've taken the liberty to translate your examples using a small >> utility that I wrote some time ago and modified to mimic your proposal >> (imperfectly, of course, since this runs on unpatched python). FWIW, >> you can see and download the resulting files here (.html files for >> viewing, .py files for downloading): >> >> http://www.marooned.org.uk/~arno/cofunctions/ > Thanks for your efforts, although I'm not sure how much it > helps to see them this way -- the whole point is to enable > writing such code *without* going through any great > contortions. I suppose they serve as an example of what > you would be saved from writing. Don't worry, it didn't take me too long. I still think there's value in seeing how the same can be achieved without the machinery in the PEP (and the yield from PEP). In particular, it makes it possible to discuss what limitations an approach has which the other doesn't. E.g. one important aspect is diagnosing problems when things went wrong (which, IIRC, was more prevalent in the initial discussion of this proposal on this list). This is mentioned in the "Motivation and Rationale" section, but there is no reason a priori why the same safeguards couldn't be put in place without the PEP. -- Arnaud From ncoghlan at gmail.com Sun Oct 30 13:17:06 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Oct 2011 22:17:06 +1000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: On Sun, Oct 30, 2011 at 9:10 PM, Georg Brandl wrote: > It's still a bit inconsistent though (e.g. it doesn't work with tuple's > __iadd__). Tuple doesn't have an __iadd__, it's immutable :) As Antoine noted, list.__iadd__ is actually just list.extend under the hood, and accepts an arbitrary iterable. We're a bit more restrictive with other mutable container types (e.g. set, bytearray), but in those cases, the corresponding methods also only accept a limited selection of object types. (There's also a longstanding limitation in the type coercion system where returning NotImplemented doesn't work properly for the sequence slots at the C level - you have to use the numeric slots instead. That issue should finally be resolved in 3.3) > Also, "type stays the same" is not true in other instances, e.g. > > i = 1 > i += 0.5 Again, that's an immutable type, and the various numeric types go to great lengths in order to play nicely together and negotiate the "correct" result type. Even there, we deliberately complain when we think something dodgy might be going on: >>> import decimal >>> decimal.Decimal(1) + 1.0 Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for +: 'Decimal' and 'float' Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Sun Oct 30 14:30:17 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 30 Oct 2011 14:30:17 +0100 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: On 10/30/2011 01:17 PM, Nick Coghlan wrote: > On Sun, Oct 30, 2011 at 9:10 PM, Georg Brandl wrote: >> It's still a bit inconsistent though (e.g. it doesn't work with tuple's >> __iadd__). > > Tuple doesn't have an __iadd__, it's immutable :) Ah yes, that might be the reason ;) > As Antoine noted, list.__iadd__ is actually just list.extend under the > hood, and accepts an arbitrary iterable. > > We're a bit more restrictive with other mutable container types (e.g. > set, bytearray), but in those cases, the corresponding methods also > only accept a limited selection of object types. Makes sense. 
Georg From mark at hotpy.org Sun Oct 30 16:34:56 2011 From: mark at hotpy.org (Mark Shannon) Date: Sun, 30 Oct 2011 15:34:56 +0000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAC9CC5.5010101@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA93D02.2030201@hotpy.org> <4EA94055.3080207@canterbury.ac.nz> <4EA94304.1010909@hotpy.org> <4EAB2994.6010103@canterbury.ac.nz> <4EABAD40.1010002@canterbury.ac.nz> <4EAC9CC5.5010101@canterbury.ac.nz> Message-ID: <4EAD6EA0.40308@hotpy.org> Greg Ewing wrote: > Paul Moore wrote: >> There's one further piece of the puzzle I'd >> like to see covered, though - with the language support as defined in >> the PEP, can the functionality be implemented library-style (like Lua) >> or does it need new syntax? > > While it might not need new syntax, it does hinge > fundamentally on changing the way the interpreter loop > works, so I don't think a library implementation would > be possible. > You are correct in saying that it would require a change to the interpreter loop, but that does not mean it needs new syntax. I still think that a Coroutine class is the cleanest interface. It seems to me that the greenlets approach may have to be the way forward. IMHO the yield-from approach is rather clunky. I think that flattening the interpreter is the most elegant solution, but the C calls present in all operators and many functions are a killer. Cheers, Mark. From sven at marnach.net Sun Oct 30 19:31:36 2011 From: sven at marnach.net (Sven Marnach) Date: Sun, 30 Oct 2011 18:31:36 +0000 Subject: [Python-ideas] Allow iterable argument to os.walk() In-Reply-To: References: Message-ID: <20111030183136.GC4087@pantoffel-wg.de> Nick Coghlan wrote on Sun, 30 Oct 2011, at 09:32:18 +1000: > If people want to walk multiple directories, 3.3 will make that pretty easy: > > def walk_dirs(dirs): > for dir in dirs: > yield from os.walk(dir) In earlier versions you can use itertools.chain.from_iterable(map(os.walk, dirs)) to get an iterator that successively walks through the directory trees rooted at the directories in "dirs". -- Sven From Nikolaus at rath.org Sun Oct 30 21:41:42 2011 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sun, 30 Oct 2011 16:41:42 -0400 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions In-Reply-To: <20111030001801.2f52ceb2@pitrou.net> (Antoine Pitrou's message of "Sun, 30 Oct 2011 00:18:01 +0200") References: <20111030001801.2f52ceb2@pitrou.net> Message-ID: <87mxcitdo9.fsf@vostro.rath.org> Antoine Pitrou writes: > This PEP proposes the addition of a ``__qname__`` attribute to functions > and classes. For top-level functions and classes, the ``__qname__`` > attribute is equal to the ``__name__`` attribute. For nested classes, > methods, and nested functions, the ``__qname__`` attribute contains a > dotted path leading to the object from the module top-level. Did you consider making it an array of the actual class/function objects instead of a string? It seems to me that going from the actual objects to the string representation is much easier than having to go the other way around. Best, -Nikolaus -- “Time flies like an arrow, fruit flies like a Banana.” 
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C From jxo6948 at rit.edu Sun Oct 30 21:51:08 2011 From: jxo6948 at rit.edu (John O'Connor) Date: Sun, 30 Oct 2011 16:51:08 -0400 Subject: [Python-ideas] Allow iterable argument to os.walk() In-Reply-To: References: Message-ID: > No, because it means you end up having to special case strings so > they're treated atomically. We already do that in a few > string-specific APIs (e.g. startswith()/endswith()), but it's still > not a particularly nice pattern. It would be a special case, but I don't understand why that is a bad thing or why it is appropriate to do in some places but not others. Perhaps I am missing something. In this case, I think conceptually it makes sense to perform the "walk" on an iterable and also that one would expect strings to be treated atomically. I am well aware that not every 2-3 line recipe needs to be a new function but having more routines operate on iterables (where it makes sense) seems like a good idea. As an aside, I question the aesthetics of yet another for loop for the following. At least python has a nice for loop construct :) for dir in dirs: for root, dirs, files in os.walk(dir): for name in files: .... for name in dirs: .... From solipsis at pitrou.net Sun Oct 30 21:51:14 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Oct 2011 21:51:14 +0100 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions References: <20111030001801.2f52ceb2@pitrou.net> <87mxcitdo9.fsf@vostro.rath.org> Message-ID: <20111030215114.43d80761@pitrou.net> On Sun, 30 Oct 2011 16:41:42 -0400 Nikolaus Rath wrote: > Antoine Pitrou writes: > > This PEP proposes the addition of a ``__qname__`` attribute to functions > > and classes. For top-level functions and classes, the ``__qname__`` > > attribute is equal to the ``__name__`` attribute. For nested classes, > > methods, and nested functions, the ``__qname__`` attribute contains a > > dotted path leading to the object from the module top-level. > > Did you consider making it an array of the actual class/function objects > instead of a string? It seems to me that going from the actual objects > to the string representation is much easier than having to go the other > way around. That would add a ton of references and reference cycles, keeping objects alive while they really shouldn't, and making cleanup in the face of __del__ methods much more problematic than it already is. Regards Antoine. From greg.ewing at canterbury.ac.nz Sun Oct 30 22:16:23 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 31 Oct 2011 10:16:23 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> Message-ID: <4EADBEA7.9000608@canterbury.ac.nz> Nick Coghlan wrote: > The fact that greenlets exists (and works) demonstrates that it is > possible to correctly manage and execute multiple Python stacks within > a single OS thread. I'm not convinced that greenlets work correctly under all possible circumstances. As I understand, they work by copying pieces of C stack in and out when switching tasks. That fails if a pointer to local storage is kept anywhere that can be reached by a different task. I seem to remember there being an issue with Tk doing something like this and causing problems for Stackless. 
-- Greg From greg.ewing at canterbury.ac.nz Sun Oct 30 22:46:07 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 31 Oct 2011 10:46:07 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> Message-ID: <4EADC59F.4030004@canterbury.ac.nz> Another problem with greenlets I've just thought of: What happens if the last reference to the object holding a piece of inactive C stack is dropped? There doesn't seem to be a way of finding any Python references it contains and cleaning them up. -- Greg From ncoghlan at gmail.com Sun Oct 30 22:49:12 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 07:49:12 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EADBEA7.9000608@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> Message-ID: On Mon, Oct 31, 2011 at 7:16 AM, Greg Ewing wrote: > Nick Coghlan wrote: > >> The fact that greenlets exists (and works) demonstrates that it is >> possible to correctly manage and execute multiple Python stacks within >> a single OS thread. > > I'm not convinced that greenlets work correctly under all > possible circumstances. As I understand, they work by copying > pieces of C stack in and out when switching tasks. 
That > > fails if a pointer to local storage is kept anywhere that > > can be reached by a different task. I seem to remember there > > being an issue with Tk doing something like this and causing > > problems for Stackless. > > Yeah, that sentence should have had a "mostly" inside the parentheses :) > > However, I think it may be more fruitful to pursue an approach that > uses greenlets as the foundation, and works to close the loopholes > (e.g. by figuring out a way for C code to mark itself as "coroutine > safe" and assuming it is not safe by default) rather than trying to > figure out how to make a "generators all the way down" approach handle > invocation of arbitrary slots. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm at mired.org Sun Oct 30 23:17:49 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 30 Oct 2011 15:17:49 -0700 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> Message-ID: <20111030151749.554298a0@bhuda.mired.org> On Sun, 30 Oct 2011 20:55:29 +1000 Nick Coghlan wrote: > On Sun, Oct 30, 2011 at 8:29 PM, Mike Meyer wrote: > > h = [1, 2, 3] > > d = dict(a=1, b=2) > > h += d # works > > h = h + d # exception > No, the reason the latter fails is because it's unclear what the type > of the result should be and the interpreter refuses to guess. In the > former case, there's no such ambiguity (since the type of 'h' simply > stays the same). While that's perfectly reasonable, it implies that some result type other than list is possible. I can't see what that would be since you can't add dictionaries. The only other iterable type you can add is tuples - and they don't display this behavior. So why not go ahead and fix this "bug" by making list.__add__ try converting the other object to a list, and adding a list.__radd__ with the same behavior? For that matter, a similar tweak for tuples wouldn't be inappropriate, and then the case of adding tuples & lists can raise a type error, paralleling the behavior of decimals and floats in the number hierarchy. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Sun Oct 30 23:31:18 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 08:31:18 +1000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: <20111030151749.554298a0@bhuda.mired.org> References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> Message-ID: On Mon, Oct 31, 2011 at 8:17 AM, Mike Meyer wrote: > The only other iterable type you can add is tuples - and they don't > display this behavior. So why not go ahead and fix this "bug" by > making list.__add__ try converting the other object to a list, and > adding a list.__radd__ with the same behavior? > > For that matter, a similar tweak for tuples wouldn't be inappropriate, > and then the case of adding tuples & lists can raise a type error, > paralleling the behavior of decimals and floats in the number > hierarchy. The universe of container types is a lot bigger than the Python builtins. 
Suppose we made list.__add__ as permissive as list.__iadd__. Further suppose we make collections.deque.__add__ similarly permissive. Now "[] + deque() == []", but "deque() + [] == deque()". Contrast that with numeric types, which are designed to work together as a numeric tower, such that the type of a binary operation's result does *not* depend on the order of the operands: "1 + 1.1 == 2.1" and "1.1 + 1 == 2.1". Commutativity is an important expected property for addition and multiplication. Even NumPy's arrays respect that by using those operations for the element-wise equivalents - they use a separate method for matrix multiplication (which is not commutative by definition). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sun Oct 30 23:40:01 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 08:40:01 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> Message-ID: On Mon, Oct 31, 2011 at 8:12 AM, Matt Joiner wrote: > +10 for greenlet style coroutines. It's a very modern feature and will put > python in an excellent position. Also nick your points about asynchronous io > being the main use case are spot on as far as I'm concerned. This whole thread has been very helpful to me in understanding not only why I think Greg's coroutine PEP is important, but also why the whole generators-as-coroutines paradigm feels so restrictive. It means we basically have two potential paths forward: 1. Stackless Python (aka greenlets). Pros: known to work for a large number of use cases Cons: need to define a mechanism to declare that the C stack is being used in a "coroutine friendly" fashion 2. Implicit creation of generator-style suspendable frames Pros: shouldn't need assembly level hackery Cons: effectively requires duplication of every C level type slot with a coroutine friendly equivalent that supports suspension Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From arnodel at gmail.com Sun Oct 30 23:40:05 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Sun, 30 Oct 2011 22:40:05 +0000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> Message-ID: On 30 October 2011 22:31, Nick Coghlan wrote: > Commutativity is an important expected property for addition and > multiplication. Even NumPy's arrays respect that by using those > operations for the element-wise equivalents - they use a separate > method for matrix multiplication (which is not commutative by > definition). The convention in mathematics is that addition is commutative but there is no such assumption for multiplication. In fact, multiplicative notation is commonly used for the composition law of non-commutative groups. Is there an established convention in computer languages? As for the commutativity of addition in python... well, string concatenation is not generally considered commutative :) -- Arnaud From ncoghlan at gmail.com Sun Oct 30 23:49:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 08:49:54 +1000 Subject: [Python-ideas] Should this be considered a bug? 
In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> Message-ID: On Mon, Oct 31, 2011 at 8:40 AM, Arnaud Delobelle wrote: > The convention in mathematics is that addition is commutative but > there is no such assumption for multiplication. ?In fact, > multiplicative notation is commonly used for the composition law of > non-commutative groups. > > Is there an established convention in computer languages? I was referring to the definition of multiplication in ordinary arithmetic (including complex numbers). Agreed that mathematicians in general are quite happy to use it for other relations that don't align with that definition. > As for the commutativity of addition in python... well, string > concatenation is not generally considered commutative :) Yeah, I was really talking about commutativity of result types rather than result values. The point about order mattering for values as soon as sequences get involved is a fair one (IIRC, that's one of the reasons set union uses '|' rather '+' - as a hint that the operation *is* commutative, even though container addition normally isn't). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From jimjjewett at gmail.com Sun Oct 30 23:57:57 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 30 Oct 2011 18:57:57 -0400 Subject: [Python-ideas] Draft PEP for the regularization of Python install layouts In-Reply-To: References: Message-ID: On Fri, Oct 28, 2011 at 5:53 PM, VanL wrote: > In part inspired by the virtualenv-in-the-stdlib PEP, I figured that it > might be a good time to draft up a PEP to fix one of my regular annoyances: > the ever-so-slightly different layouts for Python between platforms. Is this something that Python even *can* reasonably control, particularly on the various Linux distributions? What might be helpful would be a few more symbolics (if any are actually missing) and a few test environments that use something unexpected for each value, so that you *will* notice if you have hardcoded assumptions specific to your own setup. -jJ From mwm at mired.org Mon Oct 31 00:15:15 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 30 Oct 2011 16:15:15 -0700 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> Message-ID: <20111030161515.786a9f3c@bhuda.mired.org> On Mon, 31 Oct 2011 08:31:18 +1000 Nick Coghlan wrote: > On Mon, Oct 31, 2011 at 8:17 AM, Mike Meyer wrote: > > The only other iterable type you can add is tuples - and they don't > > display this behavior. So why not go ahead and fix this "bug" by > > making list.__add__ try converting the other object to a list, and > > adding a list.__radd__ with the same behavior? > > > > For that matter, a similar tweak for tuples wouldn't be inappropriate, > > and then the case of adding tuples & lists can raise a type error, > > paralleling the behavior of decimals and floats in the number > > hierarchy. > > The universe of container types is a lot bigger than the Python > builtins. Suppose we made list.__add__ as permissive as list.__iadd__. > Further suppose we make collections.deque.__add__ similarly > permissive. > > Now "[] + deque() == []", but "deque() + [] == deque()". 
Contrast that > with numeric types, which are designed to work together as a numeric > tower, such that the type of a binary operation's result does *not* > depend on the order of the operands: "1 + 1.1 == 2.1" and "1.1 + 1 == > 2.1". True. But addition on lists isn't commutative. [1] + [2] != [2] + [1]. > Commutativity is an important expected property for addition and > multiplication. Even NumPy's arrays respect that by using those > operations for the element-wise equivalents - they use a separate > method for matrix multiplication (which is not commutative by > definition). I agree. And if lists were commutative under addition, I'd say that was sufficient reason not to do this. But they aren't. http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From ncoghlan at gmail.com Mon Oct 31 00:46:42 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 09:46:42 +1000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: <20111030161515.786a9f3c@bhuda.mired.org> References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> <20111030161515.786a9f3c@bhuda.mired.org> Message-ID: On Mon, Oct 31, 2011 at 9:15 AM, Mike Meyer wrote: >> Commutativity is an important expected property for addition and >> multiplication. Even NumPy's arrays respect that by using those >> operations for the element-wise equivalents - they use a separate >> method for matrix multiplication (which is not commutative by >> definition). > > I agree. And if lists were commutative under addition, I'd say that > was sufficient reason not to do this. But they aren't. Yeah, I was thinking in terms of type inference, but writing in terms of values. I should have stuck with my original examples, which showed the type inference rather than using specific instances. Basically, the design principle at work here is "the type of the result of a binary operation should not depend on the order of the operands". It has its roots in commutativity, but is not commutativity as such (since that is about values, not types). This principle drives lots of aspects of Python's implicit type conversion design. For example: 1. The left hand operand is typically given the first go at handling the operation, but is expected to return NotImplemented if it doesn't recognise the other operand 2. The right hand operand is given first go in the case where it is an instance of a *subclass* of the type of the left hand operand, thus allowing subclasses to consistently override the behaviour of the parent class 3. Most types (with the notable exception of numeric types) will only permit binary operations with other instances of the same type. For numeric types, the coercion is arranged so that the type of the result remains independent of the order of the operands. I did find an unfortunate case where Python 3 violates this design principle - bytes and bytearray objects accept any object that implements the buffer interface as the other operand, even in the __add__ case. 
This leads to the following asymmetries: >>> b'' + bytearray() b'' >>> b'' + memoryview(b'') b'' >>> bytearray() + b'' bytearray(b'') >>> bytearray() + memoryview(b'') bytearray(b'') >>> memoryview(b'') + b'' Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytes' >>> memoryview(b'') + bytearray(b'') Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for +: 'memoryview' and 'bytearray' Now, the latter two cases are due to the problem I mentioned earlier where returning NotImplemented from sq_concat or sq_repeat doesn't work properly, but the bytes and bytearray interaction is exactly the kind of type asymmetry this guideline is intended to prevent. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From anacrolix at gmail.com Mon Oct 31 01:06:08 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Mon, 31 Oct 2011 11:06:08 +1100 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> Message-ID: This is excellent. A lot of languages are opening up new methods for concurrency, and Python should make sure to participate in this. In particular Haskell's lightweight threads, "sparks", and golang's "goroutines" are of this form and also provide builtin asynchronous IO. I think a feature like this requires some standardization (in the form of the standard library) in order that all third-parties can cooperate on the same implementation. I'm not sure that option 2 that Nick provides plays nice with the C compatibility of the CPython implementation. I've had a lot of success with the greenlet model, it's quite trivial to wrap it up to implicitly spawn an IO loop under the covers. The downside is that all the client code needs to be adjusted to defer blocking calls to the loop, or the entire standard library must be hooked. Again, this doesn't play well with the C stuff: Native C modules, or any third party calls that don't expect to be part of the greenlet model will block entire threads. I'd love to see suggestions that enable existing C code to function as expected; otherwise an opt-in system will be needed (which is how my own implementations operate). Again, if some coroutine stuff was baked into the standard library, that would enable third-parties to reliably write modules that could rely on support for coroutines being available. I find Greg's coroutine PEP confusing, and I don't see why an existing reliable model can't be used (looking at you greenlets). On Mon, Oct 31, 2011 at 9:40 AM, Nick Coghlan wrote: > On Mon, Oct 31, 2011 at 8:12 AM, Matt Joiner wrote: >> +10 for greenlet style coroutines. It's a very modern feature and will put >> python in an excellent position. Also nick your points about asynchronous io >> being the main use case are spot on as far as I'm concerned. > > This whole thread has been very helpful to me in understanding not > only why I think Greg's coroutine PEP is important, but also why the > whole generators-as-coroutines paradigm feels so restrictive. > > It means we basically have two potential paths forward: > > 1. Stackless Python (aka greenlets). > Pros: known to work for a large number of use cases > 
Cons: need to define a mechanism to declare that the C stack is > being used in a "coroutine friendly" fashion > > 2. Implicit creation of generator-style suspendable frames > Pros: shouldn't need assembly level hackery > Cons: effectively requires duplication of every C level type slot > with a coroutine friendly equivalent that supports suspension > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Oct 31 01:20:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 10:20:35 +1000 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> <20111030161515.786a9f3c@bhuda.mired.org> Message-ID: On Mon, Oct 31, 2011 at 9:46 AM, Nick Coghlan wrote: > I did find an unfortunate case where Python 3 violates this design > principle - bytes and bytearray objects accept any object that > implements the buffer interface as the other operand, even in the > __add__ case. Interestingly, I found a similar ordering dependency exists for set() and frozenset(): >>> set() | frozenset() set() >>> frozenset() | set() frozenset() So the status quo is less consistent than I previously thought. Regardless, it isn't enough to merely point out that the status quo is inconsistent. It's necessary to make a case for why the proposed change is sufficiently better to be worth the cost of making the change: http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html Consider the "it's not worth the hassle" stodgy core developer response invoked ;) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Mon Oct 31 02:44:40 2011 From: guido at python.org (Guido van Rossum) Date: Sun, 30 Oct 2011 18:44:40 -0700 Subject: [Python-ideas] Allow iterable argument to os.walk() In-Reply-To: References: Message-ID: On Sun, Oct 30, 2011 at 1:51 PM, John O'Connor wrote: >> No, because it means you end up having to special case strings so >> they're treated atomically. We already do that in a few >> string-specific APIs (e.g. startswith()/endswith()), but it's still >> not a particularly nice pattern. > > It would be a special case, but I don't understand why that is a bad > thing or why it is appropriate to do in some places but not others. Maybe you forgot what the Zen of Python says about special cases. :-) "Special cases aren't special enough to break the rules." > Perhaps I am missing something. In this case, I think conceptually it > makes sense to perform the "walk" on an iterable and also that one > would expect strings to be treated atomically. I am well aware that > not every 2-3 line recipe needs to be a new function but having more > routines operate on iterables (where it makes sense) seems like a good > idea. It is often a good idea when the routine currently takes a specific iterable type (e.g. list). It is rarely a good idea to have a function that acts either on something of type X or a list of things of type X. > As an aside, I question the aesthetics of yet another for loop for the > following. At least python has a nice for loop construct :) > > for dir in dirs: > for root, dirs, files in os.walk(dir): > for name in files: > .... > for name in dirs: > .... Whatever. 
:) -- --Guido van Rossum (python.org/~guido) From mwm at mired.org Mon Oct 31 04:08:57 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 30 Oct 2011 20:08:57 -0700 Subject: [Python-ideas] Should this be considered a bug? In-Reply-To: References: <79f3fbb5-4b89-4d1d-95cc-ec1286f16fe9@email.android.com> <20111030151749.554298a0@bhuda.mired.org> <20111030161515.786a9f3c@bhuda.mired.org> Message-ID: <20111030200857.259b9236@bhuda.mired.org> On Mon, 31 Oct 2011 10:20:35 +1000 Nick Coghlan wrote: > On Mon, Oct 31, 2011 at 9:46 AM, Nick Coghlan wrote: > So the status quo is less consistent than I previously thought. > > Regardless, it isn't enough to merely point out that the status quo is > inconsistent. It's necessary to make a case for why the proposed > change is sufficiently better to be worth the cost of making the > change: http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html > > Consider the "it's not worth the hassle" stodgy core developer > response invoked ;) I'll buy both arguments. Hopefully, someone is keeping a list of those inconsistencies (set vs. frozenset, etc.) to fix them at some point in the future. Thanks, http://www.mired.org/ Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From mwm at mired.org Mon Oct 31 04:11:43 2011 From: mwm at mired.org (Mike Meyer) Date: Sun, 30 Oct 2011 20:11:43 -0700 Subject: [Python-ideas] Concurrent safety? Message-ID: <20111030201143.481fdca2@bhuda.mired.org> This is a blue sky idea. It can't happen in Python 3.x, and possibly not ever in cPython. I'm mostly hoping to get smarter people than me considering the issue. Synopsis Python, as a general rule, tries to be "safe" about things. If something isn't obviously correct, it tends to throw runtime errors to let you know that you need to be explicit about what you want. When there's not an obvious choice for type conversions, it raises an exception. You generally don't have to worry about resource allocation if you stay in python. And so on. The one glaring exception is in concurrent programs. While the tools python has for dealing with such are ok, there isn't anything to warn you when you fail to use those tools and should be. The goal of this proposal is to fix that, and get the Python interpreter to help locate code that isn't safe to use in concurrent programs. Existence Proof This is possible. Clojure is a dynamic language in the LISP family that will throw exceptions if you try mutating variables without properly protecting them against concurrent access. This is not to say that the Clojure solution is the solution, or even the right solution for Python. It's just to demonstrate that this can be done. Object Changes Object semantics don't need to change very much. The existing immutable types will work well in this environment exactly as is. The mutable types - well, we can no longer go changing them willy-nilly. But any language needs mutable types, and there's nothing wrong with the ones we have. Since immutable types don't require access protection *at all*, it might be worthwhile to add a new child of object, "immutable". Instances of this type would be immutable after creation. Presumably, the __new__ method of Python classes inheriting from immutable would be used to set the initial attributes, but the __init__ method might also be able to handle that role. However, this is a performance tweak, allowing user-written classes to skip any runtime checks for being mutated. 
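To make the idea concrete, here is a rough emulation of such a base class in today's Python (illustrative only - the proposed built-in would enforce immutability in the interpreter rather than via __setattr__):

    class Immutable:
        # Instances are initialised via keyword arguments to __new__.
        def __new__(cls, **attrs):
            self = super().__new__(cls)
            for name, value in attrs.items():
                # bypass the guard below during construction
                object.__setattr__(self, name, value)
            return self

        def __setattr__(self, name, value):
            raise TypeError("%s instances are immutable" % type(self).__name__)

        def __delattr__(self, name):
            raise TypeError("%s instances are immutable" % type(self).__name__)

    class Point(Immutable):
        pass

    p = Point(x=1, y=2)   # p.x = 3 would now raise TypeError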
Binding Changes One of the ways objects are mutated is by changing their bindings. As such, some of the bindings might need to be protected. Local variables are fine. We normally can't export those bindings to other functions, just the values bound to them. So changing the binding can stay the same. The bound object can be exported to other threads of execution, but changing it will fall under the rules for changing objects. Ditto for nonlocals. On the other hand, rebindings of module and class and instance variables can be visible in other threads of execution, so they require protection, just like changing mutable objects. New Syntax The protection mechanism is the change to the language. I propose a single new keyword, "locking", that acts similar to the "try" keyword. The syntax is: 'locking' value [',' value]* ':' suite The list of values are the objects that can be mutated in this lock. An immutable object showing up in the list of values is a TypeError. It's not clear that function calls should be allowed in the list of values. On the other hand, indexing and attributes clearly should be, and those can turn into function calls, so it's not clear they shouldn't be allowed, either. The locked values can be mutated during the body of the locking suite. For the builtin mutable types, this means invoking their mutating methods. For modules, classes and object instances, it means rebinding their attributes. Locked objects stay locked during function invocations in the suite. This means you can write utility functions that expect to be passed locked objects to mutate. A locking statement can be used inside of another locking statement. See the Implementation section for possible restrictions on this. Any attempt to mutate an object that isn't currently locked will raise an exception. Possibly ValueError, possibly a new exception class just for this purpose. This includes rebinding attributes of objects that aren't locked. Implementation There are at least two ways this can be implemented, both with different restrictions on the suite. While both of them can probably be optimized if it's known that there are no other threads of execution, checking for attempts to mutate unlocked objects should still happen. 1) Conventional locking All the objects being locked have locks attached to them, which are locked before entering the suite. The implementation must order the locked objects in some repeatable way, so that two locking statements that have more than one locked object in common will obtain the locks on those objects in the same order. This will prevent deadlocks. This method will require that the initial locking statement lock all objects that may be locked during the execution of its suite. This may be a reason for allowing functions as locking values, as a way to get locks on objects that code called in the suite is going to need. Another downside is that the programmer needs to handle exceptions raised during the suite to ensure that a set of related changes leaves the relevant objects in a consistent state. In this case, an optional 'except' clause should be added to the locking statement to hold such code. 
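For comparison, strategy 1 can be approximated with today's threading tools (a hypothetical helper only - it implements the ordered lock acquisition, but not the detection of mutations to unlocked objects):

    import threading
    from contextlib import contextmanager

    _registry_guard = threading.Lock()
    _locks = {}   # id(obj) -> RLock; a real version would attach the lock
                  # to the object itself, since id() keys are only valid
                  # while the object stays alive

    def _lock_for(obj):
        with _registry_guard:
            if id(obj) not in _locks:
                # RLock so a nested locking statement may relock an object
                _locks[id(obj)] = threading.RLock()
            return _locks[id(obj)]

    @contextmanager
    def locking(*objects):
        # Acquire in a repeatable order (here: by id()) so that two
        # statements locking overlapping sets of objects cannot deadlock.
        locks = [_lock_for(obj) for obj in sorted(objects, key=id)]
        for lock in locks:
            lock.acquire()
        try:
            yield
        finally:
            for lock in reversed(locks):
                lock.release()

    # Usage:
    # with locking(shared_list, shared_dict):
    #     shared_list.append(item)
    #     shared_dict["count"] = shared_dict.get("count", 0) + 1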
If they haven't changed, they are replaced by the copies, and execution continues. If the originals have changed, the entire process starts over. In this implementation, the only actual locking is during the original fingerprinting process (to ensure that a consistent state is captured) and at the end of the suite. FWIW, this is one of the models provided by Clojure.

The restriction on the suite in this case is that running it twice - except for changes to the locked objects - needs to be acceptable. In this case, exceptions don't need to be handled by the programmer to ensure consistency. If an exception happens during the execution of the suite, the original values are never replaced.

http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

From ncoghlan at gmail.com Mon Oct 31 04:21:33 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 31 Oct 2011 13:21:33 +1000
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: <20111030201143.481fdca2@bhuda.mired.org>
References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: 

On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer wrote:
> The one glaring exception is in concurrent programs. While the tools
> python has for dealing with such are ok, there isn't anything to warn
> you when you fail to use those tools and should be.

This will basically run into the same problem that free-threading-in-CPython concepts do - the fine grained checks you need to implement it will kill your single-threaded performance. Since Python is a scripting language that sees heavy single-threaded use, that's not an acceptable trade-off.

Software transactional memory does offer some hope for a more reasonable alternative, but that has its own problems (mainly I/O related). It will be interesting to see how PyPy's experiments in this space pan out.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Mon Oct 31 07:54:32 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Oct 2011 19:54:32 +1300
Subject: [Python-ideas] Cofunctions - Back to Basics
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz>
Message-ID: <4EAE4628.7090604@canterbury.ac.nz>

Nick Coghlan wrote:
> It means we basically have two potential paths forward:
>
> 1. Stackless Python (aka greenlets).
>    Pros: known to work for a large number of use cases
>    Cons: need to define a mechanism to declare that the C stack is
>    being used in a "coroutine friendly" fashion
>
> 2. Implicit creation of generator-style suspendable frames
>    Pros: shouldn't need assembly level hackery
>    Cons: effectively requires duplication of every C level type slot
>    with a coroutine friendly equivalent that supports suspension

Or maybe there's a third way:

3. Recognise that, today, some people *are* using generators to do coroutine-like programming and finding it a useful technique, despite all of its limitations, and consider ways to make it easier, even if we can't remove all of those limitations.

It seems to me that the inability to suspend from within a special method is not likely to be a problem very often in practice.

Inability to use generators as generators in a coroutine environment may be more serious, since it's quite likely people will want to e.g.
use a for-loop to iterate over lines being read from a socket, and suspend while waiting for data to arrive. I think that particular problem could be solved, given some more thought. The question is, would it be worth the effort? -- Greg From greg.ewing at canterbury.ac.nz Mon Oct 31 08:08:40 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 31 Oct 2011 20:08:40 +1300 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> Message-ID: <4EAE4978.5000900@canterbury.ac.nz> Matt Joiner wrote: > I've had a lot of success > with the greenlet model, it's quite trivial to wrap it up to > implicitly spawn an IO loop under the covers. The downside is that all > the client code needs to be adjusted to defer blocking calls to the > loop, or the entire standard library must be hooked. Again, this > doesn't play well with the C stuff: Native C modules, or any third > party calls that don't expect to be part of the greenlet model will > block entire threads. That seems to be true of *any* approach, short of using a hacked libc that replaces all the system calls. -- Greg From ncoghlan at gmail.com Mon Oct 31 09:15:43 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Oct 2011 18:15:43 +1000 Subject: [Python-ideas] Cofunctions - Back to Basics In-Reply-To: <4EAE4628.7090604@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> <4EAE4628.7090604@canterbury.ac.nz> Message-ID: On Mon, Oct 31, 2011 at 4:54 PM, Greg Ewing wrote: > Or maybe there's a third way: > > 3. Recognise that, today, some people *are* using generators > ? to do coroutine-like programming and finding it a useful > ? technique, despite all of its limitations, and consider > ? ways to make it easier, even if we can't remove all of > ? those limitations. > > It seems to me that the inability to suspend from within a > special method is not likely to be a problem very often in > practice. Doing blocking I/O in special methods (other than __next__) is generally a dubious practice anyway, so you're right that disallowing suspension when such a function is on the stack is unlikely to be a major problem. > Inability to use generators as generators in a coroutine > environment may be more serious, since it's quite likely > people will want to e.g. use a for-loop to iterate over > lines being read from a socket, and suspend while waiting > for data to arrive. > > I think that particular problem could be solved, given > some more thought. The question is, would it be worth the > effort? Yes, I think that one needs to be solved. The issue of providing coroutine-friendly iterative builtins and itertools functionality also need to be addressed. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Mon Oct 31 09:42:43 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 31 Oct 2011 21:42:43 +1300 Subject: [Python-ideas] Cofunctions - Getting away from the iterator protocol In-Reply-To: References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> Message-ID: <4EAE5F83.9040305@canterbury.ac.nz> Thinking about how to support cofunctions iterating over generators that are themselves cofunctions and therefore suspendable, I've come to realise that cofunctionness and generatorness really need to be orthogonal concepts. As well as functions and cofunctions, we need generators and cogenerators. And thinking about how to allow *that* while building on the yield-from mechanism, I found myself inventing what amounts to a complete parallel implementation of the iterator protocol, in which cogenerators create coiterators having a __conext__ method that raises CoStopIteration when finished, etc, etc... :-\ At which point I thought, well, why not forget about using the iterator protocol as a basis altogether, and design a new protocol specifically designed for the purpose? About then I also remembered a thought I had in passing earlier, when Nick was talking about the fact that, when using yield-from, there is no object available that can hold a stack of suspended generators, so you end up traversing an ever-lengthening chain of generator frames as the call stack gets deeper. However, with cofunctions there *is* a place where we could create such an object -- it could be done by costart(). We just need to allow it to be passed to the places where it's needed, and with a brand-new protocol we have the chance to do that. I'll give a sketch of what such a protocol might be like in my next post. -- Greg From jimjjewett at gmail.com Mon Oct 31 15:17:14 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 31 Oct 2011 10:17:14 -0400 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions In-Reply-To: <20111030001801.2f52ceb2@pitrou.net> References: <20111030001801.2f52ceb2@pitrou.net> Message-ID: How meaningful are the extra two slots for every function or class object? Have you done benchmarks like the Unicode Changes PEP has? -jJ From amauryfa at gmail.com Mon Oct 31 15:33:35 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 31 Oct 2011 15:33:35 +0100 Subject: [Python-ideas] Concurrent safety? In-Reply-To: <20111030201143.481fdca2@bhuda.mired.org> References: <20111030201143.481fdca2@bhuda.mired.org> Message-ID: Hi, 2011/10/31 Mike Meyer > Any attempt to mutate an object that isn't currently locked will raise > an exception. Possibly ValueError, possibly a new exception class just > for this purpose. This includes rebinding attributes of objects that > aren't locked. > PyPy offers a nice platform to play with this kind of concepts. For example, even if it's not the best implementation, it's easy to add a __setattr__ to the base W_Object class, which will check whether the object is allowed to mutate. But you certainly will open a can of worms here: even immutable objects are modified (e.g str.__hash__ is cached) and many functions that you call will need to add their own locks, is it possible to avoid deadlocks in this case? -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From solipsis at pitrou.net Mon Oct 31 15:33:20 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2011 15:33:20 +0100
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
References: <20111030001801.2f52ceb2@pitrou.net>
Message-ID: <20111031153320.1e69dc41@pitrou.net>

On Mon, 31 Oct 2011 10:17:14 -0400 Jim Jewett wrote:
> How meaningful are the extra two slots for every function or class object?

It's only one extra slot per function or class. It represents a 6% increase for functions, and a 1% increase for classes (not counting the space taken by the __qname__ string itself, but it will usually be shared amongst many objects).

> Have you done benchmarks like the Unicode Changes PEP has?

I don't expect it to have any interesting impact. What benchmarks do you have in mind?

Regards
Antoine.

From dirkjan at ochtman.nl Mon Oct 31 16:42:00 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 31 Oct 2011 16:42:00 +0100
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: <20111030001801.2f52ceb2@pitrou.net>
References: <20111030001801.2f52ceb2@pitrou.net>
Message-ID: 

On Sun, Oct 30, 2011 at 00:18, Antoine Pitrou wrote:
> I would like to propose the following PEP for discussion and, if
> possible, acceptance. I think the proposal shouldn't be too
> controversial (I find it quite simple and straightforward myself :-)).

Are these names relative or fully absolute? I.e. I've had problems in the past with unpickling objects that were pickled from a module that was imported using a relative import. Would it be possible to define the qname such that the full path to the name, starting from a sys.path level down, is always used?

Cheers,
Dirkjan

From solipsis at pitrou.net Mon Oct 31 16:50:03 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2011 16:50:03 +0100
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
References: <20111030001801.2f52ceb2@pitrou.net>
Message-ID: <20111031165003.7cb104d3@pitrou.net>

On Mon, 31 Oct 2011 16:42:00 +0100 Dirkjan Ochtman wrote:
> On Sun, Oct 30, 2011 at 00:18, Antoine Pitrou wrote:
> > I would like to propose the following PEP for discussion and, if
> > possible, acceptance. I think the proposal shouldn't be too
> > controversial (I find it quite simple and straightforward myself :-)).
>
> Are these names relative or fully absolute? I.e. I've had problems in
> the past with unpickling objects that were pickled from a module that
> was imported using a relative import. Would it be possible to define
> the qname such that the full path to the name, starting from a
> sys.path level down, is always used?

The __qname__, by design, doesn't include any module name. To get the "full path", you still have to add in the __module__ attribute. Solving the problems with relative imports (I'm not sure what they are) is another problem which Nick is apparently tackling (?).

Regards
Antoine.

From luoyonggang at gmail.com Mon Oct 31 16:57:15 2011
From: luoyonggang at gmail.com (罗勇刚 (Yonggang Luo))
Date: Mon, 31 Oct 2011 23:57:15 +0800
Subject: [Python-ideas] When I use Python under Windows, I found some file handles are not closed
Message-ID: 

How do I detect where those handles are created, so that I can trace them and close them? Mainly because I was using a C binding library (subvertpy) and a file is not closed.

--
Yours sincerely,
Yonggang Luo

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From merwok at netwok.org Mon Oct 31 18:02:55 2011
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 31 Oct 2011 18:02:55 +0100
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: 
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org>
Message-ID: <4EAED4BF.9050909@netwok.org>

Hi,

On 28/10/2011 19:51, Guido van Rossum wrote:
> On Fri, Oct 28, 2011 at 9:15 AM, Éric Araujo wrote:
>> Hm. Sometimes we want the class name, sometimes module.class, so even
>> with the change we won't always be able to use str(cls).
> It is a well-known fact of humanity that you can't please anyone.
> There's not that much data on how often the full name is better; my
> hunch however is that most of the time the class name is sufficiently
> unique within the universe of classes that could be printed, and
> showing the module name just feels pedantic. Apps that know just the
> name is sub-optimal should stick to rendering using cls.__module__ and
> cls.__name__.

Fair enough.

>> The output of repr and str is not (TTBOMK) exactly defined or
>> guaranteed; nonetheless, I expect that many people (including me) rely
>> on some conversions (like the fact that repr('somestr') includes
>> quotes). So we can change str(cls) and say that *now* it has defined
>> output, or leave it alone to avoid breaking code that does depend on the
>> output, which can be seen as a wrong thing or a pragmatic thing ("I need
>> it and it works").
> In my view, str() and repr() are both for human consumption (though in
> somewhat different contexts). If tweaking them helps humans understand
> the output better then let's tweak them. If you as a developer feel
> particularly anal about how you want your object printed, you should
> avoid repr() or str() and write your own formatting function.
>
> If as a programmer you feel the urge to go parse the output of repr()
> or str(), you should always *know* that a future version of Python can
> break your code, and you should file a feature request to have an API
> added to the class so you won't have to parse the repr() or str().

Okay! I will update the patch to change str(func) and str(module). As it's a debugging aid meant for humans, I won't update the doc or stdlib to recommend using str(x) instead of x.__name__, and everyone should be happy (or complain before the final release).

Cheers

From ethan at stoneleaf.us Mon Oct 31 18:10:28 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 31 Oct 2011 10:10:28 -0700
Subject: [Python-ideas] Changing str(someclass) to return only the class name
In-Reply-To: 
References: <4EA18598.9060602@netwok.org> <4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org>
Message-ID: <4EAED684.1010707@stoneleaf.us>

Guido van Rossum wrote:
> In my view, str() and repr() are both for human consumption

I was under the impression that repr() was for eval consumption (when possible).
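A quick sketch of the "(when possible)" part from an interactive session (the second example shows where the round trip breaks down; the address will vary):

    >>> eval(repr([1, 'two', 3.0]))     # basic types round-trip
    [1, 'two', 3.0]
    >>> class C: pass
    ...
    >>> repr(C())                       # no eval()-able form for most objects
    '<__main__.C object at 0x7f...>'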
~Ethan~

From ron3200 at gmail.com Mon Oct 31 18:57:30 2011
From: ron3200 at gmail.com (Ron Adam)
Date: Mon, 31 Oct 2011 12:57:30 -0500
Subject: [Python-ideas] Cofunctions - Getting away from the iterator protocol
In-Reply-To: <4EAE5F83.9040305@canterbury.ac.nz>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> <4EAE5F83.9040305@canterbury.ac.nz>
Message-ID: <1320083850.5984.115.camel@Gutsy>

On Mon, 2011-10-31 at 21:42 +1300, Greg Ewing wrote:
> Thinking about how to support cofunctions iterating over
> generators that are themselves cofunctions and therefore
> suspendable, I've come to realise that cofunctionness
> and generatorness really need to be orthogonal concepts.
> As well as functions and cofunctions, we need generators
> and cogenerators.
>
> And thinking about how to allow *that* while building
> on the yield-from mechanism, I found myself inventing
> what amounts to a complete parallel implementation of
> the iterator protocol, in which cogenerators create
> coiterators having a __conext__ method that raises
> CoStopIteration when finished, etc, etc... :-\
>
> At which point I thought, well, why not forget about
> using the iterator protocol as a basis altogether, and
> design a new protocol specifically designed for the
> purpose?
>
> About then I also remembered a thought I had in passing
> earlier, when Nick was talking about the fact that,
> when using yield-from, there is no object available that
> can hold a stack of suspended generators, so you end
> up traversing an ever-lengthening chain of generator
> frames as the call stack gets deeper.

I keep coming back to resumable exceptions as a way to suspend and resume, separate from the yield data path. Looking up previous discussions about it, there have been requests for them, but for the most part, those specific uses were better handled in other ways. There is also the problem that at a low level, they can be non-deterministic about where exactly the exception occurred. These negatives have pretty much killed any further discussion.

If we try to make all exceptions resumable (either at the line where they occurred or the line after), there are a lot of questions as to just how to make it work in a nice way. Enough so that it isn't worth doing.

But I think there is a way to look at them in a positive light, if we put some strict requirements on the idea:

1. Only have a *SINGLE* exception type as being resumable.

2. That exception should *NEVER* occur naturally.

3. Only allow continuing after it's *EXPLICITLY RAISED* by a raise statement.

All of the problem issues go away with those requirements in place, and you only have the issue of how to actually write the patch. Earlier discussions indicated it might not be that hard to do.

Just like ValueError, a ResumableException would/should *never* occur naturally, so there is no issue about where it happened. So as long as they are limited to a single exception type, it may be doable in a clean way. Also, they could be sub-classed to in effect offer a multi-channel way to easily control co-routines.

The problems with existing designs...

I've been playing around with co-function pipes where the data is pulled through. Once you figure out how they work, they are fairly easy to design, as they are just generators that are linked together in a chain.
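A minimal sketch of the shape I mean (made-up stage names; the consumer end drives everything through ordinary next() calls):

    def source(n):
        for i in range(n):
            yield i

    def double(items):
        for x in items:
            yield 2 * x        # one item is processed per next() call

    pipe = double(double(source(3)))
    print(list(pipe))          # [0, 4, 8]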
At the source end is of course some sort of data source or generator, and each link operates on items as they are pulled by a next() call on the other end. In a single pipe design, the last item is also the consumer and pulls things through the pipe as it needs it. But if you need to regulate the timing or run more than one pipe at a time, it requires adding an additional controller generator at the end that serves as part of the framework. The scheduler then does a next() call on the end generator, which causes an item to be pulled through the pipe. The consumer next to the end must push the data somewhere in that case.

This type of design has a major problem: the speed at which the pipe works is determined by how long it takes for data to go through it. That's not good if we are trying to run many pipes at once. Since they can only suspend at yields, it requires sending scheduling data through the pipe along with (or instead of) the data and sorting it back out at some point. Once we do that, our general purpose generators become tied to the framework. The dual iterator protocol is one way around that.

A trampoline can handle this because it sits between every co-function, so it can check the data for signal objects or types that can be handled outside of the coroutines, and then push back the data that it doesn't know how to handle. That works, but it still requires additional overhead to check those messages. And trampolines work by yielding generators, so they do require a bit of studying to understand how they work before you use them.

How a ResumableException type would help...

With a resumable exception type, we create a data path outside of functions and generators that is easy to parse. So inside the coroutines we only need to add a "raise ResumableException" to transfer control to the scheduler. And then in the scheduler it catches the exception, handles any message it may have, and saves it in a stack. And it can then resume each coroutine by possibly doing a "continue ResumableException". These could be easily sub-classed to create a scheduler ...

    while 1:
        thread = stack.popleft()
        try:
            continue thread
        except AddThread as e:
            ...
        except RemoveThread as e:
            ...
        except Block as e:
            ...
        except Unblock as e:
            ...
        except ResumableException as e:
            thread = e
            stack.append(thread)

Where each of those is a sub-class of ResumableException. Notice that the scheduler is completely separate from the data path: it doesn't need to get the data, test it, and refeed it back in like a trampoline does. It also doesn't need any if-else structure, and doesn't need to access any methods at the Python level if the "continue" can take the exception directly. So it should be fast.

A plain "raise ResumableException" could, by default, be auto-continued if it isn't caught. That way you can write a generator that can work both in threads and by itself. (It's not an error.)

There would be no change or overloading of "yield" to make these work as threads. "Yield from" should work as it was described just fine and would be complementary to this. It just makes the whole idea of coroutine-based lightweight threads a whole lot easier to use and understand.

I think this idea would also have uses in other asynchronous designs. But it definitely should be limited to a single exception type with the requirements I stated above.

Cheers,
Ron

> However, with cofunctions there *is* a place where we
> could create such an object -- it could be done by
> costart().
We just need to allow it to be passed to
> the places where it's needed, and with a brand-new
> protocol we have the chance to do that.
>
> I'll give a sketch of what such a protocol might be
> like in my next post.

From mwm at mired.org Mon Oct 31 18:59:56 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 31 Oct 2011 10:59:56 -0700
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: 
References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: 

On Sun, Oct 30, 2011 at 8:21 PM, Nick Coghlan wrote:
> On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer wrote:
>> The one glaring exception is in concurrent programs. While the tools
>> python has for dealing with such are ok, there isn't anything to warn
>> you when you fail to use those tools and should be.
>
> This will basically run into the same problem that
> free-threading-in-CPython concepts do - the fine grained checks you
> need to implement it will kill your single-threaded performance.

This argument seems familiar. Oh, right, it's the "lack of performance will kill you" one. That was given as the reason that all of the following were unacceptable:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All of those are believed (at least by their proponents) to make programming easier and/or faster at the cost of performance. The performance cost was "too high" for all of them when they were introduced, but they all became broadly accepted as the combination of increasing computing power (especially CPU support for them) and increasingly efficient implementation techniques drove that cost down to the point where it wasn't a problem except for very special cases.

> Since Python is a scripting language that sees heavy single-threaded use,
> that's not an acceptable trade-off.

Right - few languages manage to grow one of those features without a name change of some sort, much less two (with the obvious exception of LISP). Getting them usually requires moving to a new language. That's one reason I said it might never make it into CPython. But the goal is to get people to think about fixing the problems, not dismiss the suggestion because of problems that will go away if we just wait long enough.

For instance, the issue of single-threaded performance can be fixed by taking threading out of a library, and giving control of it to the interpreter. This possibility is why I said "thread of execution" instead of just "thread." If the interpreter knows when an application has concurrent execution, it also knows when there aren't any, so it can support an option not to make those checks until the performance issues go away.

> Software transactional memory does offer some hope for a more
> reasonable alternative, but that has its own problems (mainly I/O
> related). It will be interesting to see how PyPy's experiments in this
> space pan out.

Right - you can't do I/O inside a transaction. For writes, this isn't a problem. For reads, it is, since they imply binding and/or rebinding. So an STM solution may require a second mechanism designed for single statements to allow reads to happen.

References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: 

On Mon, Oct 31, 2011 at 7:33 AM, Amaury Forgeot d'Arc wrote:
> 2011/10/31 Mike Meyer
>> Any attempt to mutate an object that isn't currently locked will raise
>> an exception. Possibly ValueError, possibly a new exception class just
>> for this purpose.
This includes rebinding attributes of objects that
>> aren't locked.
> PyPy offers a nice platform to play with this kind of concept.
> For example, even if it's not the best implementation, it's easy to add a
> __setattr__ to the base W_Object class, which will check whether the object
> is allowed to mutate.
> But you certainly will open a can of worms here: even immutable objects are
> modified (e.g. str.__hash__ is cached) and many functions that you call will
> need to add their own locks. Is it possible to avoid deadlocks in this case?

Just what I need - another project. I'll go take a look at PyPy.

In theory, things that don't change the externally visible behavior of an object don't need to be checked. The goal isn't to make the language perfectly safe, it's to make you be explicit about when you want to do something that might not be safe in a concurrent environment.

This provides a perfect example of where an immutable subclass would be useful. It would add a __setattr__ that throws an exception. Then the string class (which would inherit from the immutable subclass) could cache its hash by doing something like W_Object.__setattr__(self, "_cached_hash", hash(self)) (possibly locked).

<4EA1AFB0.4080000@pearwood.info> <4EA27189.8010002@pearwood.info> <4EA32507.7010900@pearwood.info> <4EAAD51A.9030608@netwok.org> <4EAED684.1010707@stoneleaf.us>
Message-ID: <20111031191711.24dfda8a@pitrou.net>

On Mon, 31 Oct 2011 10:10:28 -0700 Ethan Furman wrote:
> Guido van Rossum wrote:
> > In my view, str() and repr() are both for human consumption
>
> I was under the impression that repr() was for eval consumption (when
> possible).

Only for basic types. repr() is generally not eval()-able.

Regards
Antoine.

From solipsis at pitrou.net Mon Oct 31 19:25:43 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2011 19:25:43 +0100
Subject: [Python-ideas] Concurrent safety?
References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: <20111031192543.6fbbe6ae@pitrou.net>

On Mon, 31 Oct 2011 10:59:56 -0700 Mike Meyer wrote:
> On Sun, Oct 30, 2011 at 8:21 PM, Nick Coghlan wrote:
> > On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer wrote:
> >> The one glaring exception is in concurrent programs. While the tools
> >> python has for dealing with such are ok, there isn't anything to warn
> >> you when you fail to use those tools and should be.
> >
> > This will basically run into the same problem that
> > free-threading-in-CPython concepts do - the fine grained checks you
> > need to implement it will kill your single-threaded performance.
>
> This argument seems familiar. Oh, right, it's the "lack of
> performance will kill you" one. That was given as the reason that all of
> the following were unacceptable:

Agreed, but killing performance by double digits in a new release is generally considered quite ugly by users.

Also, I'm not convinced that such approaches really bring anything. My opinion is that good multi-threaded programming is achieved through careful abstraction and separation of concerns, rather than advanced language idioms.

Regards
Antoine.

From mwm at mired.org Mon Oct 31 20:03:14 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 31 Oct 2011 12:03:14 -0700
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: <20111031192543.6fbbe6ae@pitrou.net>
References: <20111030201143.481fdca2@bhuda.mired.org> <20111031192543.6fbbe6ae@pitrou.net>
Message-ID: 

On Mon, Oct 31, 2011 at 11:25 AM, Antoine Pitrou wrote:
> On Mon, 31 Oct 2011 10:59:56 -0700
> Mike Meyer wrote:
>> On Sun, Oct 30, 2011 at 8:21 PM, Nick Coghlan wrote:
>> > On Mon, Oct 31, 2011 at 1:11 PM, Mike Meyer wrote:
>> >> The one glaring exception is in concurrent programs. While the tools
>> >> python has for dealing with such are ok, there isn't anything to warn
>> >> you when you fail to use those tools and should be.
>> >
>> > This will basically run into the same problem that
>> > free-threading-in-CPython concepts do - the fine grained checks you
>> > need to implement it will kill your single-threaded performance.
>>
>> This argument seems familiar. Oh, right, it's the "lack of
>> performance will kill you" one. That was given as the reason that all of
>> the following were unacceptable:
>
> Agreed, but killing performance by double digits in a new release is
> generally considered quite ugly by users.

That may be why languages rarely adopt such features. The users won't put up with the cost hit for the development versions. Except for LISP, of course, whose users know the value of everything but the cost of nothing :-).

> Also, I'm not convinced that such approaches really bring anything. My
> opinion is that good multi-threaded programming is achieved through
> careful abstraction and separation of concerns, rather than advanced
> language idioms.

Doesn't that cover all kinds of good programming? But advanced language features are there because they are supposed to help with either abstraction or separation of concerns. Look at the list I presented again:

- High level languages.
- Byte-compiled languages.
- Structured programming.
- Automatic memory management.
- Dynamic typing.
- Object Oriented languages.

All help with either abstraction or separation of concerns in some way (ok, for byte-compilation the concerns are external, in that it's code portability). So do the features I'd like to see. In particular, they let you separate code in which concurrency is a concern from code where it isn't.

Another aspect of this issue (and yet another possible reason that these features show up in new languages rather than being added to old ones) is that such changes usually require changing the way you think about programming. It takes a different mindset to program with while loops than with if & goto, or OO than procedural, or .... Similarly, it takes a different mindset to program in a language where changing an object requires special consideration. This may be too much of a straitjacket for a multi-paradigm language like Python (though Oz manages it), but making the warnings ignorable defeats the purpose.

<20111031192543.6fbbe6ae@pitrou.net>
Message-ID: <20111031201428.0a5c99d2@pitrou.net>

On Mon, 31 Oct 2011 12:03:14 -0700 Mike Meyer wrote:
>
> Doesn't that cover all kinds of good programming? But advanced
> language features are there because they are supposed to help with
> either abstraction or separation of concerns. Look at the list I
> presented again:
>
> - High level languages.
> - Byte-compiled languages.
> - Structured programming.
> - Automatic memory management.
> - Dynamic typing.
> - Object Oriented languages.
>
> All help with either abstraction or separation of concerns in some way
> (ok, for byte-compilation the concerns are external, in that it's code
> portability). So do the features I'd like to see.
I don't think the latter is true. Manually listing objects that are allowed to be mutated seems actually quite low-level to me.

Regards
Antoine.

From mwm at mired.org Mon Oct 31 20:31:50 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 31 Oct 2011 12:31:50 -0700
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: <20111031201428.0a5c99d2@pitrou.net>
References: <20111030201143.481fdca2@bhuda.mired.org> <20111031192543.6fbbe6ae@pitrou.net> <20111031201428.0a5c99d2@pitrou.net>
Message-ID: 

On Mon, Oct 31, 2011 at 12:14 PM, Antoine Pitrou wrote:
> On Mon, 31 Oct 2011 12:03:14 -0700
> Mike Meyer wrote:
>>
>> Doesn't that cover all kinds of good programming? But advanced
>> language features are there because they are supposed to help with
>> either abstraction or separation of concerns. Look at the list I
>> presented again:
>>
>> - High level languages.
>> - Byte-compiled languages.
>> - Structured programming.
>> - Automatic memory management.
>> - Dynamic typing.
>> - Object Oriented languages.
>>
>> All help with either abstraction or separation of concerns in some way
>> (ok, for byte-compilation the concerns are external, in that it's code
>> portability). So do the features I'd like to see.
>
> I don't think the latter is true. Manually listing objects that are
> allowed to be mutated seems actually quite low-level to me.

Did you not read the next paragraph, the one where I explained how it helped separate issues?

But you're right. Manually listing them isn't all that desirable, it's just better than what we have now. I initially saw a locking keyword with no list as implying that all such things should be locked automatically. I can't see how to do that with the locking implementation, so I dropped it. However, it can be done with an STM implementation.

Hmm. That could be the distinguishing feature to deal with IO: If you don't have the list, you get an STM, and it does the copy/fingerprint dance when you mutate something. If you list a value, you get real locks so it won't retry and you can safely do IO.

References: <20111030001801.2f52ceb2@pitrou.net> <20111031153320.1e69dc41@pitrou.net>
Message-ID: 

On Mon, Oct 31, 2011 at 10:33 AM, Antoine Pitrou wrote:
> On Mon, 31 Oct 2011 10:17:14 -0400
> Jim Jewett wrote:
>> How meaningful are the extra two slots for every function or class object?
> It's only one extra slot per function or class.

OK; I was thinking of both the object qname and its module's qname, but I agree that they are separable.

>> Have you done benchmarks like the Unicode Changes PEP has?
> I don't expect it to have any interesting impact. What benchmarks
> do you have in mind?

I would personally be satisfied with just timing the regression suite before and after, though others (Unladen Swallow?) may have more representative workloads.

My biggest concern is that any memory increase (let alone 6%) may matter if it forces the use of extra cache lines.

-jJ

From solipsis at pitrou.net Mon Oct 31 20:54:06 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 31 Oct 2011 20:54:06 +0100
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: 
References: <20111030001801.2f52ceb2@pitrou.net> <20111031153320.1e69dc41@pitrou.net>
Message-ID: <1320090846.3382.10.camel@localhost.localdomain>

> >> Have you done benchmarks like the Unicode Changes PEP has?
> >
> > I don't expect it to have any interesting impact. What benchmarks
> > do you have in mind?
>
> I would personally be satisfied with just timing the regression suite
> before and after, though others (Unladen Swallow?) may have more
> representative workloads.
>
> My biggest concern is that any memory increase (let alone 6%) may
> matter if it forces the use of extra cache lines.

I've just timed running the test suite (in non-debug mode) and the result (user CPU time) is not significantly different: 3m5s without the patch, 3m7s with it.

pybench also shows similar results.

Without patch:

    Test                            minimum  average  operation  overhead
    ----------------------------------------------------------------------
    BuiltinFunctionCalls:              59ms     60ms     0.12us   0.164ms
    ComplexPythonFunctionCalls:        60ms     62ms     0.31us   0.275ms
    PythonFunctionCalls:               55ms     56ms     0.17us   0.165ms
    PythonMethodCalls:                 74ms     74ms     0.33us   0.099ms
    ----------------------------------------------------------------------
    Totals:                           248ms    252ms

With patch:

    Test                            minimum  average  operation  overhead
    ----------------------------------------------------------------------
    BuiltinFunctionCalls:              59ms     61ms     0.12us   0.163ms
    ComplexPythonFunctionCalls:        59ms     60ms     0.30us   0.273ms
    PythonFunctionCalls:               55ms     55ms     0.17us   0.164ms
    PythonMethodCalls:                 77ms     78ms     0.35us   0.099ms
    ----------------------------------------------------------------------
    Totals:                           251ms    254ms

Regards
Antoine.

From brett at python.org Mon Oct 31 21:42:23 2011
From: brett at python.org (Brett Cannon)
Date: Mon, 31 Oct 2011 13:42:23 -0700
Subject: [Python-ideas] PEP 395 - Module Aliasing
In-Reply-To: 
References: 
Message-ID: 

I read until the solution for pickling (since I just don't care about that =) and it's all LGTM. I also read "unobvious" as "obnoxious", which I thought was fitting. =)

On Sat, Oct 29, 2011 at 23:07, Nick Coghlan wrote:
> I've updated the module aliasing PEP to be based on the terminology in
> Antoine's qualified names PEP.
>
> The full text is included below, or you can read it on python.org:
> http://www.python.org/dev/peps/pep-0395/
>
> Cheers,
> Nick.
>
> ================================
> PEP: 395
> Title: Module Aliasing
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 4-Mar-2011
> Python-Version: 3.3
> Post-History: 5-Mar-2011, 30-Oct-2011
>
>
> Abstract
> ========
>
> This PEP proposes new mechanisms that eliminate some longstanding traps for
> the unwary when dealing with Python's import system, the pickle module and
> introspection interfaces.
>
> It builds on the "Qualified Name" concept defined in PEP 3155.
>
>
> What's in a ``__name__``?
> =========================
>
> Over time, a module's ``__name__`` attribute has come to be used to handle a
> number of different tasks.
>
> The key use cases identified for this module attribute are:
>
> 1. Flagging the main module in a program, using the ``if __name__ ==
> "__main__":`` convention.
> 2. As the starting point for relative imports
> 3. To identify the location of function and class definitions within the
> running application
> 4. To identify the location of classes for serialisation into pickle objects
> which may be shared with other interpreter instances
>
>
> Traps for the Unwary
> ====================
>
> The overloading of the semantics of ``__name__`` has resulted in several
> traps for the unwary. These traps can be quite annoying in practice, as
> they are highly unobvious and can cause quite confusing behaviour. A lot of
A lot of > the time, you won't even notice them, which just makes them all the more > surprising when they do come up. > > > Importing the main module twice > ------------------------------- > > The most venerable of these traps is the issue of (effectively) importing > ``__main__`` twice. This occurs when the main module is also imported under > its real name, effectively creating two instances of the same module under > different names. > > This problem used to be significantly worse due to implicit relative > imports > from the main module, but the switch to allowing only absolute imports and > explicit relative imports means this issue is now restricted to affecting > the > main module itself. > > > Why are my relative imports broken? > ----------------------------------- > > PEP 366 defines a mechanism that allows relative imports to work correctly > when a module inside a package is executed via the ``-m`` switch. > > Unfortunately, many users still attempt to directly execute scripts inside > packages. While this no longer silently does the wrong thing by > creating duplicate copies of peer modules due to implicit relative > imports, it > now fails noisily at the first explicit relative import, even though the > interpreter actually has sufficient information available on the > filesystem to > make it work properly. > > find > to put here if I really went looking?> > > > In a bit of a pickle > -------------------- > > Something many users may not realise is that the ``pickle`` module > serialises > objects based on the ``__name__`` of the containing module. So objects > defined in ``__main__`` are pickled that way, and won't be unpickled > correctly by another python instance that only imported that module instead > of running it directly. This behaviour is the underlying reason for the > advice from many Python veterans to do as little as possible in the > ``__main__`` module in any application that involves any form of object > serialisation and persistence. > > Similarly, when creating a pseudo-module\*, pickles rely on the name of the > module where a class is actually defined, rather than the officially > documented location for that class in the module hierarchy. > > While this PEP focuses specifically on ``pickle`` as the principal > serialisation scheme in the standard library, this issue may also affect > other mechanisms that support serialisation of arbitrary class instances. > > \*For the purposes of this PEP, a "pseudo-module" is a package designed > like > the Python 3.2 ``unittest`` and ``concurrent.futures`` packages. These > packages are documented as if they were single modules, but are in fact > internally implemented as a package. This is *supposed* to be an > implementation detail that users and other implementations don't need to > worry > about, but, thanks to ``pickle`` (and serialisation in general), the > details > are exposed and effectively become part of the public API. > > > Where's the source? > ------------------- > > Some sophisticated users of the pseudo-module technique described > above recognise the problem with implementation details leaking out via the > ``pickle`` module, and choose to address it by altering ``__name__`` to > refer > to the public location for the module before defining any functions or > classes > (or else by modifying the ``__module__`` attributes of those objects after > they have been defined). 
> > This approach is effective at eliminating the leakage of information via > pickling, but comes at the cost of breaking introspection for functions and > classes (as their ``__module__`` attribute now points to the wrong place). > > > Forkless Windows > ---------------- > > To get around the lack of ``os.fork`` on Windows, the ``multiprocessing`` > module attempts to re-execute Python with the same main module, but > skipping > over any code guarded by ``if __name__ == "__main__":`` checks. It does the > best it can with the information it has, but is forced to make assumptions > that simply aren't valid whenever the main module isn't an ordinary > directly > executed script or top-level module. Packages and non-top-level modules > executed via the ``-m`` switch, as well as directly executed zipfiles or > directories, are likely to make multiprocessing on Windows do the wrong > thing > (either quietly or noisily) when spawning a new process. > > While this issue currently only affects Windows directly, it also impacts > any proposals to provide Windows-style "clean process" invocation via the > multiprocessing module on other platforms. > > > Proposed Changes > ================ > > The following changes are interrelated and make the most sense when > considered together. They collectively either completely eliminate the > traps > for the unwary noted above, or else provide straightforward mechanisms for > dealing with them. > > A rough draft of some of the concepts presented here was first posted on > the > python-ideas list [1], but they have evolved considerably since first being > discussed in that thread. > > > Fixing dual imports of the main module > -------------------------------------- > > Two simple changes are proposed to fix this problem: > > 1. In ``runpy``, modify the implementation of the ``-m`` switch handling to > install the specified module in ``sys.modules`` under both its real name > and the name ``__main__``. (Currently it is only installed as the latter) > 2. When directly executing a module, install it in ``sys.modules`` under > ``os.path.splitext(os.path.basename(__file__))[0]`` as well as under > ``__main__``. > > With the main module also stored under its "real" name, attempts to import > it > will pick it up from the ``sys.modules`` cache rather than reimporting it > under the new name. > > > Fixing direct execution inside packages > --------------------------------------- > > To fix this problem, it is proposed that an additional filesystem check be > performed before proceeding with direct execution of a ``PY_SOURCE`` or > ``PY_COMPILED`` file that has been named on the command line. > > This additional check would look for an ``__init__`` file that is a peer to > the specified file with a matching extension (either ``.py``, ``.pyc`` or > ``.pyo``, depending what was passed on the command line). > > If this check fails to find anything, direct execution proceeds as usual. > > If, however, it finds something, execution is handed over to a > helper function in the ``runpy`` module that ``runpy.run_path`` also > invokes > in the same circumstances. That function will walk back up the > directory hierarchy from the supplied path, looking for the first directory > that doesn't contain an ``__init__`` file. 
Once that directory is found, it > will be set to ``sys.path[0]``, ``sys.argv[0]`` will be set to ``-m`` and > ``runpy._run_module_as_main`` will be invoked with the appropriate module > name (as calculated based on the original filename and the directories > traversed while looking for a directory without an ``__init__`` file). > > The two current PEPs for namespace packages (PEP 382 and PEP 402) would > both > affect this part of the proposal. For PEP 382 (with its current suggestion > of > "*.pyp" package directories, this check would instead just walk up the > supplied path, looking for the first non-package directory (this would not > require any filesystem stat calls). Since PEP 402 deliberately omits > explicit > directory markers, it would need an alternative approach, based on checking > the supplied path against the contents of ``sys.path``. In both cases, the > direct execution behaviour can still be corrected. > > > Fixing pickling without breaking introspection > ---------------------------------------------- > > To fix this problem, it is proposed to add a new optional module level > attribute: ``__qname__``. This abbreviation of "qualified name" is taken > from PEP 3155, where it is used to store the naming path to a nested class > or function definition relative to the top level module. By default, > ``__qname__`` will be the same as ``__name__``, which covers the typical > case where there is a one-to-one correspondence between the documented API > and the actual module implementation. > > Functions and classes will gain a corresponding ``__qmodule__`` attribute > that refers to their module's ``__qname__``. > > Pseudo-modules that adjust ``__name__`` to point to the public namespace > will > leave ``__qname__`` untouched, so the implementation location remains > readily > accessible for introspection. > > In the main module, ``__qname__`` will automatically be set to the main > module's "real" name (as described above under the fix to prevent duplicate > imports of the main module) by the interpreter. > > At the interactive prompt, both ``__name__`` and ``__qname__`` will be set > to ``"__main__"``. > > These changes on their own will fix most pickling and serialisation > problems, > but one additional change is needed to fix the problem with serialisation > of > items in ``__main__``: as a slight adjustment to the definition process for > functions and classes, in the ``__name__ == "__main__"`` case, the module > ``__qname__`` attribute will be used to set ``__module__``. > > ``pydoc`` and ``inspect`` would also be updated appropriately to: > - use ``__qname__`` instead of ``__name__`` and ``__qmodule__`` instead of > ``__module__``where appropriate (e.g. ``inspect.getsource()`` would prefer > the qualified variants) > - report both the public names and the qualified names for affected objects > > Fixing multiprocessing on Windows > --------------------------------- > > With ``__qname__`` now available to tell ``multiprocessing`` the real > name of the main module, it should be able to simply include it in the > serialised information passed to the child process, eliminating the > need for dubious reverse engineering of the ``__file__`` attribute. > > > Reference Implementation > ======================== > > None as yet. > > > References > ========== > > .. [1] Module aliases and/or "real names" > (http://mail.python.org/pipermail/python-ideas/2011-January/008983.html) > > > Copyright > ========= > > This document has been placed in the public domain. > > .. 
> Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > End: > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Mon Oct 31 21:47:18 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 31 Oct 2011 16:47:18 -0400 Subject: [Python-ideas] Concurrent safety? In-Reply-To: <20111030201143.481fdca2@bhuda.mired.org> References: <20111030201143.481fdca2@bhuda.mired.org> Message-ID: On Sun, Oct 30, 2011 at 11:11 PM, Mike Meyer wrote: > Python, as a general rule, tries to be "safe" about things. If > something isn't obviously correct, it tends to throw runtime errors to > let you know that you need to be explicit about what you want. ?... > The one glaring exception is in concurrent programs. ... > Object semantics don't need to change very much. The existing > immutable types will work well in this environment exactly as is. I think a state bit in the object header would be more than justified, if we could define immutability. Are strings immutable? Do they become immutable after the hash is cached and the string is marked Ready? Can a subtype of strings with mutable attributes (that are not involved in comparison?) still be considered immutable? > The protection mechanism is the change to the language. I propose a > single new keyword, "locking", that acts similar to the "try" > keyword. The syntax is: > 'locking' value [',' value]*':' suite > ... The list of values are the objects that can be mutated in this > lock. You could already simulate this with a context manager ... it won't give you all the benefits (though with a lint-style tool, it might), but it will showcase the costs. (In terms of code beauty, not performance.) Personally, I think those costs would be too high, given the current memory model, but I suppose that is at least partly an empirical question. If there isn't a way to generate this list automatically, it is the equivalent of manual memory management. -jJ From arnodel at gmail.com Mon Oct 31 21:51:54 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Mon, 31 Oct 2011 20:51:54 +0000 Subject: [Python-ideas] Cofunctions - Getting away from the iterator protocol In-Reply-To: <4EAE5F83.9040305@canterbury.ac.nz> References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info> <20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us> <4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz> <4EAE5F83.9040305@canterbury.ac.nz> Message-ID: On 31 October 2011 08:42, Greg Ewing wrote: > Thinking about how to support cofunctions iterating over > generators that are themselves cofunctions and therefore > suspendable, I've come to realise that cofunctionness > and generatorness really need to be orthogonal concepts. > As well as functions and cofunctions, we need generators > and cogenerators. > > And thinking about how to allow *that* while building > on the yield-from mechanism, I found myself inventing > what amounts to a complete parallel implementation of > the iterator protocol, in which cogenerators create > coiterators having a __conext__ method that raises > CoStopIteration when finished, etc, etc... 
:-\ > > At which point I thought, well, why not forget about > using the iterator protocol as a basis altogether, and > design a new protocol specifically designed for the > purpose? > > About then I also remembered a thought I had in passing > earlier, when Nick was talking about the fact that, > when using yield-from, there is no object available that > can hold a stack of suspended generators, so you end > up traversing an ever-lengthening chain of generator > frames as the call stack gets deeper. > > However, with cofunctions there *is* a place where we > could create such an object -- it could be done by > costart(). We just need to allow it to be passed to > the places where it's needed, and with a brand-new > protocol we have the chance to do that. Is it worth noting that the implementation in Python (*) that I posted earlier on does just that? The costart generator function, when initiated, keeps track of the stack of suspended generators so that only the current one is accessed at any time. -- Arnaud (*) http://www.marooned.org.uk/~arno/cofunctions/cofunctions.py.html From jimjjewett at gmail.com Mon Oct 31 21:55:42 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 31 Oct 2011 16:55:42 -0400 Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions In-Reply-To: <1320090846.3382.10.camel@localhost.localdomain> References: <20111030001801.2f52ceb2@pitrou.net> <20111031153320.1e69dc41@pitrou.net> <1320090846.3382.10.camel@localhost.localdomain> Message-ID: On Mon, Oct 31, 2011 at 3:54 PM, Antoine Pitrou wrote: >> I would personally be satisfied with just timing the regression suite >> before and after, though others (Unladen Swallow?) may have more >> representative workloads. ... > I've just timed running the test suite (in non-debug mode) and the > result (user CPU time) is not significantly different: 3m5s without the > patch, 3m7s with it. > pybench also shows similar results. Great; I would agree that the costs appear minimal; please include these results in the PEP for posterity, whatever the decision. -jJ From mwm at mired.org Mon Oct 31 22:06:56 2011 From: mwm at mired.org (Mike Meyer) Date: Mon, 31 Oct 2011 14:06:56 -0700 Subject: [Python-ideas] Concurrent safety? In-Reply-To: References: <20111030201143.481fdca2@bhuda.mired.org> Message-ID: On Mon, Oct 31, 2011 at 1:47 PM, Jim Jewett wrote: > On Sun, Oct 30, 2011 at 11:11 PM, Mike Meyer wrote: > > immutable types will work well in this environment exactly as is. > If there isn't a way to generate this list automatically, it is the > equivalent of manual memory management. No, it isn't. The difference is in how mistakes are handled. With manual memory management, references through unassigned or freed pointers are mistakes, but may not generate an error immediately. In fact, it's possible the program will run fine and pass all your unit tests. This is the situation we have now in concurrent programming: mutating a shared object without an appropriate lock is an error that probably passes silently, and it may well pass all your tests without a problem (constructing a test to reliably trigger such a big is an interesting problem in and of itself). While you can automatically manage memory, there are other resources that still have to be managed by hand (open files spring to mind). In some cases you might be able to handle them completely automatically, in others not. 
From jimjjewett at gmail.com Mon Oct 31 21:55:42 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 31 Oct 2011 16:55:42 -0400
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: <1320090846.3382.10.camel@localhost.localdomain>
References: <20111030001801.2f52ceb2@pitrou.net> <20111031153320.1e69dc41@pitrou.net>
	<1320090846.3382.10.camel@localhost.localdomain>
Message-ID: 

On Mon, Oct 31, 2011 at 3:54 PM, Antoine Pitrou wrote:
>> I would personally be satisfied with just timing the regression suite
>> before and after, though others (Unladen Swallow?) may have more
>> representative workloads. ...
> I've just timed running the test suite (in non-debug mode) and the
> result (user CPU time) is not significantly different: 3m5s without the
> patch, 3m7s with it.
> pybench also shows similar results.

Great; I would agree that the costs appear minimal; please include
these results in the PEP for posterity, whatever the decision.

-jJ

From mwm at mired.org Mon Oct 31 22:06:56 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 31 Oct 2011 14:06:56 -0700
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: 
References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: 

On Mon, Oct 31, 2011 at 1:47 PM, Jim Jewett wrote:
> On Sun, Oct 30, 2011 at 11:11 PM, Mike Meyer wrote:
> > immutable types will work well in this environment exactly as is.
> If there isn't a way to generate this list automatically, it is the
> equivalent of manual memory management.

No, it isn't. The difference is in how mistakes are handled. With
manual memory management, references through unassigned or freed
pointers are mistakes, but may not generate an error immediately. In
fact, it's possible the program will run fine and pass all your unit
tests. This is the situation we have now in concurrent programming:
mutating a shared object without an appropriate lock is an error that
probably passes silently, and it may well pass all your tests without
a problem (constructing a test to reliably trigger such a bug is an
interesting problem in and of itself).

While you can automatically manage memory, there are other resources
that still have to be managed by hand (open files spring to mind). In
some cases you might be able to handle them completely automatically,
in others not.

In either case, Python manages things so that reading from a file that
hasn't been opened is impossible, and reading from one that has been
closed generates an immediate error. The goal here is to move from
where we are to a place similar to where handling files is, so that
failing to properly deal with the possibility of concurrent access
causes an error when it happens, not at a point distant in both time
and space.

BTW, regarding the performance issue: I figured out how to implement
this so that the run time cost is zero aside from the lock & unlock
steps.

From greg.ewing at canterbury.ac.nz Mon Oct 31 22:06:54 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Nov 2011 10:06:54 +1300
Subject: [Python-ideas] Cofunctions - Getting away from the iterator protocol
In-Reply-To: <1320083850.5984.115.camel@Gutsy>
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
	<20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
	<4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz>
	<4EAE5F83.9040305@canterbury.ac.nz> <1320083850.5984.115.camel@Gutsy>
Message-ID: <4EAF0DEE.1020500@canterbury.ac.nz>

Ron Adam wrote:

> If we put some strict requirements on the idea:
>
> 1. Only have a *SINGLE* exception type as being resumable.
>
> 2. That exception should *NEVER* occur naturally.
>
> 3. Only allow continuing after it's *EXPLICITLY RAISED* by a
> raise statement.
>
> All of the problem issues go away with those requirements in place, and
> you only have the issue of how to actually write the patch. Earlier
> discussions indicated it might not be that hard to do.

I'm not familiar with these earlier discussions. Did they go as far
as sketching a feasible implementation? It's all very well to propose
things like this, but the devil is very much in the details.

-- 
Greg

From greg.ewing at canterbury.ac.nz Mon Oct 31 22:41:38 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Nov 2011 10:41:38 +1300
Subject: [Python-ideas] Cofunctions - Getting away from the iterator protocol
In-Reply-To: 
References: <4EA8BD66.6010807@canterbury.ac.nz> <4EA94C53.2060209@pearwood.info>
	<20111027183208.GH20970@pantoffel-wg.de> <4EA9AB03.8070302@stoneleaf.us>
	<4EA9FED3.6050505@pearwood.info> <4EADBEA7.9000608@canterbury.ac.nz>
	<4EAE5F83.9040305@canterbury.ac.nz>
Message-ID: <4EAF1612.1020102@canterbury.ac.nz>

Arnaud Delobelle wrote:
> Is it worth noting that the implementation in Python (*) that I posted
> earlier on does just that? The costart generator function, when
> initiated, keeps track of the stack of suspended generators so that
> only the current one is accessed at any time.

Yes, I know that this can and has been done before using generators.
The important thing is that I'll be using a new protocol that's
separate from the generator protocol, making it possible to use both
together in a straightforward and efficient way.

-- 
Greg

From ncoghlan at gmail.com Mon Oct 31 22:52:18 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Nov 2011 07:52:18 +1000
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: 
References: <20111030001801.2f52ceb2@pitrou.net> <20111031153320.1e69dc41@pitrou.net>
Message-ID: 

On Tue, Nov 1, 2011 at 5:37 AM, Jim Jewett wrote:
> On Mon, Oct 31, 2011 at 10:33 AM, Antoine Pitrou wrote:
>> On Mon, 31 Oct 2011 10:17:14 -0400
>> Jim Jewett wrote:
>>> How meaningful are the extra two slots for every function or class object?
>>
>> It's only one extra slot per function or class.

> OK; I was thinking of both the object qname and its module's qname,
> but I agree that they are separable.

Yeah, adding __qmodule__ is part of the module aliasing PEP (395)
rather than this one. It's technically not needed for functions (you
could do f.func_globals["__qname__"] instead), but classes definitely
need it in order to kill off the corner cases that currently force
developers to choose between breaking serialisation and breaking
pickling.

The main difference between __qname__/__qmodule__ and the Unicode
changes is that there are a *lot* of small strings in any Python
application, and an additional pointer or two can make a reasonably
significant difference to their size, whereas classes and functions
are already relatively complex objects. On trunk:

>>> def f(): pass
...
>>> class C(): pass
...
>>> import sys
>>> sys.getsizeof("")
60
>>> sys.getsizeof(f)
136
>>> sys.getsizeof(C)
832

(And those numbers don't even take into account the size of
automatically created subobjects like C.__dict__, f.__dict__,
f.__code__, etc.)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Mon Oct 31 23:11:55 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Nov 2011 08:11:55 +1000
Subject: [Python-ideas] PEP 3155 - Qualified name for classes and functions
In-Reply-To: <20111031165003.7cb104d3@pitrou.net>
References: <20111030001801.2f52ceb2@pitrou.net> <20111031165003.7cb104d3@pitrou.net>
Message-ID: 

On Tue, Nov 1, 2011 at 1:50 AM, Antoine Pitrou wrote:
> On Mon, 31 Oct 2011 16:42:00 +0100
> Dirkjan Ochtman wrote:
>> On Sun, Oct 30, 2011 at 00:18, Antoine Pitrou wrote:
>> > I would like to propose the following PEP for discussion and, if
>> > possible, acceptance. I think the proposal shouldn't be too
>> > controversial (I find it quite simple and straightforward myself :-)).
>>
>> Are these names relative or fully absolute? I.e. I've had problems in
>> the past with unpickling objects that were pickled from a module that
>> was imported using a relative import. Would it be possible to define
>> the qname such that the full path to the name, starting from a
>> sys.path level down, is always used?
>
> The __qname__, by design, doesn't include any module name. To get the
> "full path", you still have to add in the __module__ attribute.
> Solving the problems with relative imports (I'm not sure what they are)
> is another problem which Nick is apparently tackling (?).

PEP 395 (Module Aliasing) is indeed intended to handle the module
naming half of the serialisation issues.

However, relative imports shouldn't cause pickling problems - the
import machinery works out the full name and stores that in __name__
(if there is any case in 3.x where it doesn't, then that's a bug).

That said, *applications* (such as Django, up to and including v1.3)
can corrupt __name__ values by placing package (or subpackage)
directories directly on sys.path. Once an application does that, it
runs the risk of getting multiple copies of the same module with
different __name__ values, and serialisation will then depend on
exactly which version of the module was used to create the instances
being serialised. Django, at least, is going to stop doing that by
default in 1.4, but it's still one of the easiest ways for an
application to get itself in trouble when it comes to serialisation.
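[A hypothetical reproduction of the failure mode Nick describes; the
proj/pkg layout and the Spam class are invented for illustration:]

    # Layout:
    #   proj/pkg/__init__.py   (empty)
    #   proj/pkg/mod.py        (defines class Spam)
    import sys
    sys.path.insert(0, "proj")       # normal: makes the 'pkg' package importable
    sys.path.insert(0, "proj/pkg")   # the bug: package directory on sys.path

    import mod        # loaded as top-level 'mod'
    import pkg.mod    # the same file, loaded again as 'pkg.mod'

    print(mod is pkg.mod)                  # False: two distinct module objects
    print(mod.__name__, pkg.mod.__name__)  # mod pkg.mod

    # An instance pickled from mod.Spam records the module name 'mod';
    # a process that only ever imported pkg.mod can't resolve that name,
    # so unpickling fails -- or silently finds a different class object.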
And, because it's an outright application bug, there isn't really a
lot the interpreter can do to correct for it (or even warn about it -
however, see http://bugs.python.org/issue13306).

You can also get a similar effect in 2.x by running a file directly
from a package when that file uses implicit relative imports. In that
case, it's really the implicit relative imports that are at fault,
though - there's a reason we killed them in Python 3. If you try the
same thing with explicit relative imports, you at least get a noisy
failure (and, with the module aliasing PEP, the plan is to just "Do
The Right Thing" instead of complaining - there's no ambiguity in what
the user is asking for, we just need to do some additional coding to
actually make it happen).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From bruce at leapyear.org Mon Oct 31 23:58:47 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Mon, 31 Oct 2011 15:58:47 -0700
Subject: [Python-ideas] Concurrent safety?
In-Reply-To: <20111030201143.481fdca2@bhuda.mired.org>
References: <20111030201143.481fdca2@bhuda.mired.org>
Message-ID: 

On Sun, Oct 30, 2011 at 8:11 PM, Mike Meyer wrote:

> Any attempt to mutate an object that isn't currently locked will raise
> an exception. Possibly ValueError, possibly a new exception class just
> for this purpose. This includes rebinding attributes of objects that
> aren't locked.

Do you mean that at any time attempting to mutate an unlocked object
throws an exception? That would mean that all of my current code is
broken.

Do you mean that, inside the control of 'locking', you can't mutate an
unlocked object? That still breaks lots of code that is safe. You can't
use itertools.cycle anymore until that's updated in a completely
unnecessary way:

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)  # throws an exception when called on a locked iterable
    while saved:
        for element in saved:
            yield element

I think the semantics of this need to be tightened up. Furthermore,
merely *reading* an object that isn't locked can cause problems. This
code is not thread-safe:

    if element in dictionary:
        return dictionary[element]

so you have to decide how much safety you want and what cost you're
willing to pay for this.

--- Bruce
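[A minimal illustration of the read race Bruce points out, together
with the two usual workarounds; the helper names and the module-level
'dictionary' are invented for the example:]

    import threading

    dictionary = {}
    dict_lock = threading.Lock()

    def lookup_racy(element):
        # Another thread may delete 'element' between the membership
        # test and the subscript, raising KeyError here.
        if element in dictionary:
            return dictionary[element]

    def lookup_single_call(element):
        # Collapsing test-and-read into one call removes the gap
        # (and a single dict.get call is atomic in CPython under the GIL).
        return dictionary.get(element)

    def lookup_locked(element):
        # Or make the whole sequence explicitly atomic with a lock,
        # which is closer to what the proposed 'locking' statement does.
        with dict_lock:
            if element in dictionary:
                return dictionary[element]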