From greg.ewing at canterbury.ac.nz  Fri Oct 3 12:37:39 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 03 Oct 2008 22:37:39 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
Message-ID: <48E5F5F3.3040000@canterbury.ac.nz>

There's been another discussion on c.l.py about the problem of

    lst = []
    for i in range(10):
        lst.append(lambda: i)
    for f in lst:
        print f()

printing 9 ten times instead of 0 to 9.

The usual response is to do

    lst.append(lambda i=i: i)

but this is not a very satisfying solution. For one thing, it's still abusing default arguments, something that lexical scoping was supposed to have removed the need for, and it won't work in some situations, such as if the function needs to take a variable number of arguments.

Also, most other languages which have lexical scoping and first-class functions don't seem to suffer from problems like this. To someone familiar with one of those languages (e.g. Scheme, Haskell) it looks as if there's something broken about the way scoping of nested functions works in Python.

However, it's not lambda that's broken, it's the for loop. In Scheme, for example, the way you normally write the equivalent of a Python for-loop results in a new scope being created for each value of the loop variable.

Previous proposals to make for-loop variables local to the loop have stumbled on the problem of existing code that relies on the loop variable keeping its value after exiting the loop, and it seems that this is regarded as a desirable feature.

So I'd like to propose something that would satisfy both requirements:

0. There is no change if the loop variable is not
   referenced by a nested function defined in the loop
   body. The vast majority of loop code will therefore
   be completely unaffected.

1. If the loop variable is referenced by such a nested
   function, a new local scope is effectively created
   to hold each successive value of the loop variable.

2.
Upon exiting the loop, the final value of the loop
   variable is copied into the surrounding scope, for
   use by code outside the loop body.

Rules 0 and 1 would also apply to list comprehensions and generator expressions.

There is a very simple and efficient way to implement this in current CPython: If the loop variable is referenced by a nested function, it will be in a cell. Instead of rebinding the existing cell, each time around the loop a new cell is created, replacing the previous cell. Immediately before exiting the loop, one more new cell is created and the final value of the loop variable copied into it.

Implementations other than CPython that aren't using cells may need to do something more traditional, such as compiling the loop body as a separate function.

I think this arrangement would allow almost all existing code to continue working, and new code to be written that takes advantage of final values of loop variables. There would be a few obscure situations where the results would be different, for example if a nested function modifies the loop variable and expects the result to be reflected in the value seen from outside the loop. But I can't imagine such cases being anything other than extremely rare.

The benefit would be that almost all code involving loops and nested functions would behave intuitively, Python would free itself from any remaining perception of having broken scope rules, and we would finally be able to consign the default-argument hack to the garbage collector of history.

--
Greg

From adde at trialcode.com  Fri Oct 3 13:17:53 2008
From: adde at trialcode.com (Andreas Nilsson)
Date: Fri, 3 Oct 2008 13:17:53 +0200
Subject: [Python-ideas] if-syntax for regular for-loops
Message-ID:

Hi.
I'm reposting this here after erroneously posting it on python-dev.
I use list comprehensions and generator expressions a lot, and lately I've found myself writing a lot of code like this:

    for i in items if i.some_field == some_value: i.do_something()

Naturally it won't work, but it seems like a pretty straight-forward extension to allow compressing simple loops to fit on one line. The alternative, in my eyes, suggests there's something more happening than a simple include-test, which makes it harder to comprehend.

    for i in items:
        if i.some_field == some_value: i.do_something()

One possibility of course is to use a generator expression, but that makes it look like there are two for loops, and it feels like a waste setting up a generator just for filtering.

    for i in (i for i in items if i.some_field == some_value): i.do_something()

Stupid idea? Am I missing some obviously better way of achieving the same result?

Thanks,
Adde

From phd at phd.pp.ru  Fri Oct 3 12:54:17 2008
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 3 Oct 2008 14:54:17 +0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E5F5F3.3040000@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <20081003105417.GB3561@phd.pp.ru>

On Fri, Oct 03, 2008 at 10:37:39PM +1200, Greg Ewing wrote:
> lst = []
> for i in range(10):
>     lst.append(lambda: i)
> for f in lst:
>     print f()
>
> printing 9 ten times instead of 0 to 9.

I lost count how many times I've stumbled upon that wart.

> There is a very simple and efficient way to implement
> this in current CPython: If the loop variable is referenced
> by a nested function, it will be in a cell. Instead of
> rebinding the existing cell, each time around the loop
> a new cell is created, replacing the previous cell.
> Immediately before exiting the loop, one more new cell
> is created and the final value of the loop variable
> copied into it.
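The shared-cell behaviour being quoted here can be observed directly in current CPython by inspecting the lambdas' __closure__ attribute (a small illustration; the helper name build is ours):

```python
def build():
    lst = []
    for i in range(3):
        lst.append(lambda: i)  # every lambda closes over the same variable i
    return lst

lst = build()
cells = [f.__closure__[0] for f in lst]
print(cells[0] is cells[1] is cells[2])  # True: one shared cell
print(cells[0].cell_contents)            # 2: the loop's final value
```

Greg's proposal amounts to making each iteration replace that single cell with a fresh one, so each lambda would see a different value.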
[skip]

> The benefit would be that almost all code involving
> loops and nested functions would behave intuitively,

+1

Oleg.
--
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From arnodel at googlemail.com  Fri Oct 3 14:00:50 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Fri, 3 Oct 2008 13:00:50 +0100
Subject: [Python-ideas] if-syntax for regular for-loops
In-Reply-To:
References:
Message-ID: <9bfc700a0810030500r4c78ebccobe90c11bcc9bdd48@mail.gmail.com>

2008/10/3 Andreas Nilsson:
> Hi.
> I'm reposting this here after erroneously posting it on python-dev.
>
> I use list comprehensions and generator expressions a lot and lately I've
> found myself writing a lot of code like this:
>
> for i in items if i.some_field == some_value: i.do_something()

I'm pretty sure this has been proposed before and that the consensus was that there was no advantage to writing:

    for i in L if cond:
        action()

instead of:

    for i in L:
        if cond:
            action()

--
Arnaud

From aahz at pythoncraft.com  Fri Oct 3 16:31:45 2008
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 3 Oct 2008 07:31:45 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E5F5F3.3040000@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <20081003143145.GD14124@panix.com>

On Fri, Oct 03, 2008, Greg Ewing wrote:
>
> Previous proposals to make for-loop variables local to the loop have
> stumbled on the problem of existing code that relies on the loop
> variable keeping its value after exiting the loop, and it seems that
> this is regarded as a desirable feature.

Very yes, I use this constantly.

> So I'd like to propose something that would satisfy
> both requirements:
>
> 0. There is no change if the loop variable is not
>    referenced by a nested function defined in the loop
>    body. The vast majority of loop code will therefore
>    be completely unaffected.
>
> 1.
If the loop variable is referenced by such a nested
>    function, a new local scope is effectively created
>    to hold each successive value of the loop variable.
>
> 2. Upon exiting the loop, the final value of the loop
>    variable is copied into the surrounding scope, for
>    use by code outside the loop body.
>
> Rules 0 and 1 would also apply to list comprehensions
> and generator expressions.
>
> There is a very simple and efficient way to implement
> this in current CPython: If the loop variable is referenced
> by a nested function, it will be in a cell. Instead of
> rebinding the existing cell, each time around the loop
> a new cell is created, replacing the previous cell.
> Immediately before exiting the loop, one more new cell
> is created and the final value of the loop variable
> copied into it.
>
> Implementations other than CPython that aren't using
> cells may need to do something more traditional, such
> as compiling the loop body as a separate function.

You'll need to make sure that exceptions don't break this. I'm not sure to what extent the current test suite covers the current behavior; I think that beefing it up is a necessary precondition to trying this approach.
--
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"...if I were on life-support, I'd rather have it run by a Gameboy than a
Windows box."  --Cliff Wells, comp.lang.python, 3/13/2002

From and-dev at doxdesk.com  Fri Oct 3 21:42:24 2008
From: and-dev at doxdesk.com (Andrew Clover)
Date: Fri, 03 Oct 2008 21:42:24 +0200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E5F5F3.3040000@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <48E675A0.9090501@doxdesk.com>

Greg Ewing wrote:

> 1. If the loop variable is referenced by such a nested
>    function, a new local scope is effectively created
>    to hold each successive value of the loop variable.

Why only the loop variable?
What about:

    >>> for i in range(3):
    ...     j = str(i)
    ...     funs.append(lambda: j)
    >>> funs[0]()
    '2'

It seems an odd sort of scope that lets rebindings inside it fall through outwards.

> Also, most other languages which have lexical scoping
> and first-class functions don't seem to suffer from
> problems like this.

That's not always because of anything to do with introducing scopes though. Some of those languages can bind the variable value early, so if you were to write the equivalent of:

    >>> i = 3
    >>> f = lambda: i
    >>> i = 4

f() would give you 3. I would love to see an early-value-binding language with similar syntax to Python, but I can't see how to get there from here.

There are other languages with lexical scope and late value binding, such as JavaScript; their for loops behave the same as Python.

> 0. There is no change if the loop variable is not
>    referenced by a nested function defined in the loop
>    body.

You mean:

    >>> i = 0
    >>> geti = lambda: i

    >>> for i in [1]:
    ...     print i is geti()
    True

    >>> for i in [1]:
    ...     dummy = lambda: i
    ...     print i is geti()
    False

This seems unconscionably spooky to me. Explicit is better than yadda yadda.

How about explicitly requesting to be given a new scope each time around the loop? This would clear up the compatibility problems.

    >>> for local i in range(3):
    ...     funs.append(lambda: i)
    ...     q = 3
    >>> funs[0]()
    0
    >>> q
    NameError

Or similar syntax as preferred.

--
And Clover
mailto:and at doxdesk.com
http://www.doxdesk.com/

From guido at python.org  Fri Oct 3 22:58:47 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Oct 2008 13:58:47 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E5F5F3.3040000@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID:

On Fri, Oct 3, 2008 at 3:37 AM, Greg Ewing wrote:
> However, it's not lambda that's broken, it's the for loop.

I disagree.
If you propose to change the for-loop to create new cells, you would also need to introduce new syntax for introducing new cells in other contexts. While it is common (especially when demonstrating the problem) to use a for loop variable in the lambda, the same problem exists when the variable referenced is constructed via other means.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ntoronto at cs.byu.edu  Sat Oct 4 00:51:27 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 03 Oct 2008 16:51:27 -0600
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <48E6A1EF.8000809@cs.byu.edu>

Guido van Rossum wrote:
> On Fri, Oct 3, 2008 at 3:37 AM, Greg Ewing wrote:
>> However, it's not lambda that's broken, it's the for loop.
>
> I disagree. If you propose to change the for-loop to create new cells,
> you would also need to introduce new syntax for introducing new cells
> in other contexts. While it is common (especially when demonstrating
> the problem) to use a for loop variable in the lambda, the same
> problem exists when the variable referenced is constructed via other
> means.

Like this?

    >>> i = 0
    >>> f = lambda: i
    >>> i = 1
    >>> f()
    1

Whether the for loop is broken really depends on what's meant by "broken". Some would say the very idea of cells is broken because *values* ought to be closed over. That's bunk, of course. ;) But I think most people are comfortable with the above example.

If you unroll the loop, current behavior makes perfect sense:

    >>> f = []
    >>> for i in [0, 1]: f.append(lambda: i)
    ...
    >>> [g() for g in f]
    [1, 1]

    >>> f = []
    >>> i = 0
    >>> f.append(lambda: i)
    >>> i = 1
    >>> f.append(lambda: i)
    >>> [g() for g in f]
    [1, 1]

But the deal with loops is that we don't usually unroll them in our heads when we reason about their behavior.
We generally consider each iteration in isolation, or as an abstract single iteration. (After all, it only appears once in the program text.) If iterations need to depend on each other, it's almost always done via something that already exists outside the loop.

The semantics that best matches that kind of reasoning is fully scoped loops. Python 4000 for scoped blocks?

Neil

From guido at python.org  Sat Oct 4 01:06:54 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 3 Oct 2008 16:06:54 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6A1EF.8000809@cs.byu.edu>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu>
Message-ID:

On Fri, Oct 3, 2008 at 3:51 PM, Neil Toronto wrote:
> Guido van Rossum wrote:
>>
>> On Fri, Oct 3, 2008 at 3:37 AM, Greg Ewing
>> wrote:
>>>
>>> However, it's not lambda that's broken, it's the for loop.
>>
>> I disagree. If you propose to change the for-loop to create new cells,
>> you would also need to introduce new syntax for introducing new cells
>> in other contexts. While it is common (especially when demonstrating
>> the problem) to use a for loop variable in the lambda, the same
>> problem exists when the variable referenced is constructed via other
>> means.
>
> Like this?
>
>     >>> i = 0
>     >>> f = lambda: i
>     >>> i = 1
>     >>> f()
>     1

No, I was thinking of examples like this:

    >>> a = []
    >>> for i in range(10):
    ...     j = i**2
    ...     a.append(lambda: j)
    ...
    >>> for f in a: print f()
    81
    81
    .
    .
    .
    81
    >>>

This leads me to reject claims that "the for-loop is broken" and in particular clamoring for fixing the for-loop without allowing us to fix this example.

> Whether the for loop is broken really depends on what's meant by "broken".
> Some would say the very idea of cells is broken because *values* ought to be
> closed over. That's bunk, of course. ;) But I think most people are
> comfortable with the above example.
>
> If you unroll the loop, current behavior makes perfect sense:
>
>     >>> f = []
>     >>> for i in [0, 1]: f.append(lambda: i)
>     ...
>     >>> [g() for g in f]
>     [1, 1]
>
>     >>> f = []
>     >>> i = 0
>     >>> f.append(lambda: i)
>     >>> i = 1
>     >>> f.append(lambda: i)
>     >>> [g() for g in f]
>     [1, 1]
>
> But the deal with loops is that we don't usually unroll them in our heads
> when we reason about their behavior. We generally consider each iteration
> in isolation, or as an abstract single iteration. (After all, it only
> appears once in the program text.) If iterations need to depend on each
> other, it's almost always done via something that already exists outside
> the loop.
>
> The semantics that best matches that kind of reasoning is fully scoped
> loops. Python 4000 for scoped blocks?

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg.ewing at canterbury.ac.nz  Sat Oct 4 01:08:13 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 11:08:13 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <20081003143145.GD14124@panix.com>
References: <48E5F5F3.3040000@canterbury.ac.nz> <20081003143145.GD14124@panix.com>
Message-ID: <48E6A5DD.4000901@canterbury.ac.nz>

Aahz wrote:

> You'll need to make sure that exceptions don't break this.

Good point. I think that can be addressed by wrapping the final value copying in a finally block. That will ensure that the final value is always decoupled from anything captured by a nested function.

--
Greg

From bruce at leapyear.org  Sat Oct 4 01:28:06 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 3 Oct 2008 16:28:06 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu>
Message-ID:

I don't think the for loop is broken.
If you want scopes other than global and function, then you should add that explicitly, maybe something like:

    a = []
    for i in range(10):
        local k:
            k = i**2
            a.append(lambda: k)
    # k is not accessible here
    a[3]()  => 9

which is roughly equivalent to:

    a = []
    for i in range(10):
        def local_k():
            k = i**2
            a.append(lambda: k)
        local_k()

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Sat Oct 4 01:29:33 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 03 Oct 2008 19:29:33 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E5F5F3.3040000@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID:

Greg Ewing wrote:

> There's been another discussion on c.l.py about
> the problem of

The behavior is a fact. Calling it a 'problem' is an opinion, one that I disagree with. So to me, your 'solution' is a solution to a non-problem.

> lst = []
> for i in range(10):
>     lst.append(lambda: i)
> for f in lst:
>     print f()
>
> printing 9 ten times instead of 0 to 9.

If one understands that 'lambda: i' is essentially 'def f(): return i' and that code bodies are only executed when called, the behavior is obvious.

> The usual response is to do
>
> lst.append(lambda i=i: i)

Here are 5 more alternatives that have the same effect (the 3rd is new since my c.l.p response):

    lst = []
    for i in range(10):
        lst.append(eval("lambda: %d" % i))

    lst = []
    def f(i): return lambda: i
    for i in range(10):
        lst.append(f(i))

    lst = []
    def f(i): lst.append(lambda: i)
    for i in range(10):
        f(i)

    def populate(n):
        n -= 1
        if n >= 0:
            return populate(n) + [lambda: n]
        else:
            return []
    lst = populate(10)

    def populate(i, n, lst):
        if i < n:
            return populate(i+1, n, lst + [lambda: i])
        else:
            return lst
    lst = populate(0, 10, [])

> but this is not a very satisfying solution.

To you.

> For one thing, it's still abusing default arguments,

Use is a fact, abuse is an opinion.
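For the record, the alternatives above do all produce 0 through 9; a quick check of the first recursive populate variant against the default-argument hack:

```python
def populate(n):
    # builds the list back-to-front; each recursive call has its
    # own n, so each lambda closes over a distinct cell
    n -= 1
    if n >= 0:
        return populate(n) + [lambda: n]
    else:
        return []

lst = populate(10)
print([f() for f in lst])   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

hack = [lambda i=i: i for i in range(10)]
print([f() for f in hack])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```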
> something that lexical scoping was supposed to have removed the
> need for,

Lexical scoping allows reading of variables that vary (get rebound). In 2.6/3.0, one can also write such variables from within the closure. This addition was anticipated from the beginning; it just took a while for Guido to decide on the syntax from among the 10-20 proposals. Modifying func.__defaults__ is much more awkward (I somehow thought it to be read-only, but in 3.0 it is not).

> and it won't work in some situations, such
> as if the function needs to take a variable number of
> arguments.

So use another method. Are not 5 others enough?

> Also, most other languages which have lexical scoping
> and first-class functions don't seem to suffer from
> problems like this. To someone familiar with one of
> those languages (e.g. Scheme, Haskell) it looks as
> if there's something broken about the way scoping of
> nested functions works in Python.

A respondent on c.l.p pointed out that Python works the same as C and Common Lisp. There are quite a few differences between Scheme/Haskell and Python. I believe neither is as widely known and used as Python.

> So I'd like to propose something that would satisfy
> both requirements:
>
> 0. There is no change if the loop variable is not
>    referenced by a nested function defined in the loop
>    body. The vast majority of loop code will therefore
>    be completely unaffected.
>
> 1. If the loop variable is referenced by such a nested
>    function, a new local scope is effectively created
>    to hold each successive value of the loop variable.

As I understand this, you are proposing that

    for i in it:
        body

be rewritten as

    def _(i):
        body
    for i in it:
        _(i)

which is my third alternative above and only takes about 15 additional keystrokes, and only those are needed by an anti-default purist. Someone who wants this semantic should write it explicitly. I believe this sort of automagic would make Python even harder to learn and understand.
One should be able to learn and use loops and simple functions before learning about nested functions and closures. If the loop is inside a function, as is typical for real code, and the loop body rebinds names outside the loop, then automagic addition of nonlocal declarations would be needed.

> 2. Upon exiting the loop, the final value of the loop
>    variable is copied into the surrounding scope, for
>    use by code outside the loop body.

My rewrite above does not require this.

[snip]

> The benefit would be that almost all code involving
> loops and nested functions would behave intuitively,

To you, perhaps, but not to all.

> Python would free itself from any remaining perception
> of having broken scope rules,

That is a very idiosyncratic perception.

> and we would finally be
> able to consign the default-argument hack to the garbage
> collector of history.

By ruining the language? Just to save a few keystrokes? No thanks. -1000

(Overblown rhetoric meets overblown rhetoric ;-)

Terry Jan Reedy

From ntoronto at cs.byu.edu  Sat Oct 4 01:48:12 2008
From: ntoronto at cs.byu.edu (Neil Toronto)
Date: Fri, 03 Oct 2008 17:48:12 -0600
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E675A0.9090501@doxdesk.com>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com>
Message-ID: <48E6AF3C.9000901@cs.byu.edu>

Andrew Clover wrote:
> Greg Ewing wrote:
>
>> 1. If the loop variable is referenced by such a nested
>>    function, a new local scope is effectively created
>>    to hold each successive value of the loop variable.
>
> Why only the loop variable? What about:
>
>     >>> for i in range(3):
>     ...     j = str(i)
>     ...     funs.append(lambda: j)
>     >>> funs[0]()
>     '2'

Spanking good point. To hack this "properly", all cell variables closed over within the loop would have to go into the per-iteration scope.

> It seems an odd sort of scope that lets rebindings inside it fall
> through outwards.
True, but a lot of Python programs would depend on this - even new ones, because of social inertia.

>> Also, most other languages which have lexical scoping
>> and first-class functions don't seem to suffer from
>> problems like this.
>
> That's not always because of anything to do with introducing scopes
> though. Some of those languages can bind the variable value early, so if
> you were to write the equivalent of:
>
>     >>> i = 3
>     >>> f = lambda: i
>     >>> i = 4
>
> f() would give you 3. I would love to see an early-value-binding
> language with similar syntax to Python, but I can't see how to get there
> from here.

Python could close over the values rather than a cell. That's *almost always* what people really care about anyway! It'd break almost no code.

Of course, to satisfy Paul Graham, there'd have to be a way to go back to cells. A "cell" keyword? Other languages make this explicit - they're often called "boxes". I wonder if "nonlocal" could go if there were a "cell" keyword... It wouldn't mean the variable carries pass-by-reference semantics, though. That would be bad.

Neil

From algorias at yahoo.com  Sat Oct 4 01:42:23 2008
From: algorias at yahoo.com (Vitor Bosshard)
Date: Fri, 3 Oct 2008 16:42:23 -0700 (PDT)
Subject: [Python-ideas] if-syntax for regular for-loops
Message-ID: <887847.72582.qm@web54409.mail.yahoo.com>

> On Fri, Oct 3, 2008 at 12:33 PM, Andreas Nilsson wrote:
> > Thanks for the pointer!
> > I don't buy the argument that newlines automagically improve readability,
> > though. You also get increased nesting suggesting something interesting is
> > happening where it isn't, and that hurts readability.
> > And as Vitor said, all other constructions of the form 'for i in items' can
> > have if-conditions attached to them; it's really not that far-fetched to
> > assume that the loop behaves the same way. Consistency good, surprises bad.
>
> Yeah, I know what you mean, and I kind of liked the idea of adding the
> if statement to the for loop (for consistency, if nothing else), but
> it's been discussed before, and plenty of people have made the same
> argument. Probably not worth it.

Besides consistency, I think the one major benefit of for..if loops is that often you don't just save a line, but also an indentation level (whenever you use the if clause solely as a filter), which actually increases readability, especially when whatever you do within the loop is relatively long, with its own indentations.

The syntax just feels natural. For example:

    for i in somelist if i.pending:

I really don't see any disadvantage here.

Vitor

PS: moving the discussion from python-dev to python-ideas.

From dillonco at comcast.net  Sat Oct 4 02:16:02 2008
From: dillonco at comcast.net (Dillon Collins)
Date: Fri, 3 Oct 2008 20:16:02 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <200810032016.02899.dillonco@comcast.net>

On Friday 03 October 2008, Terry Reedy wrote:
> def f(i): return lambda: i
> for i in range(10):
>     lst.append(f(i))

Or better yet:

    for i in range(10):
        lst.append((lambda i: lambda: i)(i))

But I'm probably not helping ;).

I'd have to say, though, that if this was a problem, it'd have to be with lambda. Most people don't expect control blocks to have their own context; they only expect functions to have them, and then a 'global' one. Nested functions are awkward because they have their own context but can fall back to the parent if need be, and people don't really see the sort of local-global aspect of closures.
Also, how awful would the 'nonlocal' boilerplate be:

    count = 0
    for i in lst:
        nonlocal count
        if i is not None:
            count += i

And, unless I'm mistaken, this would make for loops incompatible with comprehensions:

    >>> a = [(lambda: i) for i in range(10)]
    >>> b = []
    >>> for i in range(10):
    ...     b.append(lambda: i)
    >>> del i
    >>> print [j() for j in a]
    NameError: global name 'i' is not defined
    >>> print [j() for j in b]
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

That's not good.

From greg.ewing at canterbury.ac.nz  Sat Oct 4 02:12:04 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 12:12:04 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E675A0.9090501@doxdesk.com>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com>
Message-ID: <48E6B4D4.9040600@canterbury.ac.nz>

Andrew Clover wrote:

> Why only the loop variable? What about:
>
>     >>> for i in range(3):
>     ...     j = str(i)
>     ...     funs.append(lambda: j)

My next step would be to propose a 'let' statement for dealing with things like that:

    for i in range(3):
        let j = str(i):
            funs.append(lambda: j)

The 'let' statement would do the same thing for its variable as the for-loop, but in a one-off fashion.

> It seems an odd sort of scope that lets rebindings inside it fall
> through outwards.

I think this is an unavoidable consequence of not having variable declarations. Otherwise it would be impossible for an assignment inside a for-loop to perform an ordinary rebinding of some local variable outside it.

> Some of those languages can bind the variable value early, so if
> you were to write the equivalent of:
>
>     >>> i = 3
>     >>> f = lambda: i
>     >>> i = 4

Can you give me an example of an imperative language that behaves that way? I don't think I've ever seen one.

(Note that the above would be illegal in any functional (i.e. side-effect-free) language, since the syntax doesn't allow you to express rebinding an existing variable.)
> There are other languages with lexical scope and late value binding,
> such as JavaScript; their for loops behave the same as Python.

Yes, and I would say that their for-loops are broken (or perhaps I should say suboptimally designed) in the same way.

>     >>> i = 0
>     >>> geti = lambda: i
>
>     >>> for i in [1]:
>     ...     print i is geti()
>     True
>
>     >>> for i in [1]:
>     ...     dummy = lambda: i
>     ...     print i is geti()
>     False

Hm, there's something I left out of the specification. My intention was that the current value of the loop variable should *also* be seen from outside the loop while the loop is executing, so both of the above would print True. The implementation I suggested using cells has this property.

> How about explicitly requesting to
> be given a new scope each time around the loop?
>
>     >>> for local i in range(3):
>     ...     funs.append(lambda: i)

That's a possibility, too. I actually proposed something like it once before, using 'new':

    for new i in range(3):
        ...

However, it's quite an unexpected thing to have to do, so it would do nothing to reduce the frequency of questions on c.l.py about why people's lambdas are broken. It would provide a slightly more elegant answer to give them, though.

One advantage would be that it could be extended to be usable on any assignment, not just for loops, so there wouldn't be a need for a separate 'let' statement. There would be some details to sort out, e.g. if you're unpacking into multiple variables, do you use just one 'new' for them all, or do you have to put 'new' in front of each one? I.e.

    for new a, b, c in stuff:
        ...

or

    for new a, new b, new c in stuff:
        ...
--
Greg

From dillonco at comcast.net  Sat Oct 4 03:00:22 2008
From: dillonco at comcast.net (Dillon Collins)
Date: Fri, 3 Oct 2008 21:00:22 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6B4D4.9040600@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz>
Message-ID: <200810032100.22300.dillonco@comcast.net>

On Friday 03 October 2008, Greg Ewing wrote:
> My next step would be to propose a 'let' statement
> for dealing with things like that:
>
>     for i in range(3):
>         let j = str(i):
>             funs.append(lambda: j)
>
> The 'let' statement would do the same thing for its
> variable as the for-loop, but in a one-off fashion.

Well now, that seems more than a little ridiculous. If we're going to be creating a keyword that rescopes variables, why not just use that instead of messing with the for loop. Seems like that would be a generally more useful solution anyway.

Perhaps instead:

    for i in range(10):
        j = str(i)
        scope i, j:
            funs.append(lambda: (j, i))

>> Some of those languages can bind the variable value early, so if
>> you were to write the equivalent of:
>>
>>     >>> i = 3
>>     >>> f = lambda: i
>>     >>> i = 4
>
> Can you give me an example of an imperative language that
> behaves that way? I don't think I've ever seen one.
>
> (Note that the above would be illegal in any functional
> (i.e. side-effect-free) language, since the syntax doesn't
> allow you to express rebinding an existing variable.)

How about C?

    int i;
    int get(void) {return i;}

    int main()
    {
        i = 3;
        get();
        i = 4;
    }

I think the confusion is the fact that a local scope is effectively a _global_ scope in locally defined functions. It's really a rather conceptually elegant setup, though a tad confusing. Perhaps that's because we're used to seeing globals as "the global scope" rather than "all-encompassing scopes".
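The point that an enclosing scope behaves like a "global" scope for nested functions - and that calling a function is what creates a fresh binding - can be shown in a few lines of Python (a minimal sketch; the names shared, fixed and make are illustrative, echoing the factory workaround mentioned earlier in the thread):

```python
shared = []
for i in range(10):
    shared.append(lambda: i)       # every lambda reads the current (final) i

print([f() for f in shared])       # [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

def make(i):
    # calling a function creates a fresh scope, so each
    # returned lambda closes over its own i
    return lambda: i

fixed = [make(i) for i in range(10)]
print([f() for f in fixed])        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```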
From bruce at leapyear.org Sat Oct 4 03:53:17 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 3 Oct 2008 18:53:17 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <200810032100.22300.dillonco@comcast.net>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net>
Message-ID: 

On Fri, Oct 3, 2008 at 6:00 PM, Dillon Collins wrote:
> On Friday 03 October 2008, Greg Ewing wrote:
> > My next step would be to propose a 'let' statement
> > for dealing with things like that:
> >
> >    for i in range(3):
> >       let j = str(i):
> >          funs.append(lambda: j)
>
> Well now, that seems more than a little ridiculous. If we're going to be
> creating a keyword that rescopes variables, why not just use that instead
> of messing with the for loop.

I don't see how this is messing with the for loop at all.

> Seems like that would be a generally more useful
> solution anyway.
>
> Perhaps instead:
>
> for i in range(10):
>     j = str(i)
>     scope i, j:
>         funs.append(lambda: (j,i))

The difference between my or Greg's proposal and yours is that in ours
the scope of the variable is the inner block, while in your proposal
the variable has two scopes.
That is, to annotate the code:

for i in range(10):  # creates a new variable i in function scope (if it didn't already exist)
    local k:         # creates a new scope for k
        k = i**2     # creates a new variable k (every time we execute this block)
        a.append(lambda: k)  # closure references the variable in the enclosing scope (as it normally does)
print(k)  # error -- there is no k in this scope

for i in range(10):  # creates a new variable i in function scope (if it didn't already exist)
    k = i**2         # creates a new variable k in function scope (if it didn't already exist)
    scope k:         # creates a new variable k and copies the value from the outer scope
        a.append(lambda: k)  # closure references the variable in the enclosing scope (as it normally does)
print(k)  # prints 81, the last k from the loop

I don't care for the fact that there are really two k variables here
(or more properly N+1). Also, the implicit copying sort of obscures
some details. The fact that my alternative requires explicit setting
of the value of the scoped variable is a good thing. For example,
consider this:

a = [0,1,2]
for i in range(3):
    scope a:
        a.append(i)
        ... lambda: ... a ...

Yes, a is in a different scope, but references the same list so the
scope is useless. At least in my or Greg's version, since you have to
assign to the local variable, there's a clear place where you can see
the copying is missing.

a = [0,1,2]
for i in range(3):
    local b:
        b = a  # easy to see that this should be b = copy.copy(a)
        b.append(i)
        ... lambda: ... b ...

I've used my suggested syntax because I like it a bit better although
the above would also apply to Greg's suggestion. Comparing our
suggestions, I think local a,b: is a bit more clear than let a =
value, b = value: and mine does not force you to initialize the
variables. They could be combined:

'local' var [ '=' expr ] ( ',' var [ '=' expr ] )* ':'

with the ',' in the local statement taking priority over ',' in the
expressions just as it does for function calls.
--- Bruce

From greg.ewing at canterbury.ac.nz Sat Oct 4 03:48:48 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 13:48:48 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz>
Message-ID: <48E6CB80.2040207@canterbury.ac.nz>

Terry Reedy wrote:

> If one understands that 'lambda: i' is essentially 'def f(): return i'
> and that code bodies are only executed when called, the behavior is
> obvious.

It's not just about lack of understanding -- even when you fully
understand what's going on, you have to do something about it, and the
available solutions are, to me, less than satisfying.

Yes, that's an opinion. Most things in programming language design
are. I'm discussing this to find whether anyone shares my opinion.

> Here are 5 more alternatives that have the same effect:

All of which are even worse, to my eyes.

>> For one thing, it's still abusing default arguments,
>
> Use is a fact, abuse is an opinion.

The reason I call it "abuse" is that the intended use of default
argument values is as just that, a default to use if no other value is
passed in. In this case there's no intention of passing a value in, so
you're using the feature for something other than its intended
purpose. What's more, if a value does happen to get passed in, it
*breaks* what you're trying to do. So it only works in favourable
circumstances, and isn't a general solution.

>> as if the function needs to take a variable number of
>> arguments.

Usually it doesn't, but it could, if the API you're passing the
function to requires it to.

> A respondent on c.l.p pointed out that Python works the same as C and
> Common Lisp.

Yes, but... in Lisp or its derivatives, you *don't* normally write the
equivalent of a for-loop by rebinding an existing control variable.
You use a mapping function of some sort, in which the whole loop body
is a lambda, and therefore receives a new binding for each loop value.
This is the natural way to code in such languages. Most of the time
you create new bindings rather than change existing ones, and this
interacts well with nested functions.

Python's assignment rules and lack of variable declarations, on the
other hand, interact rather badly with nested functions. The most
natural way of writing code often ends up rebinding where a new
binding would be more appropriate.

I'm suggesting a change to the for-loop because it's a place where, if
it matters at all, a new binding is almost certainly what you want. To
address the rest of the cases, there would be a 'let' statement or
some such to introduce new bindings.

> As I understand this, you are proposing that
>
>    for i in it:
>        body
>
> be rewritten as
>
>    def _(i):
>        body
>    for i in it:
>        _(i)

Only conceptually. It would be unacceptably inefficient to actually
implement it that way in current CPython.

This translation isn't quite equivalent, because if the body assigns
to i, in your version the change won't be seen from outside the loop
during that iteration. In my version, it will.

> I believe this sort of automagic would make Python even harder to learn
> and understand. One should be able to learn and use loops and simple
> functions before learning about nested functions and closures.

Since loops without any nested functions would be completely
unaffected, either conceptually or implementation-wise, I don't see
how this would interfere with learning about loops before closures.
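[Editorial aside: Terry's def-based rewrite can be tried in today's Python. Running the loop body as a separate function gives each iteration its own cell, which is what makes the closures behave; `funs1`, `funs2` and `_body` are illustrative names.]

```python
funs1 = []
for i in range(3):
    funs1.append(lambda: i)

# The conceptual translation: the loop body becomes a function, so
# each call creates a fresh local i (and a fresh cell) per iteration.
funs2 = []
def _body(i):
    funs2.append(lambda: i)
for i in range(3):
    _body(i)

print([f() for f in funs1])  # [2, 2, 2] -- one shared binding
print([f() for f in funs2])  # [0, 1, 2] -- one binding per iteration
```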
-- Greg

From dillonco at comcast.net Sat Oct 4 04:54:50 2008
From: dillonco at comcast.net (Dillon Collins)
Date: Fri, 3 Oct 2008 22:54:50 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net>
Message-ID: <200810032254.50362.dillonco@comcast.net>

On Friday 03 October 2008, Bruce Leban wrote:
> On Fri, Oct 3, 2008 at 6:00 PM, Dillon Collins wrote:
> > On Friday 03 October 2008, Greg Ewing wrote:
> > > My next step would be to propose a 'let' statement
> > > for dealing with things like that:
> > >
> > >    for i in range(3):
> > >       let j = str(i):
> > >          funs.append(lambda: j)
> >
> > Well now, that seems more than a little ridiculous. If we're going to be
> > creating a keyword that rescopes variables, why not just use that instead
> > of messing with the for loop.
>
> I don't see how this is messing with the for loop at all.

The original proposal is that the for loop create a local scope for
its variable and those created within it. That's what I was referring
to; the let statement itself obviously doesn't. The point being that
if we're going to make a scoping keyword, why add implicit scoping to
for loops as well.

> > Seems like that would be a generally more useful
> > solution anyway.
> >
> > Perhaps instead:
> >
> > for i in range(10):
> >     j = str(i)
> >     scope i, j:
> >         funs.append(lambda: (j,i))
>
> The difference between my or Greg's proposal is that the scope of the
> variable is the inner block while your proposal the variable has two
> scopes. That is, to annotate the code:
>
> I don't care for the fact that there are really two k variables here (or
> more properly N+1). Also, the implicit copying sort of obscures some
> details. The fact that my alternative requires explicit setting of the
> value of the scoped variable is a good thing.
If you look at the original proposal for this thread: "Upon exiting
the loop, the final value of the loop variable is copied into the
surrounding scope"

So we are already dealing with at least one implicit copy. This is why
I think that the loop scoping suggestion should basically be dropped
in favor of some sort of scope block.

> For example, consider this:
>
> a = [0,1,2]
> for i in range(3):
>     scope a:
>         a.append(i)
>         ... lambda: ... a ...
>
> Yes, a is in a different scope, but references the same list so the scope
> is useless. At least in my or Greg's version, since you have to assign to
> the local variable, there's a clear place where you can see the copying is
> missing.

I consider that an advantage. Throughout all the rest of Python we
deal with the concepts of variable assignments and mutable objects.
Things like default arguments, global variables, and scopes
(particularly prior to nonlocal). Why should scoping be any different?

> 'local' var [ '=' expr ] ( ',' var [ '=' expr ] )* ':'

While I like it conceptually, as I said above I don't really see the
value in requiring assignment. If you did, you'd see "local a=a:" most
of the time. Then, in a few months, someone will say that it's
unnecessarily verbose/kind of confusing and post on this list a
suggestion that "local a:" be equivalent to "local a=a:"

From greg.ewing at canterbury.ac.nz Sat Oct 4 04:46:04 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 14:46:04 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6AF3C.9000901@cs.byu.edu>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6AF3C.9000901@cs.byu.edu>
Message-ID: <48E6D8EC.2040403@canterbury.ac.nz>

Neil Toronto wrote:

> Python could close over the values rather than a cell. That's *almost
> always* what people really care about anyway! It'd break almost no code.

No, it would break huge amounts of code.
Consider

def f():
    g()

def g():
    print "Gee"

Would you really like to get a NameError or UnboundLocalError on g
here?

-- Greg

From greg.ewing at canterbury.ac.nz Sat Oct 4 04:56:21 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 14:56:21 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu>
Message-ID: <48E6DB55.9010902@canterbury.ac.nz>

Guido van Rossum wrote:

> This leads me to reject claims that "the for-loop is broken" and in
> particular clamoring for fixing the for-loop without allowing us to
> fix this example.

Yeah, I never said the for-loop was the *only* thing that's broken. :-)

Perhaps "broken" is too strong a word. What I really mean is that it's
designed in a way that interacts badly with nested functions. More
generally, Python's inability to distinguish clearly between creating
new bindings and changing existing bindings interacts badly with
nested functions.

I agree that the wider problem needs to be addressed somehow, and
perhaps that should be the starting point. Depending on the solution
adopted, we can then look at whether a change to the for-loop is still
needed.

-- Greg

From greg.ewing at canterbury.ac.nz Sat Oct 4 05:00:35 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 15:00:35 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <200810032016.02899.dillonco@comcast.net>
References: <48E5F5F3.3040000@canterbury.ac.nz> <200810032016.02899.dillonco@comcast.net>
Message-ID: <48E6DC53.5050506@canterbury.ac.nz>

Dillon Collins wrote:

> And, unless I'm mistaken, this would make for loops incompatible with
> comprehensions:

No, the same thing would be done in list comprehensions as well
(except for preserving the final value, since LC variables no longer
leak).
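[Editorial aside: Greg's earlier f/g example works in current Python precisely because names in a closure are resolved at call time, so f can be defined before g exists. Closing over values would force the lookup to happen at definition time and fail. A sketch in modern print syntax:]

```python
def f():
    return g()   # g is looked up when f is *called*, not when defined

# f already exists, but g does not yet; under close-over-values
# semantics the reference above would have had to resolve here.
def g():
    return "Gee"

print(f())  # Gee
```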
-- Greg

From tjreedy at udel.edu Sat Oct 4 05:08:28 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 03 Oct 2008 23:08:28 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6CB80.2040207@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6CB80.2040207@canterbury.ac.nz>
Message-ID: 

Greg Ewing wrote:
> Terry Reedy wrote:
>
>> Here are 5 more alternatives that have the same effect:
>
> All of which are even worse, to my eyes.

If you mean worse than the simple default arg hack, I could be
persuaded. But the default arg issue is a red herring. There is no
need for what you call abuse, as I amply demonstrated, and there is no
way for you to stop others from (mis)using them without removing them.
If you mean worse than turning for-loops into an esoteric CS monster
(my view), we disagree.

My deeper objection is this. Your intended-to-be-motivating example,
similar to what others have occasionally posted, is a toy snippet that
illustrates some points of Python behavior, but which I see no use for
in real application code. Given def f(i): return i; your lst[i]() is
equivalent to f(i). So just write and use the function.

OK, to be persnickety, we need more code, of about the same length as
needed to generate lst:

def f(i):
    if not isinstance(i, int): raise TypeError("requires int i")
    if not -10 <= i < 10: raise ValueError("requires -10 <= i < 10")
    return i

So, as near as I can see, your list of identical functions with
variant closure cells simulates type and range checking. What is the
point of that?

Perhaps you have an undisclosed real use case with much more
complicated closures. It would still be true that the array index
could instead be fed to the function as an arg. If there were really a
reason not to do that, the 15-keystroke overhead would be relatively
much smaller for more complicated (and realistic) closures.
Terry Jan Reedy

From greg.ewing at canterbury.ac.nz Sat Oct 4 05:13:26 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 15:13:26 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <200810032100.22300.dillonco@comcast.net>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net>
Message-ID: <48E6DF56.4070601@canterbury.ac.nz>

Dillon Collins wrote:

>> for i in range(3):
>>    let j = str(i):
>>       funs.append(lambda: j)
>
> Well now, that seems more than a little ridiculous.

I don't think someone coming from Lisp, Scheme or Haskell would think
it ridiculous. The 'let' statement will be instantly recognisable to
them -- unlike your 'scope' statement, which will be familiar to
nobody.

It's true that with a 'let' statement or equivalent, there's no strict
need for a change to the for-loop, since you can always say

for i in range(10):
    let i = i:
        funcs.append(lambda: i)

But it's an annoying and odd-looking piece of boilerplate to have to
use, and in that respect is similar to the existing solutions of
inserting another lambda or using a default argument value.

So as a *convenience*, I'm suggesting that the for-loop be given
automatic let-like behaviour.

> How about C?
>
> int i;
> int get(void) {return i;}
>
> int main()
> {
>     i=3;
>     get();
>     i=4;
> }

No, that's not the same thing at all. You're not creating a closure
when i==3 and then calling it after i=4; you're calling the function
while i is still 3.

The claim was that there exist side-effectful languages with closures
that close over values instead of variables. C can't be one of those,
because it doesn't even have closures.
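[Editorial aside: the "inserting another lambda" workaround mentioned above is the current-Python equivalent of Greg's let i = i — an immediately-applied lambda whose parameter is a fresh binding per iteration:]

```python
funs = []
for i in range(3):
    # The outer lambda is called immediately; its parameter i is a new
    # binding each iteration, so each inner lambda gets its own cell.
    funs.append((lambda i: lambda: i)(i))

print([f() for f in funs])  # [0, 1, 2]
```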
-- Greg From greg.ewing at canterbury.ac.nz Sat Oct 4 05:25:51 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 Oct 2008 15:25:51 +1200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> Message-ID: <48E6E23F.1020602@canterbury.ac.nz> Bruce Leban wrote: > Comparing our > suggestions, I think local a,b: is a bit more clear than let a = value, > b = value: and mine does not force you to initialize the variables. I tend to favour 'let' because it has a long history of use in other languages for a very similar purpose, and will be instantly recognisable to anyone familiar with those languages. But it's really only a choice of keyword. If allowing for non-initialization is desirable, it could be permitted to say let a: ... -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 4 06:12:15 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 Oct 2008 16:12:15 +1200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6CB80.2040207@canterbury.ac.nz> Message-ID: <48E6ED1F.4040207@canterbury.ac.nz> Terry Reedy wrote: > Your intended-to-be-motivating example ... is a toy snippet that > illustrates some points of Python behavior, but which I see no use for > in real application code. My example wasn't intended to prove the existence of the problem, only refer to an already-acknowledged one. Its existence is attested by the fact that people regularly get tripped up by it. 
Here's a more realistic example:

menu_items = [
    ("New Game", 'new'),
    ("Resume", 'resume'),
    ("Quit", 'quit')
]

buttons = []
for title, action in menu_items:
    buttons.append(Button(title, lambda: getattr(game, action)()))

which gives you three buttons that all execute the 'quit' action.

-- Greg

> Given def f(i): return i; your lst[i]() is
> equivalent to f(i). So just write and use the function.
>
> OK, to be persnickety, we need more code, of about the same length as
> needed to generate lst:
>
> def f(i):
>     if not isinstance(i,int): raise TypeError("requires int i")
>     if not -10 <=i <10: raise ValueError("requires -10 <= i < 10")
>     return i
>
> So, as near as I can see, your list of identical functions with variant
> closure cells simulates type and range checking. What is the point of
> that?
>
> Perhaps you have an undisclosed real use case with much more complicated
> closures. It would still be true that the array index could instead to
> fed to the function as an arg. If there were really a reason not to do
> that, the 15 keystroke overhead would be relatively much smaller for
> more complicated (and realistic) closures.
>
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From arnodel at googlemail.com Sat Oct 4 09:25:54 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Sat, 4 Oct 2008 08:25:54 +0100
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6ED1F.4040207@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6CB80.2040207@canterbury.ac.nz> <48E6ED1F.4040207@canterbury.ac.nz>
Message-ID: 

On 4 Oct 2008, at 05:12, Greg Ewing wrote:
> Terry Reedy wrote:
>
>> Your intended-to-be-motivating example ... is a toy snippet that
>> illustrates some points of Python behavior, but which I see no use
>> for in real application code.
> My example wasn't intended to prove the existence of the
> problem, only refer to an already-acknowledged one. Its
> existence is attested by the fact that people regularly
> get tripped up by it.
>
> Here's a more realistic example:
>
> menu_items = [
>     ("New Game", 'new'),
>     ("Resume", 'resume'),
>     ("Quit", 'quit')
> ]
>
> buttons = []
> for title, action in menu_items:
>     buttons.append(Button(title, lambda: getattr(game, action)()))

Isn't this better as:

buttons.append(Button(title, getattr(game, action)))

Unless you want late binding of 'game', but that would be confusing.

> which gives you three buttons that all execute the
> 'quit' action.

-- Arnaud

From greg.ewing at canterbury.ac.nz Sat Oct 4 12:41:44 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 04 Oct 2008 22:41:44 +1200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6CB80.2040207@canterbury.ac.nz> <48E6ED1F.4040207@canterbury.ac.nz>
Message-ID: <48E74868.6080104@canterbury.ac.nz>

Arnaud Delobelle wrote:

> Isn't this better as:
>
> buttons.append(Button(title, getattr(game, action)))
>
> Unless you want late binding of 'game',

Well, you might, for example if you implement restoring a saved game
by unpickling a Game object and assigning it to game.

There's always some way to rearrange it so that it works, but the
point is that it's easy to write things like this that don't work,
unless you really keep your wits about you.
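[Editorial aside: the spellings traded in this exchange — Greg's late-binding callback, Arnaud's early-bound getattr, and the default-argument hack discussed earlier in the thread — can all be simulated with a stand-in Game class (hypothetical; the original uses a GUI Button):]

```python
class Game:
    def new(self): return "new"
    def resume(self): return "resume"
    def quit(self): return "quit"

game = Game()
menu_items = [("New Game", 'new'), ("Resume", 'resume'), ("Quit", 'quit')]

late, early, defaulted = [], [], []
for title, action in menu_items:
    # Greg's version: action is a free variable, looked up when the
    # callback fires -- so every button ends up running 'quit'.
    late.append(lambda: getattr(game, action)())
    # Arnaud's version: getattr runs now, capturing game and action.
    early.append(getattr(game, action))
    # Default-argument hack: the current value of action is stored on
    # the function object at creation time.
    defaulted.append(lambda a=action: getattr(game, a)())

print([cb() for cb in late])       # ['quit', 'quit', 'quit']
print([cb() for cb in early])      # ['new', 'resume', 'quit']
print([cb() for cb in defaulted])  # ['new', 'resume', 'quit']
```

Note that the early-bound version also captures the current `game` object, which is exactly Greg's objection about restoring a saved game.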
-- Greg

From tjreedy at udel.edu Sat Oct 4 14:18:23 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 04 Oct 2008 08:18:23 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6D8EC.2040403@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6AF3C.9000901@cs.byu.edu> <48E6D8EC.2040403@canterbury.ac.nz>
Message-ID: 

Greg Ewing wrote:
> Neil Toronto wrote:
>
>> Python could close over the values rather than a cell. That's *almost
>> always* what people really care about anyway! It'd break almost no code.
>
> No, it would break huge amounts of code. Consider

Something we seem to agree on ...

> def f():
>     g()
>
> def g():
>     print "Gee"
>
> Would you really like to get a NameError or
> UnboundLocalError on g here?

I believe all (apparently) recursive functions would have the same
problem.

From tjreedy at udel.edu Sat Oct 4 16:04:36 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 04 Oct 2008 10:04:36 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E74868.6080104@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6CB80.2040207@canterbury.ac.nz> <48E6ED1F.4040207@canterbury.ac.nz> <48E74868.6080104@canterbury.ac.nz>
Message-ID: 

Greg Ewing wrote:
> Arnaud Delobelle wrote:
>
>> Isn't this better as:
>>
>> buttons.append(Button(title, getattr(game, action)))
>>
>> Unless you want late binding of 'game',
>
> Well, you might, for example if you implement restoring a
> saved game by unpickling a Game object and assigning it
> to game.

To me, this example and comment proves my point. If you want 'action'
interpolated immediately, to not actually be a variable of each
function, while you want 'game' left to be a true free variable, then
overtly say so specifically in one way or another without magic.
In my opinion, the 'evil default arg hack' does this nicely (though
not completely), especially if a new name is used for the lambda
local.

lambda a=action: getattr(game, a)

This is only 4 extra keystrokes. If the function is left anonymous and
only called by clicking a button, there is no danger of a 'user'
accidentally calling the function with an extra arg that overrides the
default. This fact to me eliminates the main objection to the usage.

If one still does not like that for whatever reason, complete value
interpolation is nicely done by

eval("lambda: getattr(game, %r)" % action)

That is 13 extra keystrokes, including 2 spaces for easier reading.

> There's always some way to rearrange it so that it works,

Also my point.

> but the point is that it's easy to write things like this
> that don't work, unless you really keep your wits about
> you.

It is also easy to write things that don't work, unless you really
keep your wits about you, with lists and other mutables, and with
floats ;-).

Terry Jan Reedy

From tjreedy at udel.edu Sat Oct 4 16:18:41 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 04 Oct 2008 10:18:41 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6DF56.4070601@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz>
Message-ID: 

Greg Ewing wrote:
> ...your 'scope' statement, which will be familiar to nobody.
>
> It's true that with a 'let' statement or equivalent,
> there's no strict need for a change to the for-loop,
> since you can always say
>
> for i in range(10):
>     let i = i:
>         funcs.append(lambda: i)
>
> But it's an annoying and odd-looking piece of
> boilerplate to have to use, and in that respect is
> similar to the existing solutions of inserting another
> lambda or using a default argument value.
> So as a *convenience*, I'm suggesting that the
> for-loop be given automatic let-like behaviour.

Whereas I consider the proposed automaticity to be a grave
inconvenience and confusion factor. What if I *want* a closure to be
over variables, as normal, instead of values?

It seems to me that what you want is fine-grained control over
scoping, or something like that. I would prefer that you overtly
propose and argue for some syntax to do that explicitly, instead of
sneaking something implicit into for-loops.

Or perhaps a different proposal: Def statements close over variables,
as currently. Lambda expressions close over values, as some people
seem to expect them to. This expectation seems to be the crux of the
purported 'problem'. This change would also deal with Guido's example.

Terry Jan Reedy

From george.sakkis at gmail.com Sat Oct 4 16:42:20 2008
From: george.sakkis at gmail.com (George Sakkis)
Date: Sat, 4 Oct 2008 10:42:20 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz>
Message-ID: <91ad5bf80810040742g2896758dg54b6680feadca260@mail.gmail.com>

On Sat, Oct 4, 2008 at 10:18 AM, Terry Reedy wrote:

Greg Ewing wrote:
> ...your 'scope' statement, which will be familiar to nobody.
>
>> It's true that with a 'let' statement or equivalent,
>> there's no strict need for a change to the for-loop,
>> since you can always say
>>
>> for i in range(10):
>>     let i = i:
>>         funcs.append(lambda: i)
>>
>> But it's an annoying and odd-looking piece of
>> boilerplate to have to use, and in that respect is
>> similar to the existing solutions of inserting another
>> lambda or using a default argument value.
>>
>> So as a *convenience*, I'm suggesting that the
>> for-loop be given automatic let-like behaviour.
> Whereas I consider the proposed automaticity to be a grave inconvenience
> and confusion factor. What if I *want* a closure to be over variables, as
> normal, instead of values.

Why would you want that for a loop variable? Can you give an example
where this would be the desired behavior?

George

From dillonco at comcast.net Sat Oct 4 21:00:59 2008
From: dillonco at comcast.net (Dillon Collins)
Date: Sat, 4 Oct 2008 15:00:59 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6DF56.4070601@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz>
Message-ID: <200810041500.59662.dillonco@comcast.net>

On Friday 03 October 2008, Greg Ewing wrote:
> It's true that with a 'let' statement or equivalent,
> there's no strict need for a change to the for-loop,
> since you can always say
>
> for i in range(10):
>     let i = i:
>         funcs.append(lambda: i)
>
> But it's an annoying and odd-looking piece of
> boilerplate to have to use, and in that respect is
> similar to the existing solutions of inserting another
> lambda or using a default argument value.
>
> So as a *convenience*, I'm suggesting that the
> for-loop be given automatic let-like behaviour.

For what percentage of your for loops will this matter? For me, 0,
though I did have one IIRC before I rewrote it. I imagine most people
are no different. My point is that it's a lot of added complexity, and
possible bugs, gotchas, and performance penalties for an extremely
basic part of the language. If we need to have another keyword anyway,
why do we need to deal with all that just to save a line in a handful
of loops?

> The claim was that there exist side-effectful languages
> with closures that close over values instead of variables.
> C can't be one of those, because it doesn't even have
> closures.

Ah, my apologies; I seem to have totally spaced reading the relevant
part of the GP.

From leif.walsh at gmail.com Sat Oct 4 21:49:58 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Sat, 4 Oct 2008 15:49:58 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <48E6E23F.1020602@canterbury.ac.nz>
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6E23F.1020602@canterbury.ac.nz>
Message-ID: 

On Fri, Oct 3, 2008 at 11:25 PM, Greg Ewing wrote:
> I tend to favour 'let' because it has a long history
> of use in other languages for a very similar purpose,
> and will be instantly recognisable to anyone familiar
> with those languages.

If I might suggest that the legacy way is not necessarily the right
way, 'let' feels kind of like a variable declaration, while 'scope' or
something else (I think I saw 'local' somewhere, which looked okay)
would hopefully give more of a notion that there is something deeper
going on. I am mostly wary of new users discovering 'let' and using it
everywhere because they don't understand what it means.

--
Cheers,
Leif

From bruce at leapyear.org Sat Oct 4 23:53:45 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Sat, 4 Oct 2008 14:53:45 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6E23F.1020602@canterbury.ac.nz>
Message-ID: 

Let me widen the scope of the discussion. I think it's a bit strange
that the with statement doesn't have a scope. That is:

with f() as x:
    body
# x is still defined here

Is this useful? To my thought it would make more sense if it
introduced a scope.
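[Editorial aside: Bruce's observation is easy to confirm — neither the as-target nor names bound inside the block are local to the with statement; only the resource is finalized. A sketch using io.StringIO as the context manager:]

```python
from io import StringIO

with StringIO("hello") as f:
    data = f.read()

# Both names survive the block; only the object was closed on exit.
print(f.closed)  # True
print(data)      # hello
```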
--- Bruce

From greg.ewing at canterbury.ac.nz Sun Oct 5 02:38:12 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 05 Oct 2008 13:38:12 +1300
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz>
Message-ID: <48E80C74.1010603@canterbury.ac.nz>

Terry Reedy wrote:

> Or perhaps a different proposal:
> Def statements close over variables, as currently.
> Lambda expressions close over values, as some people seem to expect
> them to.

I don't think it would be a good idea to make defs and lambdas have
different behaviour, because they're not interchangeable otherwise. If
you're forced to use a def because what you want can't be done in a
lambda, you'd be forced to accept different binding behaviour as well.

Also, I'm not convinced that people think of defs and lambdas as
differing like that. I suspect that they would have the same
expectation if a def were written inside the loop instead of a lambda.

-- Greg

From greg.ewing at canterbury.ac.nz Sun Oct 5 02:40:53 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 05 Oct 2008 13:40:53 +1300
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: 
References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz>
Message-ID: <48E80D15.5010205@canterbury.ac.nz>

Terry Reedy wrote:

> What if I *want* a closure to be over variables,
> as normal, instead of values.

You can always assign the loop value to another variable inside the
loop. Or use some other approach, such as a while loop.
-- Greg From greg.ewing at canterbury.ac.nz Sun Oct 5 03:04:40 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 05 Oct 2008 14:04:40 +1300 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6E23F.1020602@canterbury.ac.nz> Message-ID: <48E812A8.2090000@canterbury.ac.nz> Leif Walsh wrote: > If I might suggest that the legacy way is not necessarily the right > way, 'let' feels kind of like a variable declaration It *is* a variable declaration. If you're saying it's not *only* a variable declaration, that's true, but it's also true in other languages that have 'let'. > I am mostly wary of new users discovering 'let' and using > it everywhere because they don't understand what it means. It wouldn't do much harm if they did -- there's no semantic or performance difference if the variable isn't referenced from a nested function. It could even be considered helpful if it makes it clear that a particular temporary variable is only used in a certain region of the code. 
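Until anything like 'let' exists, the per-binding scope it would provide can be approximated with an immediately applied function — a sketch of the boilerplate Greg alludes to:

```python
funcs = []
for i in range(10):
    # applying a function right away gives this i a cell of its own,
    # much as a hypothetical `let i = i:` block would
    (lambda i: funcs.append(lambda: i))(i)

print([f() for f in funcs])  # each closure kept its own value
```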
-- Greg From rrr at ronadam.com Sun Oct 5 15:21:50 2008 From: rrr at ronadam.com (Ron Adam) Date: Sun, 05 Oct 2008 08:21:50 -0500 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <48E6DF56.4070601@canterbury.ac.nz> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz> Message-ID: <48E8BF6E.6010602@ronadam.com> Greg Ewing wrote: > It's true that with a 'let' statement or equivalent, > there's no strict need for a change to the for-loop, > since you can always say > > for i in range(10): > let i = i: > funcs.append(lambda: i) > > But it's an annoying and odd-looking piece of > boilerplate to have to use, and in that respect is > similar to the existing solutions of inserting another > lambda or using a default argument value. Seems to me this is very similar to decorators. >>> L = [] >>> def get_f(i): ... return lambda:i ... >>> for i in range(10): ... L.append(get_f(i)) ... >>> L [<function <lambda> at 0x7f6598375320>, <function <lambda> at 0x7f6598375398>, <function <lambda> at 0x7f6598375410>, <function <lambda> at 0x7f6598375488>, <function <lambda> at 0x7f6598375500>, <function <lambda> at 0x7f6598375578>, <function <lambda> at 0x7f65983755f0>, <function <lambda> at 0x7f6598375668>, <function <lambda> at 0x7f65983756e0>, <function <lambda> at 0x7f6598375758>] >>> for f in L: ... f() ... 0 1 2 3 4 5 6 7 8 9 Ron From carl at carlsensei.com Mon Oct 6 04:43:22 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 5 Oct 2008 16:43:22 -1000 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake Message-ID: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> Why does the evil default args hack work? Because it immediately evaluates the argument and stores the result into the lambda object's default values list. So, what we need is a keyword that means "immediately evaluate the argument and store the result, instead of looking up the name again later."
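(For reference, the hack Carl describes: the default expression is evaluated immediately, at definition time, and the result stored with the function:)

```python
i = 1
f = lambda i=i: i  # the outer i is evaluated here and stored as a default
i = 2
print(f())  # 1: the stored default, not the current binding
```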
Since I can't think of a good name for this, I will use a terrible name for it, "immanentize." >>> i = 1 >>> def f(): ... return immanentize i ... >>> i = 2 >>> f() 1 >>> lst = [] >>> for i in range(10): ... lst.append(lambda: immanentize i) ... >>> [f() for f in _] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> def side_effect_maker(): ... print("blah") ... return datetime.datetime.now() ... >>> def f(): ... creation_time = immanentize side_effect_maker() ... return datetime.datetime.now() - creation_time ... "blah" >>> f() #Assume it takes you 5 seconds to type "f()" and press enter datetime.timedelta(0, 5, 0) Etc. Behind the scenes, this would look like syntactic sugar for something like creating a new variable name, evaluating the expression at initial compile time, setting the variable name to be the result of the evaluation, and replacing the immanentize expression with the variable name. Like this: >>> lst = [] >>> for i in range(10): ... random_variable_name_0..9 = i ... # ie. on the first loop around random_variable_name_0 = 0, on the second loop random_variable_name_1 = 1, etc. ... lst.append(lambda: random_variable_name_0..9) ... # ie. the first loop is evaluated as lst.append(lambda: random_variable_name_0), etc. ... >>> [f() for f in _] [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> def side_effect_maker(): ... print("blah") ... return datetime.datetime.now() ... >>> random_variable_name_number_0 = side_effect_maker() "blah" >>> def f(): ... creation_time = random_variable_name_number_0 ... return datetime.datetime.now() - creation_time I think this proposal is better than using "local:" or "scope:" because it doesn't create nested blocks when you just want to freeze out one particular value. One interesting side effect of having an immanentize keyword is that in Python 4000, we could (if we wanted to) get rid of the supposed "wart" of having x=[] as a default arg leading to unexpected results for Python newbies.
Just make it so that to get the current behavior you type >>> def f(x=immanentize []): ... x.append(1) ... return x ... >>> f() [1] >>> f() [1, 1] Whereas, without immanentize we can do what newbies expect and evaluate the defaults afresh each time. >>> def f(x=[]): ... x.append(1) ... return x ... >>> f() [1] >>> f() [1] Obviously, "immanentize" is a terrible name, and only just barely a word of English. Perhaps we could call it an "anonymous variable" instead? -- Carl From carl at carlsensei.com Mon Oct 6 04:56:04 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 5 Oct 2008 16:56:04 -1000 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> Message-ID: Oh, I thought of something. What if you try to immanentize a variable in a function that only exists in the function's scope? I think that should cause an error to be raised. So, this should work: >>> j = 5 >>> def f(): ... immanentize print(j) ... 5 >>> But this should not: >>> def f(): ... j = 5 ... immanentize print(j) ... Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 3, in f ImmanentizingNameError: name 'j' is not defined in the current scope
So I can fix this by immanentization: i = 0 def f(): i += 1 return lambda: immanentize i This will return lambda: 1, then lambda: 2, etc. right? No. It returns lambda: 0, lambda: 0, etc. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at carlsensei.com Mon Oct 6 07:16:46 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 5 Oct 2008 19:16:46 -1000 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> Message-ID: <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> On 2008/10/05, at 6:47 pm, Bruce Leban wrote: > Um, I think this is more complicated. Consider: > > i = 0 > def f(): > i += 1 > return lambda: i > I'm not sure I see what you're getting at. In Python 2.6 and 3.0rc1 this raises "UnboundLocalError: local variable 'i' referenced before assignment." If you want to do what it looks like you want to do, you have to use "nonlocal i" or "global i". > Now lambda: i is bound to i so every time I call f it will return a > function that returns the current value of i not the one at the time > f was called. So I can fix this by immanentization: > > i = 0 > def f(): > i += 1 > return lambda: immanentize i > > This will return lambda: 1, then lambda: 2, etc. right? No. It > returns lambda: 0, lambda: 0, etc. To me, it is transparently clear that this will return lambda: 0 every time. That's what immanentization does. If you want lambda: 1, etc., use "nonlocal i".
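In today's Python, the behavior Bruce is after — freezing the value of i at each call of f — is spelled with the same default-argument trick, plus global to make his counter legal; a sketch:

```python
i = 0
def f():
    global i
    i += 1
    return lambda i=i: i  # i is evaluated now, once per call of f

a = f()
b = f()
print(a(), b())  # 1 2: each lambda froze the value from its own call
```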
-- Carl From carl at carlsensei.com Mon Oct 6 08:26:59 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 5 Oct 2008 20:26:59 -1000 Subject: [Python-ideas] Nonlocal oddness In-Reply-To: <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> Message-ID: <7189B80C-FE58-40A4-8C42-4C7D01812E6C@carlsensei.com> I noticed something based on some code in the other thread, but it's not really related to it. Is there a reason for this not working: >>> i = 0 >>> def f(): ... nonlocal i ... i += 1 ... return lambda: i ... SyntaxError: no binding for nonlocal 'i' found Versus: >>> i = 0 >>> def f(): ... global i ... i += 1 ... return lambda: i ... >>> This was probably already discussed at the time "nonlocal" was invented, but is there a specific reason that "nonlocal" can't be used in cases where the next scope out is the same as "global"? I naively assumed that you could use them almost interchangeably if you were at the top level of a module. ("Almost" because "global" adds the variable to the module namespace if it's not already there, whereas "nonlocal" doesn't blithely add variables to other scopes, but just goes looking for existing ones.) Why force me to switch to "global" when I cut and paste a function out of a class or whatever and put it at the top level of my module? Is it just so that TOOOWTDI? Or was it an oversight? Or is there some other reason? 
Thanks, -- Carl From brett at python.org Mon Oct 6 08:44:40 2008 From: brett at python.org (Brett Cannon) Date: Sun, 5 Oct 2008 23:44:40 -0700 Subject: [Python-ideas] Nonlocal oddness In-Reply-To: <7189B80C-FE58-40A4-8C42-4C7D01812E6C@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> <7189B80C-FE58-40A4-8C42-4C7D01812E6C@carlsensei.com> Message-ID: On Sun, Oct 5, 2008 at 11:26 PM, Carl Johnson wrote: > I noticed something based on some code in the other thread, but it's not > really related to it. Is there a reason for this not working: > >>>> i = 0 >>>> def f(): > ... nonlocal i > ... i += 1 > ... return lambda: i > ... > SyntaxError: no binding for nonlocal 'i' found > > Versus: > >>>> i = 0 >>>> def f(): > ... global i > ... i += 1 > ... return lambda: i > ... >>>> > > This was probably already discussed at the time "nonlocal" was invented, but > is there a specific reason that "nonlocal" can't be used in cases where the > next scope out is the same as "global"? Because nonlocal is not global. The whole point of nonlocal is it falls between local and global. > I naively assumed that you could use > them almost interchangeably if you were at the top level of a module. > ("Almost" because "global" adds the variable to the module namespace if it's > not already there, whereas "nonlocal" doesn't blithely add variables to > other scopes, but just goes looking for existing ones.) Why force me to > switch to "global" when I cut and paste a function out of a class or > whatever and put it at the top level of my module? Is it just so that > TOOOWTDI? Or was it an oversight? Or is there some other reason? > You said it already: "cut and paste". Having nonlocal != global helps catch some potential bugs. Plus you just don't gain anything worth losing that assistance in finding coding errors by having nonlocal act like global.
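(For illustration, the situation nonlocal is actually for — an intermediate function scope sitting between the local scope and the module:)

```python
def outer():
    i = 0
    def f():
        nonlocal i  # binds to outer's i: neither local to f nor global
        i += 1
        return lambda: i
    return f

f = outer()
g1, g2 = f(), f()
print(g1(), g2())  # 2 2: both lambdas share outer's single cell for i
```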
-Brett From bruce at leapyear.org Mon Oct 6 09:08:46 2008 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 6 Oct 2008 00:08:46 -0700 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> Message-ID: On Sun, Oct 5, 2008 at 10:16 PM, Carl Johnson wrote: > On 2008/10/05, at 6:47 pm, Bruce Leban wrote: > > i = 0 >> def f(): >> i += 1 >> return lambda: i >> >> > I'm not sure I see what you're getting at. In Python 2.6 and 3.0rc1 this > raises "UnboundLocalError: local variable 'i' referenced before assignment." > If you want to do what it looks like you want to do, you have to use > "nonlocal i" or "global i". > Yup. I wrote that a bit too quickly. > > >> i = 0 >> def f(): >> i += 1 >> return lambda: immanentize i >> >> This will return lambda: 1, then lambda: 2, etc. right? No. It returns >> lambda: 0, lambda: 0, etc. >> > > To me, it is transparently clear that this will return lambda: 0 every > time. That's what immanentization does. If you want lambda: 1, etc., use > "nonlocal i". > > Consider this: i = 0 def f(): global i i += 1 return lambda: immanentize 1 when does immanentize get evaluated? when f is defined or when the lambda is evaluated? From what you wrote, it sounds like you think it's evaluated when f is defined. OK, so how do I get the equivalent of: def f(): global i i += 1 lambda i=i: i using immanentize? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carl at carlsensei.com Mon Oct 6 09:53:06 2008 From: carl at carlsensei.com (Carl Johnson) Date: Sun, 5 Oct 2008 21:53:06 -1000 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> Message-ID: <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> On 2008/10/05, at 9:08 pm, Bruce Leban wrote: > Consider this: > > i = 0 > def f(): > global i > i += 1 > return lambda: immanentize 1 > > when does immanentize get evaluated? when f is defined or when the > lambda is evaluated? From what you wrote, it sounds like you think > it's evaluated when f is defined. OK, so how do I get the equivalent > of: > > def f(): > global i > i += 1 > lambda i=i: i > > using immanentize? OK, now I see what you're getting at. That makes more sense. The question is how do we deal with nested scopes with an immanentize in the innermost scope. Off the top of my head, I think the most sensible way to do it is that the immanentization happens when the innermost scope it's in is turned into a real function by the next scope out. But I may need to do some more thinking about what would happen here: def f(): xs = [] for i in range(10): xs.append(lambda: immanentize i) return xs Clearly, we want this immanentize to be held off on until f is finally called. The more I think about it, the more I realize that an immanentize always needs to be inside of some kind of function declaration, whether it's a def or a lambda or (maybe) a class (but I haven't thought about the details of that yet?), and evaluated just one level "up" from that. >>> def side_effector(): ... r = random.randint(0, 10) ... print(r) ... return r ... >>> def f(): ... def g(): ... return immanentize side_effector() ... 
return (immanentize side_effector()), g 5 >>> a, b = f() 2 >>> a 5 >>> b() 2 >>> b() 2 >>> a, b = f() 3 >>> a 5 >>> b() 3 In which case, for your original example, the immanentization wouldn't happen until f is called. >>> i = 0 >>> def f(): ... global i ... i += 1 ... return lambda: immanentize i ... >>> f() <function <lambda> at 0x3a57970> >>> g = _ >>> f() <function <lambda> at 0x3a58240> >>> k = _ >>> k() 2 >>> g() 1 I think that makes sense, right? :/ Does anyone know how Lisp does this? I know that they write '(blah) to delay evaluation, but what do they do when they want to force evaluation someplace in the middle of a non-evaluated list? -- Carl From jan.kanis at phil.uu.nl Mon Oct 6 12:45:27 2008 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Mon, 6 Oct 2008 12:45:27 +0200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> Message-ID: <59a221a0810060345u4194ab20l4fbd1027d1d841d7@mail.gmail.com> Hmm, it's a nice new keyword doing interesting stuff, but I would prefer solving the scoping problems (I do consider it a problem) without introducing new keywords and such. But the right time to do the immanentization would be as late as possible, when the innermost scope is turned into a function by its parent scope, like Carl proposes. In Lisp, '(blah) can be evaluated by doing (eval '(blah)). In order to 'immanentize' the value of some variable into a quoted piece of code, you can do two things. Since code is just lists, you can use the standard list slice and dice functions.
They also have some syntax for this which I think goes like this: `(blah foo ,bar baz) This returns a list that looks like this: (blah foo <value of bar> baz) But I don't think we should go all the way to quasiquoting (as this is called) unless we want to support full macros, and that isn't going to get past Guido. Jan 2008/10/6 Carl Johnson : > > On 2008/10/05, at 9:08 pm, Bruce Leban wrote: > >> Consider this: >> >> i = 0 >> def f(): >> global i >> i += 1 >> return lambda: immanentize 1 >> >> when does immanentize get evaluated? when f is defined or when the lambda >> is evaluated? From what you wrote, it sounds like you think it's evaluated >> when f is defined. OK, so how do I get the equivalent of: >> >> def f(): >> global i >> i += 1 >> lambda i=i: i >> >> using immanentize? > > > OK, now I see what you're getting at. That makes more sense. The question is > how do we deal with nested scopes with an immanentize in the innermost > scope. Off the top of my head, I think the most sensible way to do it is > that the immanentization happens when the innermost scope it's in is turned > into a real function by the next scope out. But I may need to do some more > thinking about what would happen here: > > def f(): > xs = [] > for i in range(10): > xs.append(lambda: immanentize i) > return xs > > Clearly, we want this immanentize to be held off on until f is finally > called. The more I think about it, the more I realize that an immanentize > always needs to be inside of some kind of function declaration, whether it's > a def or a lambda or (maybe) a class (but I haven't thought about the > details of that yet?), and evaluated just one level "up" from that. > >>>> def side_effector(): > ... r = random.randint(0, 10) > ... print(r) > ... return r > ... >>>> def f(): > ... def g(): > ... return immanentize side_effector() > ...
return (immanentize side_effector()), g > 5 >>>> a, b = f() > 2 >>>> a > 5 >>>> b() > 2 >>>> b() > 2 >>>> a, b = f() > 3 >>>> a > 5 >>>> b() > 3 > > In which case, for your original example, the immanentization wouldn't > happen until f is called. > >>>> i = 0 >>>> def f(): > ... global i > ... i += 1 > ... return lambda: immanentize i > ... >>>> f() > <function <lambda> at 0x3a57970> >>>> g = _ >>>> f() > <function <lambda> at 0x3a58240> >>>> k = _ >>>> k() > 2 >>>> g() > 1 > > I think that makes sense, right? :/ > > Does anyone know how Lisp does this? I know that they write '(blah) to delay > evaluation, but what do they do when they want to force evaluation someplace > in the middle of a non-evaluated list? > > -- Carl > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From jan.kanis at phil.uu.nl Mon Oct 6 13:54:53 2008 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Mon, 6 Oct 2008 13:54:53 +0200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <48E8BF6E.6010602@ronadam.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz> <48E8BF6E.6010602@ronadam.com> Message-ID: <59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com> > There is a very simple and efficient way to implement > this in current CPython: If the loop variable is referenced > by a nested function, it will be in a cell. Instead of > rebinding the existing cell, each time around the loop > a new cell is created, replacing the previous cell. > Immediately before exiting the loop, one more new cell > is created and the final value of the loop variable > copied into it. [skip] > The benefit would be that almost all code involving > loops and nested functions would behave intuitively, +1 from me too.
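The single shared cell that Greg's proposal would replace can be observed directly in CPython through the functions' __closure__ attribute:

```python
def make():
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)
    return funcs

funcs = make()
print([f() for f in funcs])  # [2, 2, 2]: all see the final value
# every lambda closed over the very same cell object
print(funcs[0].__closure__[0] is funcs[1].__closure__[0])  # True
```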
Neil Toronto wrote: > > Spanking good point. To hack this "properly" all cell variables closed over within the loop would have to go into the per-iteration scope. Agreed. And to preserve current semantics these values would need to be copied to the new scope of every next iteration (if it's closed over). >> It seems an odd sort of scope that lets rebindings inside it fall through outwards. > > True, but a lot of Python programs would depend on this - even new ones because of social inertia. It'll unfortunately have to wait til python 4000 :). I like finally fixing this problem, which I've also run into. But I don't like the idea of introducing a new keyword to create new scopes. I think all variables that are assigned to in a loop body and closed over should be put into a cell as Greg proposes, not just the index variable. (In both for and while loops.) At the end of each loop iteration all such variables would need to be copied to the 'next' scope, which could be the parent scope or the next iteration. I'm trying really hard to think about cases that would break if this new behaviour was introduced, but I can't think of anything. The only thing that would 'break' is if you would want the standard example of lst = [] for i in range(10): lst.append(lambda: i) for f in lst: print f() to actually print 9 ten times. But if you want your bunch of lambda functions that you create in the loop body to all refer to the last value of i, why on earth would you even attempt to create a whole bunch of lambdas in this way?? (except to show that for loops are broken) Dillon Collins wrote: >> >> Upon exiting the loop, the final value of the loop variable is copied into >> the surrounding scope > > > > So we are already dealing with at least one implicit copy. This is why I > think that the loop scoping suggestion should basically be dropped in favor > of some sort of scope block. Are you suggesting doing something with the scope of all block constructs?
If the variables of the block are copied into their outer scope at the end, there really is almost no difference between, say, an if block with its own scope and the if blocks we have now. The only time it matters is if the block is executed multiple times and if a variable is closed over in it. So that's only in loops that create functions/lambdas in their bodies. If you are suggesting to only introduce a new 'scope' keyword or something like that and leave loops alone, I'd say I would prefer to fix the loops without introducing new grammar. Greg wrote: > > More generally, Python's inability to distinguish > clearly between creating new bindings and changing > existing bindings interacts badly with nested > functions. > > I agree that the wider problem needs to be addressed > somehow, and perhaps that should be the starting > point. Depending on the solution adopted, we can > then look at whether a change to the for-loop is > still needed. The (IMO) natural approach that many functional-minded languages take is to have each block-like structure create a new scope. I think your proposal with cells for loop variables would fix it for loops, if it is applied to all variables closed over in a loop. But in other block structures that aren't executed repeatedly, this problem doesn't come up. Are there any other problem cases that aren't taken care of by this proposal or the 'nonlocal' keyword? Andrew Clover wrote: > You mean: > > >>> i= 0 > >>> geti= lambda: i > > >>> for i in [1]: > ... print i is geti() > True > > >>> for i in [1]: > ... dummy= lambda: i > ... print i is geti() > False This does point at an implementation gotcha. The global i and the i in the 'current' loop scope have to be kept in sync. But actually having two variables and keeping them synchronized so they appear to be one is near impossible due to Python's dynamic nature and multithreading. So this would require the variable i to be both a global variable and a cell variable.
The module namespace dict would need to point to the cell containing i. This would require the python interpreter to check on every global lookup if it is looking at the value itself or at a cell containing the value. Perhaps this could be done with an extra tag bit on the value pointers in the module namespace dictionaries. So for some annotated code: >>> i = 0 # globals()['i'] = 0 >>> geti = lambda: i # The lambda looks up its 'i' in the global namespace >>> for i in [1]: # The global is put into a cell if it is not already: # globals()['i'] = cell(globals()['i']) # Each iteration updates the cell with the new value # from the iterator: # globals()['i'].cellvalue = next(_iterator) ... dummy = lambda: i ... i += 1 # globals()['i'].cellvalue += 1 ... print i, i == geti() ... # At the end of every loop iteration # the cell is replaced with a fresh cell: # globals()['i'] = cell( globals()['i'].cellvalue ) 2 True Whether a global variable contains a value or a cell is something the interpreter needs to check at runtime, but I think it would not have to be very expensive. From arnodel at googlemail.com Mon Oct 6 14:31:11 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 6 Oct 2008 13:31:11 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz> <48E8BF6E.6010602@ronadam.com> <59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com> Message-ID: <9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com> 2008/10/6 Jan Kanis : >> There is a very simple and efficient way to implement >> this in current CPython: If the loop variable is referenced >> by a nested function, it will be in a cell. 
Instead of >> rebinding the existing cell, each time around the loop >> a new cell is created, replacing the previous cell. >> Immediately before exiting the loop, one more new cell >> is created and the final value of the loop variable >> copied into it. > [skip] >> The benefit would be that almost all code involving >> loops and nested functions would behave intuitively, > > +1 from me too. > > > Neil Toronto wrote: >> >> Spanking good point. To hack this "properly" all cell variables closed over within the loop would have to go into the per-iteration scope. > > Agreed. And to preserve current semantics these values would need to > be copied to the new scope of every next iteration (if it's closed > over). > >>> It seems an odd sort of scope that lets rebindings inside it fall through outwards. >> >> True, but a lot of Python programs would depend on this - even new ones because of social inertia. > > It'll unfortunately have to wait til python 4000 :). > > > I like finally fixing this problem, which I've also run into. But I > don't like the idea of introducing a new keyword to create new scopes. > > I think all variable that are assigned to in a loop body and closed > over should be put into a cell as Greg proposes, not just the index > variable. (In both for and while loops.) At the end of each loop > iteration all such variables would need to be copied to the 'next' > scope, which could be the parent scope or the next iteration. > > I'm trying really hard to think about cases that would break if this > new behaviour was introduced, but I can't think about anything. The > only thing that would 'break' is if you would want the standard > example of > > lst = [] > for i in range(10): > lst.append(lambda: i) > for f in lst: > print f() > > to actually print 9 times 10. 
But if you want your bunch of lambda > functions that you create in the loop body to all refer to the last > value of i, why on earth would you even attempt to create a whole > bunch of lambdas in this way?? (except to show that for loops are > broken) How do you want this to behave? lst = [] a = [0] for i in range(10): a[0] = i lst.append(lambda: a[0]) for f in lst: print(f()) How about this? for a[0] in range(10): lst.append(lambda: a[0]) for f in lst: print(f()) ATM, I think this proposal will only make things more complicated from every point of view. -- Arnaud From tjreedy at udel.edu Mon Oct 6 17:59:34 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 06 Oct 2008 11:59:34 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> Message-ID: Carl Johnson wrote: > Why does the evil default args hack work? Because it immediately > evaluates the argument and stores the result into the lambda object's > default values list. So, what we need is a keyword that means > "immediately evaluate the argument and store the result, instead of > looking up the name again later." Since I can't think of a good name for > this, I will use a terrible name for it, "immanentize." '@' would mostly work, '$' is available, and is used for string substitution in other languages. ... > One interesting side effect of having an immanentize keyword is that in > Python 4000, we could (if we wanted to) get rid of the supposed "wart" > of having x=[] as a default arg leading to unexpected results for Python > newbies. Just make it so that to get the current behavior you type > > >>> def f(x=immanentize []): > ... x.append(1) > ... return x > ... 
> >>> f() > [1] > >>> f() > [1, 1] As you said above, immanentize means evaluate immediately, just as with default arg expressions, so immanentize can hardly mean anything extra when applied to default arg expressions. So you really need a new 'calltime' keyword for non-immediate execution. Unless, of course, you are proposing that *all* default arg expressions be, by default, repeatedly evaluated at each call (thereby breaking all code depending on define-time evaluation). > Whereas, without immanentize we can do what newbies expect and evaluate > the defaults afresh each time. > > >>> def f(x=[]): > ... x.append(1) > ... return x > ... > >>> f() > [1] > >>> f() > [1] Only some newbies expect this. The ones like me who get that default args are evaluated just *once* never post complaints to c.l.p. It gives a completely biased sample. In any case, I don't see how you expect this to actually work. What object would you have put into the default arg tuple? What about a=[] def f(x=a): x.append(1) return x Would you have this magically modified also? Suppose instead of 'a=[]' we have 'from mod import a'. What about other mutable objects? Terry Jan Reedy From rrr at ronadam.com Mon Oct 6 19:03:12 2008 From: rrr at ronadam.com (Ron Adam) Date: Mon, 06 Oct 2008 12:03:12 -0500 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> Message-ID: <48EA44D0.7040105@ronadam.com> Carl Johnson wrote: > Why does the evil default args hack work? Because it immediately > evaluates the argument and stores the result into the lambda object's > default values list. So, what we need is a keyword that means > "immediately evaluate the argument and store the result, instead of > looking up the name again later." Since I can't think of a good name for > this, I will use a terrible name for it, "immanentize."
Yep, that is terrible! ;-)

I'm really starting to see this as a non-problem after reading all
these posts. This particular problem is solved much more nicely with a
class than a function. I have two reasons for this: one is that classes
are the natural structure to use if you want to save state. The other
is that I would actually prefer that functions never save state, or
even closures. (But that would break a lot of decorators.)

Here is an example that works as expected with no magic or hidden
behaviors.

class Caller(object):
    def __init__(self, f, *args, **kwds):
        self.f = f
        self.args = args
        self.kwds = kwds
    def __call__(self):
        return self.f(*self.args, **self.kwds)

def id(i): return i

L = []
for n in range(10):
    L.append(Caller(id, n))
for f in L:
    f()

You could go one step further and make the class more specific to the
situation it is being used in by defining the __call__ method to do
what you want instead of using a stored function reference.

Something I think would be more beneficial to solve is to be able to
pack and unpack entire function arguments into one signature object
easily and automatically.

def foo(***aks):
    ...

Ron

From tjreedy at udel.edu Mon Oct 6 19:07:47 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 06 Oct 2008 13:07:47 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com>
References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com>
	<34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com>
	<7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com>
Message-ID:

Carl Johnson wrote:
>
> On 2008/10/05, at 9:08 pm, Bruce Leban wrote:
>
>> Consider this:
>>
>> i = 0
>> def f():
>>     global i
>>     i += 1
>>     return lambda: immanentize 1
>>
>> when does immanentize get evaluated? when f is defined or when the
>> lambda is evaluated? From what you wrote, it sounds like you think
>> it's evaluated when f is defined.
>> OK, so how do I get the equivalent of:
>>
>> def f():
>>     global i
>>     i += 1
>>     lambda i=i: i
>>
>> using immanentize?
>
> OK, now I see what you're getting at. That makes more sense. The
> question is how do we deal with nested scopes with an immanentize in the
> innermost scope. Off the top of my head, I think the most sensible way
> to do it is that the immanentization happens when the innermost scope
> it's in is turned into a real function by the next scope out. But I may
> need to do some more thinking about what would happen here:
>
> def f():
>     xs = []
>     for i in range(10):
>         xs.append(lambda: immanentize i)
>     return xs
>
> Clearly, we want this immanentize to be held off on until f is finally
> called.

Actually, until the lambda is executed. What you are saying is that you
want immanentized expressions to be evaluated at the same time the
default arg expressions are, which is when the def/lambda is executed
to create a function object. The only difference between them and
default args is that they could not be replaced by the function call,
which is precisely the problem with default pseudoargs. Call them
constants defined at definition time rather than at compilation time.

Do you want the constants to be named or anonymous -- which is to say,
would their values appear in locals()? If named, their expressions
could/should appear in the header with a syntax similar to but somehow
different from default args. A possibility:

...lambda $i=i: i

To implement this, the constants would have to be stored in the
function object and not in the compiled code object that is shared by
all functions generated by repeated execution of the definition.

> The more I think about it, the more I realize that an
> immanentize always needs to be inside of some kind of function
> declaration, whether it's a def or a lambda or (maybe) a class (but I
> haven't thought about the details of that yet?), and evaluated just one
> level "up" from that.
Immanentize would have to be a no-op at top-level, as is global.
Whether it should be illegal at top-level or not is a different
question. Global is legal even though it is redundant.

If the constants are named and their definitions are in the function
header, there would be no question of top-level appearance.

Terry Jan Reedy

From dillonco at comcast.net Tue Oct 7 14:07:29 2008
From: dillonco at comcast.net (Dillon Collins)
Date: Tue, 7 Oct 2008 08:07:29 -0400
Subject: [Python-ideas] Py3k invalid unicode idea
Message-ID: <200810070807.29422.dillonco@comcast.net>

I don't know an awful lot about unicode, so rather than clog up the
already lengthy threads on the 3k list, I figured I'd just toss this
idea out over here.

As I understand it, there is a fairly major issue regarding malformed
unicode data being passed to python, particularly on startup and for
filenames. This has led to much discussion and ultimately the decision
(?) to mirror a variety of OS functions to work with both bytes and
unicode. Obviously this puts us on a slope of questionable friction to
reverting back to 2.x where unicode wasn't "core".

My thought is this: When passed invalid unicode, keep it invalid. This
is largely similar to the UTF-8b ideas that were being tossed around,
but a tad different. The idea would be to maintain invalid byte
sequences by use of the private use area in the unicode spec, but be
explicit about this conversion to the program.

In particular, I'm suggesting the addition of the following (I'll use
"surrogate" to refer to the invalid bytes in a unicode string):

1) Encoding 'raw'. Force all bytes to be converted to surrogate
values. Decoding to raw converts the bytes back, and gives an error on
valid unicode characters(!). This would enable applications to
effectively interface with the system using bytes (by setting default
encoding or the like), but not require any API changes to actually
support the bytes type.

2) Error handler 'force' (or whatever).
For decoding, when an invalid byte is encountered, replace it with a
surrogate. For encoding, write the invalid byte.

2a) Decoding invalid unicode or encoding a string with surrogates
raises a UnicodeError (unless handler 'force' is specified or encoding
is 'raw').

3) String method 'valid'. 'valid()' would return False if the string
contains at least one surrogate and True otherwise. This would allow
programs to check if the string is correct, and handle it if not. This
would be of particular value when reading boot information like
sys.argv, as that would use the 'force' error handler in order to
prevent boot failure.

How the invalid bytes would be stored internally is certainly a matter
of hot debate on the 3k list. As I mentioned before, I am not
intimately familiar with unicode, so I don't have much to suggest. If
I had to implement it myself now, I'd probably use a piece of the
private use area as an escape (much like '\\' does).

Finally, there seems to be much concern about internal invalid unicode
wreaking havoc when tossed to external programs/libraries. I have to
say that I don't really see what the problem is, because whenever
python writes unicode, oughtn't it be buffered by "encode"? In that
case you'd either get an error or would be explicitly allowing invalid
strings (via 'raw' or 'force'). And besides, if python has to deal
with bad unicode, these libraries should have to too ;).

Even more finally, let me apologize in advance if I missed something
on another list or this is otherwise too redundant.
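[As a historical footnote: Python later adopted a mechanism very close to this proposal in PEP 383, the 'surrogateescape' error handler, which smuggles undecodable bytes through str as lone surrogates in U+DC80..U+DCFF and restores them byte-for-byte on encoding. A sketch of the round trip; the is_valid helper is illustrative only, not part of any API:]

```python
raw = b"valid \xff\xfe bytes"  # 0xff and 0xfe are invalid in UTF-8

# Decoding maps each invalid byte to a lone surrogate instead of
# raising UnicodeDecodeError -- the string stays "invalid but intact".
text = raw.decode("utf-8", "surrogateescape")
print("\udcff" in text)  # True

# Encoding with the same handler restores the original bytes exactly.
print(text.encode("utf-8", "surrogateescape") == raw)  # True

# An illustrative version of the proposed valid() check:
def is_valid(s):
    return not any("\udc80" <= ch <= "\udcff" for ch in s)

print(is_valid(text), is_valid("plain"))  # False True
```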
From jimjjewett at gmail.com Wed Oct 8 02:39:21 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 7 Oct 2008 20:39:21 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> Message-ID: On 2008/10/6 Terry Reedy wrote: > you want immanetized expressions to be evaluated > as the same time the default arg expressions are, > which is when the def/lambda is executed to create > a function object. The only difference between them > and default args is that they could not be replaced > by the function call, which is precise the problem with > default pseudoargs. Call them constants defined > at definition time rather than as compilation time. I have some vague memory that these might be called &aux variables in Common Lisp. > Do you want the constants to be names or anonymous -- > which is to say, would there values appear in locals()? > If named, their expressions could/should appear in the > header with a syntax similar to but somehow > different from default args. A possibility: > ...lambda $i=i: i Would giving them a __name and marking them as keyword-only meet the goal? (As best I can tell, it does for methods, but not for top-level functions, because of the way __mangling works.) 
-jJ

From bruce at leapyear.org Wed Oct 8 03:57:23 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 7 Oct 2008 18:57:23 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com>
	<34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com>
	<7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com>
Message-ID:

&aux is described here:
http://www.lispworks.com/documentation/HyperSpec/Body/03_dae.htm

this says it's equivalent to let* which is described here:
http://www.lispworks.com/documentation/HyperSpec/Body/s_let_l.htm

In short, &aux and let* evaluate each expression, assign it to a
variable and then evaluate the next, etc. Default values in python are
evaluated like Lisp's let, not let*.

--- Bruce

On Tue, Oct 7, 2008 at 5:39 PM, Jim Jewett wrote:

> On 2008/10/6 Terry Reedy wrote:
>
> > you want immanetized expressions to be evaluated
> > as the same time the default arg expressions are,
> > which is when the def/lambda is executed to create
> > a function object. The only difference between them
> > and default args is that they could not be replaced
> > by the function call, which is precise the problem with
> > default pseudoargs. Call them constants defined
> > at definition time rather than as compilation time.
>
> I have some vague memory that these might be called &aux variables in
> Common Lisp.
>
> > Do you want the constants to be names or anonymous --
> > which is to say, would there values appear in locals()?
> > If named, their expressions could/should appear in the
> > header with a syntax similar to but somehow
> > different from default args. A possibility:
> > ...lambda $i=i: i
>
> Would giving them a __name and marking them as keyword-only meet the
> goal? (As best I can tell, it does for methods, but not for top-level
> functions, because of the way __mangling works.)
> > -jJ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan.kanis at phil.uu.nl Wed Oct 8 15:08:25 2008 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Wed, 8 Oct 2008 15:08:25 +0200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E675A0.9090501@doxdesk.com> <48E6B4D4.9040600@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz> <48E8BF6E.6010602@ronadam.com> <59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com> <9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com> Message-ID: <59a221a0810080608g506764f5k1f4c43adede9449a@mail.gmail.com> On 06/10/2008, Arnaud Delobelle wrote: > > How do you want this to behave? > > lst = [] > a = [0] > > for i in range(10): > > a[0] = i > lst.append(lambda: a[0]) > for f in lst: > print(f()) > > How about this? > > for a[0] in range(10): > lst.append(lambda: a[0]) > for f in lst: > print(f()) In my previous post I argued not to distinguish between the loop index variable and other 'loop-scope' variables, so both pieces of code should show the same behaviour. In what that behaviour is, there are two possibilities: 1) Treat arbitrary location specifications like a[0] the same as normal local/global variables. That would imply that both examples print 0, 1, ... 9. It would also imply that lists and other arbitrary python data structures need to be able to hold cells (over which the python interpreter transparently indirects, so they would not be directly visible to python code). 
So at the end of the second loop, the python memory layout looks like
this:

        +---+
a ----> |[0]| -----------\
        +---+            |
lst                      |
 |                       v
 v
+---+
|[9]| ---> lambda ---> cell ---> 9
|[ ]|
|[8]| ---> lambda ---> cell ---> 8
|[ ]|
|[7]| ---> lambda ---> cell ---> 7
|[ ]|
...

(Note in case ascii art breaks: a[0] is pointing to the cell that
holds the 9, the same one lst[9]'s lambda is pointing at.)

But I think this would be a better solution:

2) Treat location specifiers other than local and global variables
(variables you can write down without using dots or square brackets)
the same as they are treated today. In that case, both loops would
print 9 ten times. I would want to argue this is the better approach,
because when you write down a[0] you are specifically thinking of a
and a[0] in terms of objects, while when you use a local variable you
just need some place to store a temporary result, and not be bothered
with it any more than absolutely necessary.

However, I don't think anyone in their right mind would write loops
using a[0] as the index variable. Have you ever had the need to do
this?

> ATM, I think this proposal will only make things more complicated from
> every point of view.

Partly true; more (variants of) proposals make decisions harder. But
you should read my proposal as merely extending Greg's original
proposal of using cells in loop indexes to all variables that are used
in loops, obviating the need for a separate scope/let construct. So I
think it has a place in this discussion.
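[The cells discussed above are observable from Python today through a function's __closure__ attribute. A small sketch showing that, under current semantics, every lambda created in a loop shares one cell holding the final value, which is exactly what Greg's per-iteration-cell proposal would change:]

```python
def make():
    fs = []
    for i in range(3):
        fs.append(lambda: i)   # all three close over the same cell
    return fs

fs = make()
cells = [f.__closure__[0] for f in fs]
print(all(c is cells[0] for c in cells))  # True: one shared cell
print(cells[0].cell_contents)             # 2, the loop's final value
print([f() for f in fs])                  # [2, 2, 2]
```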
Jan

From bruce at leapyear.org Wed Oct 8 17:47:23 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Wed, 8 Oct 2008 08:47:23 -0700
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To: <59a221a0810080608g506764f5k1f4c43adede9449a@mail.gmail.com>
References: <48E5F5F3.3040000@canterbury.ac.nz>
	<48E675A0.9090501@doxdesk.com>
	<48E6B4D4.9040600@canterbury.ac.nz>
	<200810032100.22300.dillonco@comcast.net>
	<48E6DF56.4070601@canterbury.ac.nz>
	<48E8BF6E.6010602@ronadam.com>
	<59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com>
	<9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com>
	<59a221a0810080608g506764f5k1f4c43adede9449a@mail.gmail.com>
Message-ID:

On Wed, Oct 8, 2008 at 6:08 AM, Jan Kanis wrote:

> On 06/10/2008, Arnaud Delobelle wrote:
> > for a[0] in range(10):
> >     lst.append(lambda: a[0])
> > for f in lst:
> >     print(f())
>
> But I think this would be a better solution:
> 2) Treat location specifiers other than local and global variables
> (variables you can write down without using dots or square brackets)
> the same as they are treated today. In that case, both loops would
> print ten times 9. I would want to argue this is the better approach,
> because when you write down a[0] you are specifically thinking of a
> and a[0] in terms of objects, while when you use a local variable you
> just need some place to store a temporary result, and not be bothered
> with it any more than absolutely necessary.

I don't think this is better. Not that I'm proposing we add macros to
the language but if we did then

macro find(a, x, i):
    for i in range(len(x)):
        if x[i]:
            return lambda: x[i]

would operate differently for find(a, x, i) and find(a, x, i[0]). I
think that's a bad idea.

--- Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jimjjewett at gmail.com Wed Oct 8 18:33:36 2008
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 8 Oct 2008 12:33:36 -0400
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com>
	<34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com>
	<7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com>
Message-ID:

On Tue, Oct 7, 2008 at 9:57 PM, Bruce Leban wrote:
> &aux is described here:
> http://www.lispworks.com/documentation/HyperSpec/Body/03_dae.htm
>
> this says it's equivalent to let* which is described here:
> http://www.lispworks.com/documentation/HyperSpec/Body/s_let_l.htm
> In short &aux and let* evaluates each expression, assigns it to a variable
> and then evaluates the next, etc. Default values in python are evaluated in
> like Lisp's let, not let*.

How do you figure? As nearly as I can tell, the only difference is
that let* is evaluated in order (left-to-right) instead of in
parallel.

Python parameters are also evaluated left-to-right, as nearly as I can
tell.

>>> def f():
...     global var
...     var="from f"
>>> var="base"
>>> def h(): print "var is", var

>>> def g(a=f(), b=h()): print b

var is from f

This shows that the side effect of binding a was already present when
b was bound.
-jJ From bruce at leapyear.org Wed Oct 8 19:16:15 2008 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 8 Oct 2008 10:16:15 -0700 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> Message-ID: Lisp's let: evaluate, evaluate, evaluate, assign, assign, assign Lisp's let*: evaluate, assign, evaluate, assign, evaluate, assign In Python as in Lisp, the side effects of the first evaluation are visible to the second but in Python and Lisp's let (vs. let*) the assignment of the first variable doesn't happen until after all the expressions have been evaluated. >>> def f(i=0, j=i+1): pass Traceback (most recent call last): File "", line 1, in def f(i=0, j=i+1): NameError: name 'i' is not defined On Wed, Oct 8, 2008 at 9:33 AM, Jim Jewett wrote: > On Tue, Oct 7, 2008 at 9:57 PM, Bruce Leban wrote: > > &aux is described here: > > http://www.lispworks.com/documentation/HyperSpec/Body/03_dae.htm > > > > this says it's equivalent to let* which is described here: > > http://www.lispworks.com/documentation/HyperSpec/Body/s_let_l.htm > > In short &aux and let* evaluates each expression, assigns it to a > variable > > and then evaluates the next, etc. Default values in python are evaluated > in > > like Lisp's let, not let*. > > > How do you figure? As nearly as I can tell, the only difference is > that let* is evaluated in order (left-to-right) instead of in > parallel. > > Python parameters are also evaluated left-to-right, as nearly as I can > tell. > > >>> def f(): > global var > var="from f" > >>> var="base" > >>> def h(): print "var is", var > > >>> def g(a=f(), b=h()): print b > > var is from f > > This shows that the side effect of binding a was already present when > b was bound. 
> > -jJ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu Oct 9 01:30:11 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 09 Oct 2008 12:30:11 +1300 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <2F8D19E9-3A84-4B9A-A956-995A0E60F765@carlsensei.com> <34C0BC0E-DD4E-4142-8AA8-53B2CF23D6FA@carlsensei.com> <7E7E63F2-8CD1-4716-9E69-CBE44BBB36A7@carlsensei.com> Message-ID: <48ED4283.8040308@canterbury.ac.nz> Jim Jewett wrote: > How do you figure? As nearly as I can tell, the only difference is > that let* is evaluated in order (left-to-right) instead of in > parallel. That's not quite right. The difference between let and let* is that each expression in a let* is evaluated in a scope that includes the names bound by the previous expressions. In other words, it's equivalent to a nested sequence of let statements. > >>> def g(a=f(), b=h()): print b > > var is from f > > This shows that the side effect of binding a was already present when > b was bound. It's not about side effects, it's about name visibility. If the binding of function arguments worked like let*, then you would be able to refer to the name a in the expression being assigned to b, i.e. this would be legal: def g(a = f(), b = h(a)): ... But it's not -- you would get a NameError on a if you tried that. In that respect it's like let, not let*. -- Greg From stephen at xemacs.org Thu Oct 9 11:11:42 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 09 Oct 2008 18:11:42 +0900 Subject: [Python-ideas] Py3k invalid unicode idea In-Reply-To: <200810070807.29422.dillonco@comcast.net> References: <200810070807.29422.dillonco@comcast.net> Message-ID: <874p3myvxt.fsf@xemacs.org> Dillon Collins writes: > My thought is this: When passed invalid unicode, keep it invalid. 
> This is largely similar to the UTF-8b ideas that were being tossed
> around, but a tad different. The idea would be to maintain invalid
> byte sequences by use of the private use area in the unicode spec,
> but be explicit about this conversion to the program.

FWIW this has been suggested several times. There are two problems
with it. The first is collisions with other private space users.
Unlikely, but it will (eventually) happen. When it does, it will very
likely result in data corruption, because those systems will assume
that these are valid private sequences, not reencoded pollution.

One way to avoid this would be to have a configuration option
(runtime) for where to start the private encoding space. It still
won't avoid it completely because some applications don't know or care
what is valid, and therefore might pass you anything. But mostly it
should win because people who are assigning semantics to private space
characters will need to know what characters they're using, and the
others will rarely be able to detect corruption anyway.

The second problem is that internal data will leak to other libraries.
There is no reason to suppose that those libraries will see reencoded
forms, because the whole point of using the C interface is to work
directly on the Python data structures. At that point, you do have
corruption, because the original invalid data has been converted to
valid Unicode.

You write "And besides, if python has to deal with bad unicode, these
libraries should have to too ;)." Which is precisely right. The bug
in your idea is that they never will! Your proposal robs them of the
chance to do it in their own way by buffering it through Python's
cleanup process.

AFAICS, there are two sane paths. Accept (and document!) that you
will pass corruption to other parts of the system, and munge bad octet
sequences into some kind of valid Unicode (eg, U+FFFD REPLACEMENT
CHARACTER, or a PUA encoding of raw bytes).
Second, signal an error on encountering an invalid octet sequence, and
leave it up to the user program to handle it.

From jan.kanis at phil.uu.nl Thu Oct 9 12:06:14 2008
From: jan.kanis at phil.uu.nl (Jan Kanis)
Date: Thu, 9 Oct 2008 12:06:14 +0200
Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake
In-Reply-To:
References: <48E5F5F3.3040000@canterbury.ac.nz>
	<48E675A0.9090501@doxdesk.com>
	<48E6B4D4.9040600@canterbury.ac.nz>
	<200810032100.22300.dillonco@comcast.net>
	<48E6DF56.4070601@canterbury.ac.nz>
	<48E8BF6E.6010602@ronadam.com>
	<59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com>
	<9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com>
	<59a221a0810080608g506764f5k1f4c43adede9449a@mail.gmail.com>
Message-ID: <59a221a0810090306l33e145bbicf66328615e8d7ad@mail.gmail.com>

On 08/10/2008, Bruce Leban wrote:
> I don't think this is better. Not that I'm proposing we add macros to the
> language but if we did then
>
> macro find(a, x, i):
>     for i in range(len(x)):
>         if x[i]:
>             return lambda: x[i]
>
> would operate differently for find(a, x, i) and find(a, x, i[0]). I think
> that's a bad idea.

No, it wouldn't. The loop stops iterating as soon as one lambda is
created, so the whole situation which started this thread, with
multiple lambdas being created in a loop and each next iteration
'overwriting' the variable the previous lambdas refer to, does not
occur. Your example would behave the same under current semantics,
under my proposal, and under my proposal with complex variables also
being 'celled' (the variant I don't prefer).

Also, as you say, Python doesn't actually have macros and isn't going
to get them any time soon. Saying that my proposal doesn't interact
nicely with feature X which python doesn't have and is not going to
get in the foreseeable future is not really a convincing argument.
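[For what it's worth, Arnaud's a[0] loop is legal Python today, and under current semantics it shows exactly the behaviour described above for non-simple targets -- every lambda reads the final value. A quick check:]

```python
a = [None]
lst = []
for a[0] in range(3):          # a[0] is a valid assignment target
    lst.append(lambda: a[0])
print([f() for f in lst])      # [2, 2, 2]: all lambdas see the final a[0]
```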
You are more likely to convince me if you can show an actual example in which a[0] or an other non-simple variable is used as index variable. Like I said before, I don't think such a situation occurs, and if it does you'd want the behaviour I defended previously. Jan From dillonco at comcast.net Thu Oct 9 18:31:40 2008 From: dillonco at comcast.net (Dillon Collins) Date: Thu, 9 Oct 2008 12:31:40 -0400 Subject: [Python-ideas] Py3k invalid unicode idea In-Reply-To: <874p3myvxt.fsf@xemacs.org> References: <200810070807.29422.dillonco@comcast.net> <874p3myvxt.fsf@xemacs.org> Message-ID: <200810091231.40229.dillonco@comcast.net> On Thursday 09 October 2008, Stephen J. Turnbull wrote: > Dillon Collins writes: > > My thought is this: When passed invalid unicode, keep it invalid. > > This is largely similar to the UTF-8b ideas that were being tossed > > around, but a tad different. The idea would be to maintain invalid > > byte sequences by use of the private use area in the unicode spec, > > but be explicit about this conversion to the program. > > FWIW this has been suggested several times. There are two problems > with it. The first is collisions with other private space users. > Unlikely, but it will (eventually) happen. When it does, it will very > likely result in data corruption, because those systems will assume > that these are valid private sequences, not reencoded pollution. I certainly do agree that assuming PUA codes will never be used is foolish. As I suggested later on, you could use a PUA code as a sort of backslash escape to preserve both the valid PUA code and the invalid data. > > The second problem is that internal data will leak to other libraries. > There is no reason to suppose that those libraries will see reencoded > forms, because the whole point of using the C interface is to work > directly on the Python data structures. At that point, you do have > corruption, because the original invalid data has been converted to > valid Unicode. 
Yes and no... While C programs generally work on Python's internal
data structures, they shouldn't (and basically don't) do so through
direct access to the PyObject struct. Instead, they use the various
macros/functions provided.

With my proposal, unicode strings would have a valid flag, and one
could easily modify PyUnicode_AS_UNICODE to return NULL (and a
UnicodeError) if the string is invalid, and make a
PyUnicode_AS_RAWUNICODE that wouldn't. Or you could simply document
that libraries need to call a PyUnicode_ISVALID to determine whether
or not the string contains invalid codes.

>
> You write "And besides, if python has to deal with bad unicode, these
> libraries should have to too ;)." Which is precisely right. The bug
> in your idea is that they never will! Your proposal robs them of the
> chance to do it in their own way by buffering it through Python's
> cleanup process.

What makes this problem nasty all around is that your proposal has the
same bug: by not allowing invalid unicode internally, the only way to
allow programs to handle the (possible) problem is to always accept
bytes, which would put us most of the way back to a 2.x world. At
least with my proposal, libraries can opt to deal with the bad, albeit
slightly sanitized, unicode if they want to.

>
> AFAICS, there are two sane paths. Accept (and document!) that you
> will pass corruption to other parts of the system, and munge bad octet
> sequences into some kind of valid Unicode (eg, U+FFFD REPLACEMENT
> CHARACTER, or a PUA encoding of raw bytes). Second, signal an error
> on encountering an invalid octet sequence, and leave it up to the user
> program to handle it.

Well, the bulk of my proposal was to allow the program to choose which
one of those (3!) options they want. I fail to see the benefit of
forcing their hands, especially since the API already supports this
through the use of both codecs and error handlers. It just seems like
a more elegant solution to me.
From tjreedy at udel.edu Thu Oct 9 20:42:48 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 09 Oct 2008 14:42:48 -0400 Subject: [Python-ideas] Py3k invalid unicode idea In-Reply-To: <200810091231.40229.dillonco@comcast.net> References: <200810070807.29422.dillonco@comcast.net> <874p3myvxt.fsf@xemacs.org> <200810091231.40229.dillonco@comcast.net> Message-ID: Dillon Collins wrote: > On Thursday 09 October 2008, Stephen J. Turnbull wrote: > With my proposal, unicode strings would have a valid flag, and one could > easily modify PyUnicode_AS_UNICODE to return NULL (and a UnicodeError) if the > string is invalid, and make a PyUnicode_AS_RAWUNICODE that wouldn't. Or you > could simply document that libraries need to call a PyUnicode_ISVALID to > determine whether or not the string contains invalid codes. Would it make any sense to have a Filename subclass or a BadFilename subclass or more generally a PUAcode subclass for any unicode generated by the core that uses the PUA? In either of the latter cases, any app using the PUA would/could know not to mix PUAcode instances into their own unicode. And leakage without re-encoding into bytes could be inhibited more easily. tjr From stephen at xemacs.org Fri Oct 10 04:04:56 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 10 Oct 2008 11:04:56 +0900 Subject: [Python-ideas] Py3k invalid unicode idea In-Reply-To: References: <200810070807.29422.dillonco@comcast.net> <874p3myvxt.fsf@xemacs.org> <200810091231.40229.dillonco@comcast.net> Message-ID: <87r66pxl13.fsf@xemacs.org> Terry Reedy writes: > Would it make any sense to have a Filename subclass Sure ... but as glyph has been explaining, that should really be generalized to a representation of filesystem paths, and that is an unsolved problem at the present time. > or a BadFilename subclass or more generally a PUAcode subclass for > any unicode generated by the core that uses the PUA? 
IMO, this doesn't work, because either they act like strings when you access them naively, and you end up with corrupt Unicode loose in the wider system, or they throw exceptions if they aren't first placated with appropriate rituals -- but those exceptions and rituals are what we wanted to avoid handling in the first place! As I see it, this is not a technical problem! It's a social problem. It's not that we have no good ways to handle Unicode exceptions: we have several. It's not that we have no perfect and universally acceptable way to handle them: as usual, that's way too much to ask. The problem that we face is that there are several good ways to handle the decoding exceptions, and different users/applications will *strongly* prefer different ones. In particular, if we provide one and make it default, almost all programmers will do the easy thing, so that most code will not be prepared for applications that do want a different handler. Code that does expect to receive uncorrupt Unicode will have to do extra checking, etc. I think that the best thing to do would be to improve the exception handling in codecs and library functions like os.listdir() -- IMO the problem that one exception can cost you an entire listing is a bug in os.listdir(). From stephen at xemacs.org Fri Oct 10 04:25:22 2008 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 10 Oct 2008 11:25:22 +0900 Subject: [Python-ideas] Py3k invalid unicode idea In-Reply-To: <200810091231.40229.dillonco@comcast.net> References: <200810070807.29422.dillonco@comcast.net> <874p3myvxt.fsf@xemacs.org> <200810091231.40229.dillonco@comcast.net> Message-ID: <87prm9xk31.fsf@xemacs.org> Dillon Collins writes: > It just seems like a more elegant solution to me. Like most problems rooted in POSIX (more fairly, in implementation dependencies), it's not a problem amenable to elegant solutions. The data is conceptually a human-readable string, and therefore should be representable in Unicode. 
In practice, it normally is, but there are no guarantees. IMO, in this kind of situation it is best to raise the exception as early as possible to preserve the context in which it occurred. I have no objection to providing a library of handlers to implement the strategies you propose, just to making any of them a Python core default. From jan.kanis at phil.uu.nl Fri Oct 10 12:14:10 2008 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Fri, 10 Oct 2008 12:14:10 +0200 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: References: <48E5F5F3.3040000@canterbury.ac.nz> <200810032100.22300.dillonco@comcast.net> <48E6DF56.4070601@canterbury.ac.nz> <48E8BF6E.6010602@ronadam.com> <59a221a0810060454w7a74e832n831410b6fd1b7774@mail.gmail.com> <9bfc700a0810060531v13bbd576k35f54b892e5527f1@mail.gmail.com> <59a221a0810080608g506764f5k1f4c43adede9449a@mail.gmail.com> <59a221a0810090306l33e145bbicf66328615e8d7ad@mail.gmail.com> Message-ID: <59a221a0810100314i27adfb56v6aefea299e98912e@mail.gmail.com> On 10/10/2008, Bruce Leban wrote: > > > On Thu, Oct 9, 2008 at 3:06 AM, Jan Kanis wrote: > > > > On 08/10/2008, Bruce Leban wrote: > > > I don't think this is better. Not that I'm proposing we add macros to > the > > > language but if we did then > > > > > > macro find(a, x, i): > > > for i in range(len(x)): > > > if x[i]: > > > return lambda: x[i] > > > > > > would operate differently for find(a, x, i) and find(a, x, i[0]). I > think > > > that's a bad idea. > > > OK, I screwed up that example. > > macro find(a, x, i): > for i in range(len(x)): > if x[i]: > z.append(lambda: x[i]) Yes, that one would behave differently... so IF macros were added (which they won't be), and IF someone would find it useful to pass in the index variable through the macro, instead of just using a regular local variable and not exposing it (for which I don't see any use), this macro would have to be SLIGHTLY REWRITTEN.
(And as there are no macros yet, there's no backward compatibility problem). New version: macro find(a, x, i): for j in range(len(x)): i = j # but why would you want that? if x[j]: z.append(lambda: x[j]) Jan From arnodel at googlemail.com Fri Oct 10 21:51:37 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 10 Oct 2008 20:51:37 +0100 Subject: [Python-ideas] if-syntax for regular for-loops In-Reply-To: <887847.72582.qm@web54409.mail.yahoo.com> References: <887847.72582.qm@web54409.mail.yahoo.com> Message-ID: <16349E71-021E-4409-AB0B-75D40502F9B7@googlemail.com> On 4 Oct 2008, at 00:42, Vitor Bosshard wrote: > >> On Fri, Oct 3, 2008 at 12:33 PM, Andreas Nilsson wrote: >>> Thanks for the pointer! >>> I don't buy the argument that newlines automagically improves >>> readability >>> though. You also get increased nesting suggesting something >>> interesting is >>> happening where it isn't and that hurts readability. >>> And as Vitor said, all other constructions of the form 'for i in >>> items' can >>> have if-conditions attached to them, it's really not that far- >>> fetched to >>> assume that the loop behaves the same way. Consistency good, >>> surprises bad. >> >> Yeah, I know what you mean, and I kind of liked the idea of adding >> the >> if statement to the for loop (for consistency, if nothing else), but >> it's been discussed before, and plenty of people have made the same >> argument. Probably not worth it. > > Besides consistency I think the one major benefit to for..if > loops is that often you don't just save a line, but also an > indentation > level (whenever you use the if clause solely as a filter), which > actually increases readability, specially when whatever you do within > the loop is > relatively long, with its own indentations. > > > The syntax just feels natural. For example: > > for i in somelist if i.pending: > > > > I really don't see any disadvantage here. 
There is also a problem with parsing as: * the following is correct: for i in L1 if cond else L2: do_something() * Python's grammar is LL(1) -- Arnaud From joe at strout.net Fri Oct 10 21:56:12 2008 From: joe at strout.net (Joe Strout) Date: Fri, 10 Oct 2008 13:56:12 -0600 Subject: [Python-ideas] Is this the right forum for a proposed new stdlib function? Message-ID: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> I'd like to float the idea of an addition of a new function to the Template class (in the string module). I'm a bit of a newbie around here, though, and uncertain of the proper procedure. Is this the right mailing list for that, or should I be using python-list instead? Thanks, - Joe From brett at python.org Fri Oct 10 23:51:39 2008 From: brett at python.org (Brett Cannon) Date: Fri, 10 Oct 2008 14:51:39 -0700 Subject: [Python-ideas] Is this the right forum for a proposed new stdlib function? In-Reply-To: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> Message-ID: On Fri, Oct 10, 2008 at 12:56 PM, Joe Strout wrote: > I'd like to float the idea of an addition of a new function to the Template > class (in the string module). I'm a bit of a newbie around here, though, > and uncertain of the proper procedure. Is this the right mailing list for > that, or should I be using python-list instead? > You can try here or there. Both places will provide feedback, although this list happens to tend to focus on language stuff. -Brett From joe at strout.net Sat Oct 11 00:50:07 2008 From: joe at strout.net (Joe Strout) Date: Fri, 10 Oct 2008 16:50:07 -0600 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> Message-ID: <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> OK, here's my pitch -- in rough PEP form, even though this may be small enough to not merit a PEP. 
I'd really like your feedback on this idea. Abstract I propose we add a new function on the string.Template [1] class, match(), to perform the approximate inverse of the existing substitute() function. That is, it attempts to match an input string against a template, and if successful, returns a dictionary providing the matched text for each template field. Rationale PEP 292 [2] added a simplified string substitution feature, allowing users to easily substitute text for named fields in a template string. The inverse operation is also useful: given a template and an input string, one wishes to find the text in the input string matching the fields in the template. However, Python currently has no easy way to do it. While this named matching operation can be accomplished using RegEx, the constructions required are somewhat complex and error prone. It can also be done using third-party modules such as pyparse, but again the setup requires more code and is not obvious to programmers inexperienced with that module. In addition, the Template class already has all the data needed to perform this operation, so it is a natural fit to simply add a new method on this class to perform a match, in addition to the existing method to perform a substitution. Proposal Proposed is the addition of one new attribute, and one new function, on the existing Template class, as follows: 1. 'greedy' is a new attribute that determines whether the field matches should be done in a greedy manner, equivalent to regex pattern '(.*)'; or in a non-greedy manner, equivalent to '(.*?)'. This attribute defaults to False. 2. 'match' is a new function which accepts one parameter, an input string. If the input string can be matched to the template pattern (respecting the 'greedy' flag), then match returns a dictionary, where each field in the pattern maps to the corresponding part of the input string. If the input string cannot be matched to the template pattern, then match returns None.
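For concreteness, the proposed behaviour can be prototyped on top of re today. The following is a rough sketch in modern Python 3 syntax, not the proposed implementation: template_match is a stand-in name for the proposed method, it leans on Template's documented pattern attribute, and it assumes a well-formed template whose field names are unique.

```python
import re
from string import Template

def template_match(tmpl, text, greedy=False):
    # Translate the template into a regex: literal text is escaped,
    # each $field or ${field} placeholder becomes a named group, and
    # '$$' becomes a literal delimiter character.
    body = '.*' if greedy else '.*?'
    parts, last = [], 0
    for m in tmpl.pattern.finditer(tmpl.template):
        parts.append(re.escape(tmpl.template[last:m.start()]))
        name = m.group('named') or m.group('braced')
        if name:
            parts.append('(?P<%s>%s)' % (name, body))
        elif m.group('escaped') is not None:
            parts.append(re.escape(tmpl.delimiter))
        last = m.end()
    parts.append(re.escape(tmpl.template[last:]))
    matched = re.fullmatch(''.join(parts), text)
    return matched.groupdict() if matched else None
```

With s = Template('$name was born in ${country}'), template_match(s, 'Guido was born in the Netherlands') gives {'name': 'Guido', 'country': 'the Netherlands'}, and a non-matching input gives None.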
Examples: >>> from string import Template >>> s = Template('$name was born in ${country}') >>> print s.match('Guido was born in the Netherlands') {'name':'Guido', 'country':'the Netherlands'} >>> print s.match('Spam was born as a canned ham') None Note that when the match is successful, the resulting dictionary could be passed through Template.substitute to reconstitute the original input string. Conversely, any string created by Template.substitute could be matched by Template.match (though in unusual cases, the resulting dictionary might not exactly match the original, e.g. if the string could be matched in multiple ways). Thus, .match and .substitute are inverse operations. References [1] Template Strings http://www.python.org/doc/2.5.2/lib/node40.html [2] PEP 292: Simpler String Substitutions http://www.python.org/dev/peps/pep-0292/ From jan.kanis at phil.uu.nl Sat Oct 11 02:09:33 2008 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Sat, 11 Oct 2008 02:09:33 +0200 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> Message-ID: <59a221a0810101709tb77d702h8cbf12a5b713370c@mail.gmail.com> > Proposed is the addition of one new attribute, and one new function, on the > existing Template class, as follows: > > 1. 'greedy' is a new attribute that determines whether the field matches > should be done in a greedy manner, equivalent to regex pattern '(.*)'; or in > a non-greedy manner, equivalent to '(.*?)'. This attribute defaults to > false. I don't want to commit to whether this should be in the stdlib or not, but on the design part I'd say it would be better to make 'greedy' an optional parameter to the match method. 
It's only used in one method and not really a property of the template, but of the matching: >>> print s.match('Guido was born in the Netherlands', greedy=True) Jan From joe at strout.net Sat Oct 11 03:23:20 2008 From: joe at strout.net (Joe Strout) Date: Fri, 10 Oct 2008 19:23:20 -0600 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <59a221a0810101709tb77d702h8cbf12a5b713370c@mail.gmail.com> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> <59a221a0810101709tb77d702h8cbf12a5b713370c@mail.gmail.com> Message-ID: <5D1CC24E-87CE-4E31-A29B-BAEBAB5603E4@strout.net> On Oct 10, 2008, at 6:09 PM, Jan Kanis wrote: > I don't want to commit to whether this should be in the stdlib or not, > but on the design part I'd say it would be better to make 'greedy' an > optional parameter to the match method. It's only used in one method > and not really a property of the template, but of the matching: > >>>> print s.match('Guido was born in the Netherlands', greedy=True) That's an excellent point. I had it as a property because of the way my prototype implementation worked, but now that I look at it again, there's no good reason it has to work that way. (We probably want to cache the compiled regex object under the hood, but we can store which greediness option was used, or even cache them both -- all internal implementation detail that the user shouldn't care about.) 
Thanks, - Joe From jared.grubb at gmail.com Sat Oct 11 03:38:41 2008 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 10 Oct 2008 20:38:41 -0500 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> Message-ID: <5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com> On 10 Oct 2008, at 17:50, Joe Strout wrote: > >>> from string import Template > >>> s = Template('$name was born in ${country}') > >>> print s.match('Guido was born in the Netherlands') > {'name':'Guido', 'country':'the Netherlands'} > >>> print s.match('Spam was born as a canned ham') > None You can basically do this using regular expressions; it's not as "pretty", but it does exactly the same thing: >>> s = re.compile('(?P<name>.*) was born in (?P<country>.*)') >>> print s.match('Guido was born in the Netherlands').groupdict() {'country': 'the Netherlands', 'name': 'Guido'} >>> print s.match('Spam was born as a canned ham') None And if you wanted non-greedy, just replace ".*" with ".*?". Jared From joe at strout.net Sat Oct 11 03:49:02 2008 From: joe at strout.net (Joe Strout) Date: Fri, 10 Oct 2008 19:49:02 -0600 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> <5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com> Message-ID: <7517F8D5-B172-4FAE-9D8E-4C930BD057D8@strout.net> On Oct 10, 2008, at 7:38 PM, Jared Grubb wrote: > You can basically do this using regular expressions; it's not as > "pretty", but it does exactly the same thing That's true; and you can use % to do the same thing as Template.substitute (though it's not as pretty).
The point is, we already have a very pretty Template class that does this operation in one direction; it ought to do it in the other direction too. The fact that it doesn't is surprising to a newbie (speaking from personal experience there), and the equivalent 're' incantation is considerably harder to come up with -- even more so than using % is harder than Template.substitute. Best, - Joe From george.sakkis at gmail.com Sat Oct 11 04:47:41 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Fri, 10 Oct 2008 22:47:41 -0400 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> Message-ID: <91ad5bf80810101947i4c2469fcyb030b5349dc2fca9@mail.gmail.com> On Fri, Oct 10, 2008 at 6:50 PM, Joe Strout wrote: Proposed is the addition of one new attribute, and one new function, on the > existing Template class, as follows: > > 1. 'greedy' is a new attribute that determines whether the field matches > should be done in a greedy manner, equivalent to regex pattern '(.*)'; or in > a non-greedy manner, equivalent to '(.*?)'. This attribute defaults to > false. > > 2. 'match' is a new function which accepts one parameter, an input string. > If the input string can be matched to the template pattern (respecting the > 'greedy' flag), then match returns a dictionary, where each field in the > pattern maps to the corresponding part of the input string. If the input > string cannot be matched to the template pattern, then match returns NOne. > > Examples: > > >>> from string import Template > >>> s = Template('$name was born in ${country}') > >>> print s.match('Guido was born in the Netherlands') > {'name':'Guido', 'country':'the Netherlands'} One objection is that the hardcoded pattern '(.*)' or '(.*?)' doesn't seem generally applicable; e.g. 
the example above would break if the sentence continued "..in the Netherlands at 19XX". It might be possible to generalize it (e.g. by passing keyword arguments with the expected regexp for each template variable, such as "name=r'.*', country=r'\w+'") but in this case you might as well use an explicit regexp. Regardless, you'll need more examples and more compelling use cases before this has any chance to move forward. You may start from the stdlib and see how much things could be simplified if Template.match was available. George -------------- next part -------------- An HTML attachment was scrubbed... URL: From dbpokorny at gmail.com Sun Oct 12 20:38:35 2008 From: dbpokorny at gmail.com (dbpokorny at gmail.com) Date: Sun, 12 Oct 2008 11:38:35 -0700 (PDT) Subject: [Python-ideas] if-syntax for regular for-loops In-Reply-To: <16349E71-021E-4409-AB0B-75D40502F9B7@googlemail.com> References: <887847.72582.qm@web54409.mail.yahoo.com> <16349E71-021E-4409-AB0B-75D40502F9B7@googlemail.com> Message-ID: <19ea21ff-3164-443c-86a2-c9ae8c11c494@a19g2000pra.googlegroups.com> On Oct 10, 12:51 pm, Arnaud Delobelle wrote: > There is also a problem with parsing as: > > * the following is correct: > > for i in L1 if cond else L2: > do_something() > > * Python's grammar is LL(1) Not really. for_stmt: 'for' exprlist 'in' testlist ['if' or_test] ':' suite ['else' ':' suite] FWIW, I think this would be an entirely different discussion if someone did all of the work first (code+docs+test code), and THEN went to python-dev along with two or three examples from the standard library where the new syntax would *add* clarity. Side note: not to discourage anyone, but I happened to look at Lib/pydoc.py and found several examples where, due to line-length, forcing the new syntax at every available opportunity would either substantially reduce clarity or be more or less pointless. Here are a couple cases so you can make up your own mind.
#1 original: for ext in ('.py', '.pyc', '.pyo'): if os.path.isfile(os.path.join(path, '__init__' + ext)): return True vs. new syntax: for ext in ('.py', '.pyc', '.pyo') if os.path.isfile(os.path.join(path, '__init__' + ext)): return True vs. existing alternative: if any( os.path.isfile(os.path.join(path,'__init__' + ext)) for ext in ('.py','.pyc','.pyo')): return True #2 original: for dict in dicts: if name in dict: return '<a href="%s">%s</a>' % (dict[name], name) new syntax: for dict in dicts if name in dict: return '<a href="%s">%s</a>' % (dict[name], name) Cheers, David From arnodel at googlemail.com Sun Oct 12 21:21:33 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sun, 12 Oct 2008 20:21:33 +0100 Subject: [Python-ideas] if-syntax for regular for-loops In-Reply-To: <19ea21ff-3164-443c-86a2-c9ae8c11c494@a19g2000pra.googlegroups.com> References: <887847.72582.qm@web54409.mail.yahoo.com> <16349E71-021E-4409-AB0B-75D40502F9B7@googlemail.com> <19ea21ff-3164-443c-86a2-c9ae8c11c494@a19g2000pra.googlegroups.com> Message-ID: On 12 Oct 2008, at 19:38, dbpokorny at gmail.com wrote: > > On Oct 10, 12:51 pm, Arnaud Delobelle wrote: >> There is also a problem with parsing as: >> >> * the following is correct: >> >> for i in L1 if cond else L2: >> do_something() >> >> * Python's grammar is LL(1) > > Not really. > What do you mean? Python's grammar is not LL(1)? Or Python + for-in-if statements is still LL(1)? > for_stmt: 'for' exprlist 'in' testlist ['if' or_test] ':' suite > ['else' ':' suite] What does that prove? If I show you: for i in L if How do you know, without looking at further tokens, whether the 'if' is part of an if-expression or part of a for-in-if statement?
-- Arnaud From dbpokorny at gmail.com Mon Oct 13 00:32:03 2008 From: dbpokorny at gmail.com (dbpokorny at gmail.com) Date: Sun, 12 Oct 2008 15:32:03 -0700 (PDT) Subject: [Python-ideas] if-syntax for regular for-loops In-Reply-To: References: <887847.72582.qm@web54409.mail.yahoo.com> <16349E71-021E-4409-AB0B-75D40502F9B7@googlemail.com> <19ea21ff-3164-443c-86a2-c9ae8c11c494@a19g2000pra.googlegroups.com> Message-ID: <791ac5df-ddc4-40bd-957e-3a486e6e40e6@1g2000prd.googlegroups.com> On Oct 12, 12:21 pm, Arnaud Delobelle wrote: > On 12 Oct 2008, at 19:38, dbpoko... at gmail.com wrote: > > > > > On Oct 10, 12:51 pm, Arnaud Delobelle wrote: > >> There is also a problem with parsing as: > > >> * the following is correct: > > >> for i in L1 if cond else L2: > >> do_something() > > >> * Python's grammar is LL(1) > > > Not really. > > What do you mean? Python's grammar is not LL(1)? Or Python + for-in-if > statements is still LL(1)? Oops! Here is what I should have said: if you replace for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite] in Grammar/Grammar with the following line for_stmt: 'for' exprlist 'in' testlist_safe ['if' old_test] ':' suite ['else' ':' suite] Then the grammar is still LL(1) since it resembles the LC syntax. I neglected to turn the testlist into a testlist_safe. Now in theory this could break code...it would break anything like for x in my_list if some_condition else []: ... Now this wouldn't even be a potential problem if Python converted to a GLR parser, but that's another discussion. > > > for_stmt: 'for' exprlist 'in' testlist ['if' or_test] ':' suite > > ['else' ':' suite] > > What does that prove? > > If I show you: > > for i in L if > > How do you know, without looking at further tokens, whether the 'if' > is part of an if-expression or part of a for-in-if statement? You can say the same thing about an LC.
The answer in that situation is the last token ('if') maps to the first element of the right-hand side of the list_if production. David From jeetsukumaran at gmail.com Mon Oct 13 00:45:26 2008 From: jeetsukumaran at gmail.com (Jeet Sukumaran) Date: Sun, 12 Oct 2008 17:45:26 -0500 Subject: [Python-ideas] Using Git to Manage Python Installation(s) Message-ID: <6d7436060810121545k213172afv37a22eb319e2d414@mail.gmail.com> I've recently started using Git to manage the Python installations on my system (2.4, 2.5, 2.6, etc.), and I have found that it has revolutionized my development ecosystem. Full versioning + documentation of all changes to each of the frameworks (including reversion in case something breaks), multiple parallel variants/package combinations of each installation (including a clean vanilla build) with seamless in-situ switching among them, easy instantiation and disposal of sandbox/experimental branches, etc. etc. I've written more about it here: http://jeetworks.org/node/24 It probably lies outside the purview of the development of Python itself (as opposed to Python development), but I thought I'd share this with you because I find it so useful! Furthermore, as I mention at the end of the above-referenced post, at the moment the big downside to this approach is the extra book-keeping burden that falls on the user. So much of this can be automated, though, with a distutils or setuptools post-install hook ... so maybe that might be something to consider? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From george.sakkis at gmail.com Mon Oct 13 02:24:19 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Sun, 12 Oct 2008 20:24:19 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <48E6DB55.9010902@canterbury.ac.nz> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> Message-ID: <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> On Fri, Oct 3, 2008 at 10:56 PM, Greg Ewing wrote: > Guido van Rossum wrote: > > This leads me to reject claims that "the for-loop is broken" and in >> particular clamoring for fixing the for-loop without allowing us to >> fix this example. >> > > Yeah, I never said the for-loop was the *only* > thing that's broken. :-) > > Perhaps "broken" is too strong a word. What I really > mean is that it's designed in a way that interacts > badly with nested functions. > > More generally, Python's inability to distinguish > clearly between creating new bindings and changing > existing bindings interacts badly with nested > functions. > > I agree that the wider problem needs to be addressed > somehow, and perhaps that should be the starting > point. Depending on the solution adopted, we can > then look at whether a change to the for-loop is > still needed. Since this idea didn't get much steam, a more modest proposal would be to relax the restriction on cells: allow the creation of new cells and the rebinding of func_closure in pure Python. Then one could explicitly create a new scope without any other change in the language through a 'localize' decorator that would create a new cell for every free variable (i.e. global or value from an enclosing scope) of the function: lst = [] for i in range(10): @localize def f(): print i lst.append(f) lst.append(localize(lambda: i**2)) I'd love to be proven wrong but I don't think localize() can be implemented in current Python. 
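As it turns out, much later versions of CPython do make this expressible from pure Python: types.CellType (added in Python 3.8, long after this thread) lets you build cells directly. The sketch below is one hedged way to write such a localize under that assumption; it builds a new function object around fresh cells rather than rebinding func_closure (__closure__), which stays read-only.

```python
import types

def localize(f):
    # Snapshot the current values of f's closure variables into fresh
    # cells and wrap a new function object around them.  Requires
    # types.CellType (Python 3.8+); not possible from pure Python in 2008.
    closure = f.__closure__
    if closure is not None:
        closure = tuple(types.CellType(c.cell_contents) for c in closure)
    g = types.FunctionType(f.__code__, f.__globals__, f.__name__,
                           f.__defaults__, closure)
    g.__kwdefaults__ = f.__kwdefaults__
    return g

def make_callbacks():
    # The thread's motivating example: without @localize every callback
    # would see the final value of i (9); with it, each keeps its own.
    fs = []
    for i in range(10):
        @localize
        def f():
            return i
        fs.append(f)
    return fs
```

Calling the functions returned by make_callbacks() yields 0 through 9 rather than ten 9s.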
George -------------- next part -------------- An HTML attachment was scrubbed... URL: From prouleau001 at gmail.com Mon Oct 13 04:41:09 2008 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Sun, 12 Oct 2008 22:41:09 -0400 Subject: [Python-ideas] Using Git to Manage Python Installation(s) In-Reply-To: <6d7436060810121545k213172afv37a22eb319e2d414@mail.gmail.com> References: <6d7436060810121545k213172afv37a22eb319e2d414@mail.gmail.com> Message-ID: <5c1d522d0810121941o5c1e95e0ld83288004bbccf13@mail.gmail.com> On Sun, Oct 12, 2008 at 6:45 PM, Jeet Sukumaran wrote: > I've recently started using Git to manage the Python installations on my > system (2.4, 2.5, 2.6 etc.),, and I have found that it has revolutionized my > development ecosystem. Did you try Mercurial as well? -- Pierre Rouleau From greg.ewing at canterbury.ac.nz Mon Oct 13 10:12:58 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 13 Oct 2008 21:12:58 +1300 Subject: [Python-ideas] Idea: Lazy import statement Message-ID: <48F3030A.6050107@canterbury.ac.nz> Problem: You have a package containing a large number of classes, of which only a few are typically used by any given application. If you put each class into its own submodule, then client code is required to use a lot of tedious 'from foo.thingy import Thingy' statements to import the classes it wants to use. This also makes all the submodule names part of the API and makes it hard to rearrange the packaging without breaking code. If you try to flatten the namespace by importing all the classes into the top level module, you end up importing everything even if it won't be used. What's needed is a way of lazily importing them, so that the import won't actually happen unless the imported names are referenced. 
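One shape such a lazy-import hack can take today is the module-level __getattr__ hook (PEP 562, Python 3.7+, not available when this was written); the foo package layout and the _lazy table below are made up for illustration.

```python
# foo/__init__.py -- hypothetical package layout.  Submodules are only
# imported when one of the listed names is first accessed.
import importlib

_lazy = {'Thing': '.thing', 'Stuff': '.stuff'}   # name -> submodule

def __getattr__(name):
    # Called only when 'name' is not already a real module attribute.
    try:
        submodule = _lazy[name]
    except KeyError:
        raise AttributeError(name) from None
    module = importlib.import_module(submodule, __name__)
    obj = getattr(module, name)
    globals()[name] = obj   # cache: an ordinary attribute thereafter
    return obj
```

The first attribute access triggers the real import and caches the result, so afterwards the name behaves like an ordinary attribute, which is the semantics the lazy-import idea asks for.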
It's possible to hack something like that up now, but then tools such as py2app and py2exe, that try to find modules by statically examining the source looking for import statements, won't be able to accurately determine which modules are used. At best they'll think the whole package is used and incorporate all of it; at worst they'll miss it altogether. So I think it would be good to have a dedicated syntax for lazy imports, so the top-level foo package can say something like from foo.thing lazily import Thing from foo.stuff lazily import Stuff ... Executing a lazy import statement adds an entry to a list of deferred imports attached to the module. Then, the first time the imported name is referenced, the import is performed and the name becomes an ordinary attribute thereafter. If py2exe et al are taught about lazy imports, they will then be able to determine exactly which submodules are used by an application and exclude the rest. -- Greg From arnodel at googlemail.com Mon Oct 13 14:09:29 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 13 Oct 2008 13:09:29 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> Message-ID: <9bfc700a0810130509t4657fdc0rcab12153416b7534@mail.gmail.com> > Since this idea didn't get much steam, a more modest proposal would be to > relax the restriction on cells: allow the creation of new cells and the > rebinding of func_closure in pure Python. Then one could explicitly create a > new scope without any other change in the language through a 'localize' > decorator that would create a new cell for every free variable (i.e. 
global > or value from an enclosing scope) of the function: > > lst = [] > for i in range(10): > @localize > def f(): print i > lst.append(f) > lst.append(localize(lambda: i**2)) > > I'd love to be proven wrong but I don't think localize() can be implemented > in current Python. I think you probably can in CPython, but that would involve bytecode introspection and using ctypes.pythonapi.PyCell_New, and it would be terribly inefficient. I wrote a similar decorator that takes a function and bind some of its variables to some values. e.g @bind(x=40, y=2) def foo(): return x+y >>> foo() 42 It's useless of course. -- Arnaud From george.sakkis at gmail.com Mon Oct 13 16:22:21 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Mon, 13 Oct 2008 10:22:21 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <9bfc700a0810130509t4657fdc0rcab12153416b7534@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> <9bfc700a0810130509t4657fdc0rcab12153416b7534@mail.gmail.com> Message-ID: <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> On Mon, Oct 13, 2008 at 8:09 AM, Arnaud Delobelle wrote: > Since this idea didn't get much steam, a more modest proposal would be to > > relax the restriction on cells: allow the creation of new cells and the > > rebinding of func_closure in pure Python. Then one could explicitly > create a > > new scope without any other change in the language through a 'localize' > > decorator that would create a new cell for every free variable (i.e. > global > > or value from an enclosing scope) of the function: > > > > lst = [] > > for i in range(10): > > @localize > > def f(): print i > > lst.append(f) > > lst.append(localize(lambda: i**2)) > > > > I'd love to be proven wrong but I don't think localize() can be > implemented > > in current Python. 
> > I think you probably can in CPython, but that would involve bytecode > introspection and using ctypes.pythonapi.PyCell_New, and it would be > terribly inefficient. I wrote a similar decorator that takes a > function and bind some of its variables to some values. e.g > > @bind(x=40, y=2) > def foo(): return x+y > > >>> foo() > 42 > > It's useless of course. Can you expand a bit on that ? Why would it be terribly inefficient and why is it useless ? George -------------- next part -------------- An HTML attachment was scrubbed... URL: From joe at strout.net Mon Oct 13 18:16:07 2008 From: joe at strout.net (Joe Strout) Date: Mon, 13 Oct 2008 10:16:07 -0600 Subject: [Python-ideas] Where/how to propose an addition to a standard module? In-Reply-To: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> Message-ID: <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> On Oct 13, 2008, at 8:46 AM, pruebauno at latinmail.com wrote: > Whenever I needed such functionality I used the re module. The benefit > is that it uses unix style regular expression syntax and an egrep/awk/ > perl/ruby user can understand it. You should show a few examples where > your proposal looks better than just using RE. Well, I suppose if you're already used to RE, then maybe it's not obvious that to an RE newbie, this: regex = re.compile("The (?P<object>.*?) in (?P<location>.*?) falls mainly in the (?P<subloc>.*?).") d = regex.match(text).groupdict() is far harder to read and type correctly than this: templ = Template("This $object in $location falls mainly in the $subloc") d = templ.match(text) Any other example would show the same simplification. Of course, if you're the sort of person who uses RE, you probably don't use Template.substitute either, since you probably like and are comfortable with the string % operator.
But Template.substitute was introduced to make it easier to handle the common, simple substitution operations, and I believe adding a Template.match method would do the same thing for common, simple matching operations. Here's a more fleshed-out proposal, with rationale and references -- see if this makes it any clearer why I think this would be a fine addition to the Template class. Abstract Introduces a new function on the string.Template [1] class, match(), to perform the approximate inverse of the existing substitute() function. That is, it attempts to match an input string against a template, and if successful, returns a dictionary providing the matched text for each template field. Rationale PEP 292 [2] added a simplified string substitution feature, allowing users to easily substitute text for named fields in a template string. The inverse operation is also useful: given a template and an input string, one wishes to find the text in the input string matching the fields in the template. However, Python currently has no easy way to do it. While this named matching operation can be accomplished using RegEx, the constructions required are somewhat complex and error prone. It can also be done using third-party modules such as pyparse, but again the setup requires more code and is not obvious to programmers inexperienced with that module. In addition, the Template class already has all the data needed to perform this operation, so it is a natural fit to simply add a new method on this class to perform a match, in addition to the existing method to perform a substitution. 
Proposal Proposed is the addition of one new function, on the existing Template class, as follows:

def match(self, text, greedy=False)

'match' is a new function which accepts one required parameter, an input string; and one optional parameter, 'greedy', which determines whether matches should be done in a greedy manner, equivalent to regex pattern '(.*)'; or in a non-greedy manner, equivalent to '(.*?)'. If the input string can be matched to the template pattern (respecting the 'greedy' flag), then match returns a dictionary, where each field in the pattern maps to the corresponding part of the input string. If the input string cannot be matched to the template pattern, then match returns None. Examples:

>>> from string import Template
>>> s = Template('$name was born in ${country}')
>>> print s.match('Guido was born in the Netherlands')
{'name':'Guido', 'country':'the Netherlands'}
>>> print s.match('Spam was born as a canned ham')
None

Note that when the match is successful, the resulting dictionary could be passed through Template.substitute to reconstitute the original input string. Conversely, any string created by Template.substitute could be matched by Template.match (though in unusual cases, the resulting dictionary might not exactly match the original, e.g. if the string could be matched in multiple ways). Thus, .match and .substitute are inverse operations. References [1] Template Strings http://www.python.org/doc/2.5.2/lib/node40.html [2] PEP 292: Simpler String Substitutions http://www.python.org/dev/peps/pep-0292/

From steven.bethard at gmail.com Mon Oct 13 18:51:27 2008 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 13 Oct 2008 10:51:27 -0600 Subject: [Python-ideas] Where/how to propose an addition to a standard module?
In-Reply-To: <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> Message-ID: On Mon, Oct 13, 2008 at 10:16 AM, Joe Strout wrote: > >>> from string import Template > >>> s = Template('$name was born in ${country}') > >>> print s.match('Guido was born in the Netherlands') > {'name':'Guido', 'country':'the Netherlands'} > >>> print s.match('Spam was born as a canned ham') > None If I were proposing something like this, I'd be using the new formatting syntax that's supposed to become the standard in Python 3.0: http://docs.python.org/dev/3.0/library/string.html#format-string-syntax That would mean something like:: '{name} was born in {country}' Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From arnodel at googlemail.com Mon Oct 13 20:05:44 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 13 Oct 2008 19:05:44 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> <9bfc700a0810130509t4657fdc0rcab12153416b7534@mail.gmail.com> <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> Message-ID: <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> On 13 Oct 2008, at 15:22, George Sakkis wrote: > On Mon, Oct 13, 2008 at 8:09 AM, Arnaud Delobelle > wrote: > > > Since this idea didn't get much steam, a more modest proposal > would be to > > relax the restriction on cells: allow the creation of new cells > and the > > rebinding of func_closure in pure Python. 
Then one could > explicitly create a > > new scope without any other change in the language through a > 'localize' > > decorator that would create a new cell for every free variable > (i.e. global > > or value from an enclosing scope) of the function: > > > > lst = [] > > for i in range(10): > > @localize > > def f(): print i > > lst.append(f) > > lst.append(localize(lambda: i**2)) > > > > I'd love to be proven wrong but I don't think localize() can be > implemented > > in current Python. > > I think you probably can in CPython, but that would involve bytecode > introspection and using ctypes.pythonapi.PyCell_New, and it would be > terribly inefficient. I wrote a similar decorator that takes a > function and bind some of its variables to some values. e.g > > @bind(x=40, y=2) > def foo(): return x+y > > >>> foo() > 42 > > It's useless of course. > > Can you expand a bit on that ? Why would it be terribly inefficient > and why is it useless ? When I was saying it was useless, I was talking about my bind decorator of course! It's useless because the above can be written def foo(x=40, y=2): return x+y It's inefficient because it works by deconstructing and reconstructing the function bytecode. If I have the time I will post an implementation of your localize decorator in CPython later (I think it would be easy, if one ignores nonlocal variables in nested functions). 
-- Arnaud From george.sakkis at gmail.com Mon Oct 13 20:24:27 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Mon, 13 Oct 2008 14:24:27 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> <9bfc700a0810130509t4657fdc0rcab12153416b7534@mail.gmail.com> <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> Message-ID: <91ad5bf80810131124k687c8ecflfc656c3d1933d246@mail.gmail.com> On Mon, Oct 13, 2008 at 2:05 PM, Arnaud Delobelle wrote: > > On 13 Oct 2008, at 15:22, George Sakkis wrote: > > On Mon, Oct 13, 2008 at 8:09 AM, Arnaud Delobelle >> wrote: >> >> > Since this idea didn't get much steam, a more modest proposal would be >> to >> > relax the restriction on cells: allow the creation of new cells and the >> > rebinding of func_closure in pure Python. Then one could explicitly >> create a >> > new scope without any other change in the language through a 'localize' >> > decorator that would create a new cell for every free variable (i.e. >> global >> > or value from an enclosing scope) of the function: >> > >> > lst = [] >> > for i in range(10): >> > @localize >> > def f(): print i >> > lst.append(f) >> > lst.append(localize(lambda: i**2)) >> > >> > I'd love to be proven wrong but I don't think localize() can be >> implemented >> > in current Python. >> >> I think you probably can in CPython, but that would involve bytecode >> introspection and using ctypes.pythonapi.PyCell_New, and it would be >> terribly inefficient. I wrote a similar decorator that takes a >> function and bind some of its variables to some values. 
e.g >> >> @bind(x=40, y=2) >> def foo(): return x+y >> >> >>> foo() >> 42 >> >> It's useless of course. >> >> Can you expand a bit on that ? Why would it be terribly inefficient and >> why is it useless ? >> > > When I was saying it was useless, I was talking about my bind decorator of > course! > > It's useless because the above can be written > > def foo(x=40, y=2): return x+y But the whole point of this thread is that this is an abuse of default arguments since it changes f's signature; f(5,6), f(1), f(y=10) should all raise TypeError. It's inefficient because it works by deconstructing and reconstructing the > function bytecode. But this happens only once at decoration time, not every time f is called, right ? George -------------- next part -------------- An HTML attachment was scrubbed... URL: From leif.walsh at gmail.com Mon Oct 13 21:19:59 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Mon, 13 Oct 2008 15:19:59 -0400 Subject: [Python-ideas] Where/how to propose an addition to a standard module? In-Reply-To: References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> Message-ID: On Mon, Oct 13, 2008 at 12:51 PM, Steven Bethard wrote: > If I were proposing something like this, I'd be using the new > formatting syntax that's supposed to become the standard in Python > 3.0: > > http://docs.python.org/dev/3.0/library/string.html#format-string-syntax > > That would mean something like:: > > '{name} was born in {country}' I agree, but it seems something more is needed here. I would like to be able to parse something that isn't separated by whitespace, and I'd like to be able to tell Python to turn it into an int if I need to. 
We could keep a similar syntax to the 3.0 formatting syntax, and just change the semantics: "First, thou shalt count to {n!i}" (or {n!d}) could parse an integer out of the string, and "{!x:[0-9A-Fa-f]*}" + (":{!x:[0-9A-Fa-f]*}" * 7) could be used to parse MAC addresses into a list of 8 hex values. I could easily see this getting way too complex, so maybe you all should just ignore me. -- Cheers, Leif

From arnodel at googlemail.com Mon Oct 13 21:55:28 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Mon, 13 Oct 2008 20:55:28 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <48E6A1EF.8000809@cs.byu.edu> <48E6DB55.9010902@canterbury.ac.nz> <91ad5bf80810121724h9cb3365w787fbad8a98d459@mail.gmail.com> Message-ID: <35612761-585F-4DB4-94BC-8505FCAE6CC0@googlemail.com> On 13 Oct 2008, at 01:24, George Sakkis wrote: > Since this idea didn't get much steam, a more modest proposal would > be to relax the restriction on cells: allow the creation of new > cells and the rebinding of func_closure in pure Python. Then one > could explicitly create a new scope without any other change in the > language through a 'localize' decorator that would create a new cell > for every free variable (i.e. global or value from an enclosing > scope) of the function: > > lst = [] > for i in range(10): > @localize > def f(): print i > lst.append(f) > lst.append(localize(lambda: i**2)) > > I'd love to be proven wrong but I don't think localize() can be > implemented in current Python. Here is a (very) quick and dirty implementation in CPython (requires ctypes). I'm sure it breaks in all sorts of ways but I don't have more time to test it :) Tests follow the implementation.
------------------------- localize.py ---------------------
from itertools import *
import sys
import ctypes
from array import array
from opcode import opmap, HAVE_ARGUMENT

new_cell = ctypes.pythonapi.PyCell_New
new_cell.restype = ctypes.py_object
new_cell.argtypes = [ctypes.py_object]

from types import CodeType, FunctionType

LOAD_GLOBAL = opmap['LOAD_GLOBAL']
LOAD_DEREF = opmap['LOAD_DEREF']
LOAD_FAST = opmap['LOAD_FAST']
STORE_GLOBAL = opmap['STORE_GLOBAL']
STORE_DEREF = opmap['STORE_DEREF']
STORE_FAST = opmap['STORE_FAST']

code_args = (
    'argcount', 'nlocals', 'stacksize', 'flags', 'code', 'consts',
    'names', 'varnames', 'filename', 'name', 'firstlineno', 'lnotab',
    'freevars', 'cellvars'
)

def copy_code(code_obj, **kwargs):
    "Make a copy of a code object, maybe changing some attributes"
    for arg in code_args:
        if not kwargs.has_key(arg):
            kwargs[arg] = getattr(code_obj, 'co_%s' % arg)
    return CodeType(*map(kwargs.__getitem__, code_args))

def code_walker(code):
    l = len(code)
    code = array('B', code)
    i = 0
    while i < l:
        op = code[i]
        if op >= HAVE_ARGUMENT:
            yield op, code[i+1] + (code[i+2] << 8)
            i += 3
        else:
            yield op, None
            i += 1

class CodeMaker(object):
    def __init__(self):
        self.code = array('B')
    def append(self, opcode, arg=None):
        app = self.code.append
        app(opcode)
        if arg is not None:
            app(arg & 0xFF)
            app(arg >> 8)
    def getcode(self):
        return self.code.tostring()

def localize(f):
    if not isinstance(f, FunctionType):
        return f
    nonlocal_vars = []
    new_cells = []
    frame = sys._getframe(1)
    values = dict(frame.f_globals)
    values.update(frame.f_locals)
    co = f.func_code
    deref = co.co_cellvars + co.co_freevars
    names = co.co_names
    varnames = co.co_varnames
    offset = len(deref)
    varindex = {}
    new_code = CodeMaker()
    # Disable CO_NOFREE in the code object's flags
    flags = co.co_flags & (0xFFFF - 0x40)
    # Count the number of arguments of f, including *args & **kwargs
    argcount = co.co_argcount
    if flags & 0x04: argcount += 1
    if flags & 0x08: argcount += 1
    # Change the code object so that the non local variables are
    # bound to new cells which are initialised to the current value
    # of the variable with that name in the surrounding frame.
    for opcode, arg in code_walker(co.co_code):
        vname = None
        if opcode in (LOAD_GLOBAL, STORE_GLOBAL):
            vname = names[arg]
        elif opcode in (LOAD_DEREF, STORE_DEREF):
            vname = deref[arg]
        else:
            new_code.append(opcode, arg)
            continue
        try:
            vi = varindex[vname]
        except KeyError:
            nonlocal_vars.append(vname)
            new_cells.append(new_cell(values[vname]))
            vi = varindex[vname] = offset
            offset += 1
        if opcode in (LOAD_GLOBAL, LOAD_DEREF):
            new_code.append(LOAD_DEREF, vi)
        else:
            new_code.append(STORE_DEREF, vi)
    co = copy_code(co, code=new_code.getcode(),
                   freevars=co.co_freevars + tuple(nonlocal_vars),
                   flags=flags)
    return FunctionType(co, f.func_globals, f.func_name, f.func_defaults,
                        (f.func_closure or ()) + tuple(new_cells))
------------------------ /localize.py ---------------------

Some examples:

>>> y = 3
>>> @localize
... def f(x):
...     return x, y
...
>>> f(5)
(5, 3)
>>> y = 1000
>>> f(2)
(2, 3)
>>> def test():
...     acc = []
...     for i in range(10):
...         @localize
...         def pr(): print i
...         acc.append(pr)
...     return acc
...
>>> for f in test(): f()
...
0
1
2
3
4
5
6
7
8
9
>>> lambdas = [localize(lambda: i) for i in range(10)]
>>> for f in lambdas: print f()
...
0
1
2
3
4
5
6
7
8
9
>>> # Lastly, your example
>>> lst = []
>>> for i in range(10):
...     @localize
...     def f(): print i
...     lst.append(f)
...     lst.append(localize(lambda: i**2))
...
>>> [f() for f in lst]
0
1
2
3
4
5
6
7
8
9
[None, 0, None, 1, None, 4, None, 9, None, 16, None, 25, None, 36, None, 49, None, 64, None, 81]
>>>

-- Arnaud

From josiah.carlson at gmail.com Mon Oct 13 21:55:38 2008 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Mon, 13 Oct 2008 12:55:38 -0700 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: <48F3030A.6050107@canterbury.ac.nz> References: <48F3030A.6050107@canterbury.ac.nz> Message-ID: On Mon, Oct 13, 2008 at 1:12 AM, Greg Ewing wrote: > Problem: You have a package containing a large number of > classes, of which only a few are typically used by any > given application. > > If you put each class into its own submodule, then > client code is required to use a lot of tedious > 'from foo.thingy import Thingy' statements to import > the classes it wants to use. This also makes all the > submodule names part of the API and makes it hard to > rearrange the packaging without breaking code. > > If you try to flatten the namespace by importing all > the classes into the top level module, you end up > importing everything even if it won't be used. > > What's needed is a way of lazily importing them, so > that the import won't actually happen unless the > imported names are referenced. > > It's possible to hack something like that up now, but > then tools such as py2app and py2exe, that try to find > modules by statically examining the source looking for > import statements, won't be able to accurately determine > which modules are used. At best they'll think the > whole package is used and incorporate all of it; at > worst they'll miss it altogether. > > So I think it would be good to have a dedicated syntax > for lazy imports, so the top-level foo package can say > something like > > from foo.thing lazily import Thing > from foo.stuff lazily import Stuff > ... > > Executing a lazy import statement adds an entry to a > list of deferred imports attached to the module.
Then, > the first time the imported name is referenced, the > import is performed and the name becomes an ordinary > attribute thereafter. > > If py2exe et al are taught about lazy imports, they > will then be able to determine exactly which submodules > are used by an application and exclude the rest. How is this mechanism supposed to behave in the presence of things like... # baz.py from foo lazy import bar def fcn(): return bar.obj(...) It would seem that py2exe, etc., would need to include the bar module regardless of whether baz.fcn() was called in any particular invocation of a program, and even if baz.fcn() was never called in any invocation. - Josiah From tjreedy at udel.edu Mon Oct 13 22:43:17 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 13 Oct 2008 16:43:17 -0400 Subject: [Python-ideas] Where/how to propose an addition to a standard module? In-Reply-To: References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> Message-ID: Leif Walsh wrote: > On Mon, Oct 13, 2008 at 12:51 PM, Steven Bethard > wrote: >> If I were proposing something like this, I'd be using the new >> formatting syntax that's supposed to become the standard in Python >> 3.0: >> >> http://docs.python.org/dev/3.0/library/string.html#format-string-syntax >> >> That would mean something like:: >> >> '{name} was born in {country}' > > I agree, but it seems something more is needed here. I would like to > be able to parse something that isn't separated by whitespace, and I'd > like to be able to tell Python to turn it into an int if I need to. > We could keep a similar syntax to the 3.0 formatting syntax, and just > change the semantics: Given that I have never used the Template class and am using 3.0 str.format, that I believe the former was the bridge that led to the latter, and am not enamored of writing REs, I think adding str.match would be a great idea. 
To the extent possible, s == form.format(form.match(s)) and args == form.match(form.format(args)). Note that args can either be a sequence or mapping. This might encourage more people to switch faster from % to .format. Given that Guido wants this to happen, he might look favorably > "First, thou shalt count to {n!i}" (or {n!d}) could parse an integer > out of the string, and "{!x:[0-9A-Fa-f]*}" + (":{!x:[0-9A-Fa-f]*}" * > 7) could be used to parse MAC addresses into a list of 8 hex values. > > I could easily see this getting way too complex, Let us not make the perfect be the enemy of the possible. > so maybe you all should just ignore me. Nope. Terry Jan Reedy >

From joe at strout.net Mon Oct 13 23:09:10 2008 From: joe at strout.net (Joe Strout) Date: Mon, 13 Oct 2008 15:09:10 -0600 Subject: [Python-ideas] Template.match or similar (was Re: Where/how to propose...) In-Reply-To: References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> Message-ID: <7C2FB131-4D9F-4B91-8F96-F2C1C14FC5D4@strout.net> On Oct 13, 2008, at 2:43 PM, Terry Reedy wrote: > Given that I have never used the Template class and am using 3.0 > str.format, that I believe the former was the bridge that led to the > latter, and am not enamored of writing REs, I think adding str.match > would be a great idea. To the extent possible, > s == form.format(form.match(s)) and > args == form.match(form.format(args)). > Note that args can either be a sequence or mapping. > > This might encourage more people to switch faster from % to .format. > Given that Guido wants this to happen, he might look favorably Well, I'm all for that, but I've never used the 3.0 str.format, and find the documentation on it somewhat unenlightening. Can you mock up an example of how this might work? In simple form it looks very similar to Template.substitute, but it leaves lots of room for getting not-so-simple.
Does that present a problem, in that a great number of format strings would not be useable in reverse? Or would we simply say: if you want to use the reverse (match) operation, then you have to restrict yourself to simple named fields? Or, would we define a slightly different syntax for use with match, that lets you specify numeric conversions or whatever, and give up the idea that these are truly inverse operations? Best, - Joe From tjreedy at udel.edu Tue Oct 14 05:50:09 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 13 Oct 2008 23:50:09 -0400 Subject: [Python-ideas] Template.match or similar (was Re: Where/how to propose...) In-Reply-To: <7C2FB131-4D9F-4B91-8F96-F2C1C14FC5D4@strout.net> References: <89e81dd6-a623-4d86-bf50-b0c3220de71c@y29g2000hsf.googlegroups.com> <6FCB9EB6-4E1F-44EE-B670-538634977BB1@strout.net> <7C2FB131-4D9F-4B91-8F96-F2C1C14FC5D4@strout.net> Message-ID: Joe Strout wrote: > On Oct 13, 2008, at 2:43 PM, Terry Reedy wrote: > >> Given that I have never used the Template class and am using 3.0 >> str.format, that I believe the former was the bridge that led to the >> latter, and am not enamored of writing REs, I think adding str.match >> would be a great idea. To the extent possible, >> s == form.format(form.match(s)) and >> args == form.match(form.format(args). >> Note that args can either be a sequence or mapping. >> >> This might encourage more people to switch faster from % to .format. >> Given that Guido wants this to happen, he might look favorably > > Well, I'm all for that, but I've never used the 3.0 str.format, and find > the documentation on it somewhat unenlightening. You probably read too much of it. I probably read it all but only learned the basics, intending to review more on a need-to-know basis. > Can you mock up an example of how this might work? My ideas and opinions. 1. Start with the simplest possible tasks. 
temp_p1 = 'Testing {0}'
temp_k1 = 'Testing {word}'
p1 = ('matches',)  # or perhaps use list instead of tuple
k1 = {'word':'matches'}
text1 = 'Testing matches'

form_p1 = temp_p1.format(*p1)
form_k1 = temp_k1.format(**k1)
print(form_p1 == form_k1 == text1)  # prints True today with 3.0c1

# tests
temp_p1.match(text1) == p1
temp_k1.match(text1) == k1

(Come to think of it, easiest would have no literal text, but I already wrote the above.) Now, write the simplest code that makes these tests pass. Easy. Use str.startswith() to match the literal, non-field text. Add text1e='Bad matches' and consider what exception to raise on non-literal match.

2. Now complicate the task. Add text after the substitution field. Write test and then code with str.endswith. Still do not need re, but might also want to do re version.

3. Add more fields: two positional, two keyword, one of each. Probably need to 'refactor' and use re.

4,5,6. Add field attributes, int formatting, and float formatting, depending on what translated to re's.

> > In simple form it looks very similar to Template.substitute, but it > leaves lots of room for getting not-so-simple. Does that present a > problem, in that a great number of format strings would not be useable > in reverse? Not to me. The most common substitute is straight string interpolation. I suspect that that and int and float field formatting will cover 80% of use cases. For others... 'use the re module'. > Or would we simply say: if you want to use the reverse > (match) operation, then you have to restrict yourself to simple named > fields? If necessary, but I suspect more is reasonably possible. But yes, 'you have to use a subset of possible formats'. Or, would we define a slightly different syntax for use with
Terry Jan Reedy

From dillonco at comcast.net Tue Oct 14 06:03:03 2008 From: dillonco at comcast.net (Dillon Collins) Date: Tue, 14 Oct 2008 00:03:03 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> Message-ID: <200810140003.03535.dillonco@comcast.net> On Monday 13 October 2008, Arnaud Delobelle wrote: > It's inefficient because it works by deconstructing and reconstructing > the function bytecode. That's not necessary. Just make a new closure for it. Here's some code (I was bored/curious). The function reclose_kwds is provided for fun. Enjoy:

from types import FunctionType as function  # alias used by reclose() below

def cell(v):
    """Create a cell containing the arg via a dummy function"""
    def noop(): return v
    return noop.func_closure[0]

def reclose(func, closure):
    """copy func, but use the given closure"""
    return function(func.func_code, func.func_globals, func.func_name,
                    func.func_defaults, closure)

def reclose_kwds(func, **kwds):
    """update func's closure using the names/values given as keywords"""
    cinfo = zip(func.func_code.co_freevars, func.func_closure)
    closure = tuple(cell(kwds[v]) if v in kwds else c for v,c in cinfo)
    return reclose(func, closure)

def close(*names):
    """lock the given (non-global) variable names to their current values for function"""
    def _close(func):
        cinfo = zip(func.func_code.co_freevars, func.func_closure)
        closure = tuple(cell(c.cell_contents) if v in names else c for v,c in cinfo)
        return reclose(func, closure)
    return _close

def close_all(func):
    """lock all non-global variables in function to their current values"""
    closure = tuple(cell(c.cell_contents) for c in func.func_closure)
    return reclose(func, closure)

def g():
    j=1
    def f():
        ret = []
        for i in range(3):
            #q=lambda x:x*i*j
            def q(x): return x*i*j
            ret.append(q)
        return ret
    return f()

def g2():
    j=1
    def f():
        ret = []
        for i in range(3):
            #q=close('i')(lambda x:x*i*j)
            @close('i')
            def q(x): return x*i*j
            ret.append(q)
        return ret
    return f()

q1, q2, q3 = g()
p1, p2, p3 = g2()

print q1, q1(2)
print q2, q2(2)
print q3, q3(2)
print

print p1, p1(2)
print p2, p2(2)
print p3, p3(2)
print

From greg.ewing at canterbury.ac.nz Tue Oct 14 06:56:53 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 14 Oct 2008 17:56:53 +1300 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: References: <48F3030A.6050107@canterbury.ac.nz> Message-ID: <48F42695.4020509@canterbury.ac.nz> Josiah Carlson wrote: > How is this mechanism supposed to behave in the presence of things like...
>
> # baz.py
> from foo lazy import bar
>
> def fcn():
>     return bar.obj(...)

I would say "don't do that". If bar is always used within the baz module, there's no reason to lazily import it -- just use an ordinary import. If you're trying to achieve a conditional import, use an ordinary import that's executed conditionally, e.g.

if moon_is_full:
    from foo import bar
frobnicate(bar)

Then py2exe will notice the potential for importing the foo module and include it. The only time you should use a lazy import is in the situation I described, i.e. you're importing things solely to re-export them for other modules that may want to use them. Any module that actually uses something itself, even if only potentially, should use a normal import.
-- Greg From arnodel at googlemail.com Tue Oct 14 08:02:40 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 14 Oct 2008 07:02:40 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <200810140003.03535.dillonco@comcast.net> References: <48E5F5F3.3040000@canterbury.ac.nz> <91ad5bf80810130722o58891e0al2bf4757984aa0c81@mail.gmail.com> <9BFBA567-9346-4376-94AB-EE38E817EBFD@googlemail.com> <200810140003.03535.dillonco@comcast.net> Message-ID: <4B2FC5B1-792A-45C0-9174-B84142C510A6@googlemail.com> On 14 Oct 2008, at 05:03, Dillon Collins wrote: > On Monday 13 October 2008, Arnaud Delobelle wrote: >> It's inefficient because it works by deconstructing and >> reconstructing >> the function bytecode. > > That's not necessary. Just make a new closure for it. Here's some > code (I > was bored/curious). The function reclose_kwds is provided for > fun. Enjoy: But that doesn't work with global variables, does it? Globals are the reason why I had to scan the bytecode -- Arnaud From andrew-pythonideas at puzzling.org Tue Oct 14 08:04:12 2008 From: andrew-pythonideas at puzzling.org (Andrew Bennetts) Date: Tue, 14 Oct 2008 17:04:12 +1100 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: <48F3030A.6050107@canterbury.ac.nz> References: <48F3030A.6050107@canterbury.ac.nz> Message-ID: <20081014060412.GB27079@steerpike.home.puzzling.org> Greg Ewing wrote: > Problem: You have a package containing a large number of > classes, of which only a few are typically used by any > given application. [...] > > So I think it would be good to have a dedicated syntax > for lazy imports, so the top-level foo package can say > something like Yes, this would be good to have. There's clearly a need; Bazaar and Mercurial and probably others have invented versions of this. It would be excellent to have a standard way to spell it. -Andrew. 
From theller at ctypes.org Tue Oct 14 09:27:48 2008 From: theller at ctypes.org (Thomas Heller) Date: Tue, 14 Oct 2008 09:27:48 +0200 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: <20081014060412.GB27079@steerpike.home.puzzling.org> References: <48F3030A.6050107@canterbury.ac.nz> <20081014060412.GB27079@steerpike.home.puzzling.org> Message-ID: Andrew Bennetts schrieb: > Greg Ewing wrote: >> Problem: You have a package containing a large number of >> classes, of which only a few are typically used by any >> given application. > [...] >> >> So I think it would be good to have a dedicated syntax >> for lazy imports, so the top-level foo package can say >> something like > > Yes, this would be good to have. There's clearly a need; Bazaar and > Mercurial and probably others have invented versions of this. It would > be excellent to have a standard way to spell it. How do these invented versions look like? -- Thanks, Thomas From dillonco at comcast.net Tue Oct 14 13:25:11 2008 From: dillonco at comcast.net (Dillon Collins) Date: Tue, 14 Oct 2008 07:25:11 -0400 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <4B2FC5B1-792A-45C0-9174-B84142C510A6@googlemail.com> References: <48E5F5F3.3040000@canterbury.ac.nz> <200810140003.03535.dillonco@comcast.net> <4B2FC5B1-792A-45C0-9174-B84142C510A6@googlemail.com> Message-ID: <200810140725.11383.dillonco@comcast.net> On Tuesday 14 October 2008, Arnaud Delobelle wrote: > On 14 Oct 2008, at 05:03, Dillon Collins wrote: > > On Monday 13 October 2008, Arnaud Delobelle wrote: > >> It's inefficient because it works by deconstructing and > >> reconstructing > >> the function bytecode. > > > > That's not necessary. Just make a new closure for it. Here's some > > code (I > > was bored/curious). The function reclose_kwds is provided for > > fun. Enjoy: > > But that doesn't work with global variables, does it? 
Globals are the > reason why I had to scan the bytecode Nope. However, I expect that globals would be even easier. If you want to freeze all the variables, you can just replace func_globals with func_globals.copy(). Otherwise, you can replace it with some proxy object that would read from your dict and fall back to the real globals if necessary. (Probably subclass dict with the __missing__ method.) Or at least I think that would work. I don't have time to try it now... From andrew-pythonideas at puzzling.org Tue Oct 14 15:40:36 2008 From: andrew-pythonideas at puzzling.org (Andrew Bennetts) Date: Wed, 15 Oct 2008 00:40:36 +1100 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: References: <48F3030A.6050107@canterbury.ac.nz> <20081014060412.GB27079@steerpike.home.puzzling.org> Message-ID: <20081014134036.GA23248@steerpike.home.puzzling.org> Thomas Heller wrote: > Andrew Bennetts schrieb: [...] > > Yes, this would be good to have. There's clearly a need; Bazaar and > > Mercurial and probably others have invented versions of this. It would > > be excellent to have a standard way to spell it. > > How do these invented versions look like? -Andrew. From phd at phd.pp.ru Tue Oct 14 16:08:56 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 14 Oct 2008 18:08:56 +0400 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: <20081014134036.GA23248@steerpike.home.puzzling.org> References: <48F3030A.6050107@canterbury.ac.nz> <20081014060412.GB27079@steerpike.home.puzzling.org> <20081014134036.GA23248@steerpike.home.puzzling.org> Message-ID: <20081014140856.GA26641@phd.pp.ru> On Wed, Oct 15, 2008 at 12:40:36AM +1100, Andrew Bennetts wrote: > Thomas Heller wrote: > > Andrew Bennetts schrieb: > [...] > > > Yes, this would be good to have. There's clearly a need; Bazaar and > > > Mercurial and probably others have invented versions of this. It would > > > be excellent to have a standard way to spell it. 
> > > > How do these invented versions look like? > > > > See also mx.Misc.LazyModule: http://www.koders.com/python/fid9565A91C21012C73AF249134CA058DEE0031AACB.aspx?s=cdef%3Aparser Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From skip at pobox.com Tue Oct 14 18:17:28 2008 From: skip at pobox.com (skip at pobox.com) Date: Tue, 14 Oct 2008 11:17:28 -0500 Subject: [Python-ideas] Optimistic Concurrency Message-ID: <18676.50712.900520.708841@montanaro-dyndns-org.local> Is optimistic concurrency http://en.wikipedia.org/wiki/Optimistic_concurrency_control a possible option for removing the GIL in Python? Skip From tjreedy at udel.edu Tue Oct 14 19:38:31 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 14 Oct 2008 13:38:31 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18676.50712.900520.708841@montanaro-dyndns-org.local> References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: skip at pobox.com wrote: > Is optimistic concurrency > > http://en.wikipedia.org/wiki/Optimistic concurrency_control This is designed for humans using client programs to edit records in shared databases with a low collision rate at the record level. > a possible option for removing the GIL in Python? It does not strike me as practical for Python. 1. It starts with copying data. Clients accessing a database server already do this. Threads accessing *shared* data normally do not. 2. Any edit (change) may be discarded in part or in whole. Human operators, informed of the rejection, must (and 'somehow' do) decide what to do. Wrapping every assignment to shared data with a pre-programmed RejectionException handler would usually be a huge burden on the programmer. 
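[To make the pattern under discussion concrete: optimistic concurrency is essentially a copy-modify-commit loop in which a commit succeeds only if nobody else committed in between. The sketch below is illustrative only — it is not from the thread, the names VersionedBox and optimistic_update are invented, and a plain lock stands in for the atomic compare-and-swap a real implementation would use.]

```python
import threading

class VersionedBox:
    """A shared value plus a version counter; a write commits only if the
    version is unchanged since the value was read (optimistic commit)."""
    def __init__(self, value):
        self._lock = threading.Lock()  # stands in for a hardware CAS
        self._value = value
        self._version = 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                return False  # conflict: someone committed in between
            self._value = new_value
            self._version += 1
            return True

def optimistic_update(box, fn, retries=100):
    """Copy, modify, commit; retry on conflict instead of holding a lock."""
    for _ in range(retries):
        value, version = box.read()
        if box.try_commit(fn(value), version):
            return True
    return False  # rejection surfaces to the caller, as Terry notes

box = VersionedBox(0)
threads = [threading.Thread(target=optimistic_update,
                            args=(box, lambda v: v + 1))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Terry's second objection shows up as the False return value: when retries are exhausted, it is the caller — not the runtime — that must decide what a rejected write means.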
From leif.walsh at gmail.com Tue Oct 14 21:14:57 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Tue, 14 Oct 2008 15:14:57 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: On Tue, Oct 14, 2008 at 1:38 PM, Terry Reedy wrote: > It does not strike me as practical for Python. Probably true, but let's see.... > 1. It starts with copying data. Clients accessing a database server already > do this. Threads accessing *shared* data normally do not. Agreed, but if I want them to, they should (and I should be able to tell that to python concisely, and have python understand what that means for concurrency). > 2. Any edit (change) may be discarded in part or in whole. Human operators, > informed of the rejection, must (and 'somehow' do) decide what to do. > Wrapping every assignment to shared data with a pre-programmed > RejectionException handler would usually be a huge burden on the programmer. It would be a huge burden, but perhaps it could be an option for the especially ambitious programmer. Might there be a way to tell the interpreter, "hey buddy, I've written all the RejectionException handlers, why don't you just let go of the GIL for now and use this other thing instead"? I could see this becoming very difficult if there were some kind of on/off switch you could trigger from inside python (what happens when _that_ has a race condition? O_o), but if you could have it on a per-program-run basis, it might be useful. Certainly, you could then go and exec() an program that does this, and break it that way, but hopefully, if you understand enough to use the switch in the first place, you'd understand how bad of an idea this would be. For a long time, I've wanted to see a strong guarantee of concurrency-invariance in python, especially when dealing with swigged modules, which have a tendency to blunder right on past the GIL. 
I think having a secondary mode that allows this might be the slow, gentle path we need. After all, I nearly dropped python altogether when I discovered Erlang (before I discovered its syntax, of course), for just this reason. -- Cheers, Leif From rhamph at gmail.com Tue Oct 14 21:32:36 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Oct 2008 13:32:36 -0600 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18676.50712.900520.708841@montanaro-dyndns-org.local> References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: On Tue, Oct 14, 2008 at 10:17 AM, wrote: > Is optimistic concurrency > > http://en.wikipedia.org/wiki/Optimistic_concurrency_control > > a possible option for removing the GIL in Python? Safethread's use of monitors eliminates data contention. The bigger problem is refcounting, which optimistic concurrency doesn't help. -- Adam Olsen, aka Rhamphoryncus From jimjjewett at gmail.com Tue Oct 14 22:10:31 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Oct 2008 16:10:31 -0400 Subject: [Python-ideas] Idea: Lazy ... statement Message-ID: On Mon, Oct 13, 2008 at 4:12 AM, Greg Ewing wrote: > So I think it would be good to have a dedicated syntax > for lazy imports, so the top-level foo package can say > something like > from foo.thing lazily import Thing > from foo.stuff lazily import Stuff Since this can be (awkwardly) done now, I don't think the problem is big enough for another keyword. On the other hand, a "lazy" keyword that worked in other contexts might have some merit. That would solve at least one major objection to setdefault-style functions. def setdefault(self, k, lazy d=None): ... or even lazy def yikes(msg): """Something went very wrong; report for debugging""" import logging ...logging config setup... ... Note that I'm not yet ready to volunteer on fleshing out the details, let alone how to implement it efficiently. (Would it require the moral equivalent of quasiquoting?) 
-jJ From skip at pobox.com Tue Oct 14 22:13:37 2008 From: skip at pobox.com (skip at pobox.com) Date: Tue, 14 Oct 2008 15:13:37 -0500 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: <18676.64881.527555.648490@montanaro-dyndns-org.local> >>>>> "Adam" == Adam Olsen writes: Adam> Safethread's use of monitors eliminates data contention. The Adam> bigger problem is refcounting, which optimistic concurrency Adam> doesn't help. Thanks. I see Safethread hasn't had a release in awhile. Is it still up-to-date w.r.t. Subversion trunk? Skip From pelotoescogorciao at yahoo.es Tue Oct 14 22:31:24 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Tue, 14 Oct 2008 20:31:24 +0000 (GMT) Subject: [Python-ideas] VS2005 project improvement Message-ID: <903070.98567.qm@web25802.mail.ukl.yahoo.com> I'm trying to compile python 2.5 as a static library using VS2005 SP1. I realized you only have Debug, PG and Release configurations. Of course, since you provide the sources, I could modify it manually... but I think it would be much better to provide Debug_Static and Release_Static configurations for a noob user like me :P Also, you are using this code in the pyconfig.h:

/* For an MSVC DLL, we can nominate the .lib files used by extensions */
#ifdef MS_COREDLL
#  ifndef Py_BUILD_CORE /* not building the core - must be an ext */
#    if defined(_MSC_VER)
       /* So MSVC users need not specify the .lib file in
          their Makefile (other compilers are generally
          taken care of by distutils.) */
#      ifdef _DEBUG
#        pragma comment(lib,"python25_d.lib")
#      else
#        pragma comment(lib,"python25.lib")
#      endif /* _DEBUG */
#    endif /* _MSC_VER */
#  endif /* Py_BUILD_CORE */
#endif /* MS_COREDLL */

This does not allow the user to rename the output library ( for example, to pythoncore_static_debug.lib ). It would be very desirable to allow the user to change the python library output name...
and use these names as defaults:

python25_md.lib -> python 2.5, multithread debug C CRT
python25_mdd.lib -> python 2.5, multithread debug DLL C CRT
python25_static_debug.lib -> python 2.5, multithread debug static library C CRT
python25_static.lib -> python 2.5, multithread static library C CRT

On the other hand, I see the python 3.0rc1 solution has been saved using VS2008. I think that's bad, because VS2005 users won't be able to open the solution. Ideally, you should provide a python solution for each Visual Studio:

PCBuild_VC6
PCBuild_VC2002
PCBuild_VC2003
PCBuild_VC2005
PCBuild_VC2008

or provide just the VC6 solution that can be easily converted by all the modern Visual Studio versions. thanks. From grosser.meister.morti at gmx.net Tue Oct 14 22:39:56 2008 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Tue, 14 Oct 2008 22:39:56 +0200 Subject: [Python-ideas] Idea: Lazy ... statement In-Reply-To: References: Message-ID: <48F5039C.4050908@gmx.net> Jim Jewett schrieb: > On Mon, Oct 13, 2008 at 4:12 AM, Greg Ewing wrote: > >> So I think it would be good to have a dedicated syntax >> for lazy imports, so the top-level foo package can say >> something like > >> from foo.thing lazily import Thing >> from foo.stuff lazily import Stuff > > Since this can be (awkwardly) done now, I don't think the problem is > big enough for another keyword. On the other hand, a "lazy" keyword > that worked in other contexts might have some merit. > > That would solve at least one major objection to setdefault-style functions. > > def setdefault(self, k, lazy d=None): ... > > or even > > lazy def yikes(msg): > """Something went very wrong; report for debugging""" > import logging > ...logging config setup... > ... > > Note that I'm not yet ready to volunteer on fleshing out the details, > let alone how to implement it efficiently. (Would it require the > moral equivalent of quasiquoting?) > > -jJ I don't understand what this lazy keyword should do?
-panzi From rhamph at gmail.com Tue Oct 14 23:12:29 2008 From: rhamph at gmail.com (Adam Olsen) Date: Tue, 14 Oct 2008 15:12:29 -0600 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18676.64881.527555.648490@montanaro-dyndns-org.local> References: <18676.50712.900520.708841@montanaro-dyndns-org.local> <18676.64881.527555.648490@montanaro-dyndns-org.local> Message-ID: On Tue, Oct 14, 2008 at 2:13 PM, wrote: >>>>>> "Adam" == Adam Olsen writes: > > Adam> Safethread's use of monitors eliminates data contention. The > Adam> bigger problem is refcounting, which optimistic concurrency > Adam> doesn't help. > > Thanks. I see Safethread hasn't had a release in awhile. Is it still > up-to-date w.r.t. Subversion trunk? Development has stalled due to lack of interest. It's not up-to-date, but the current plan is to strip the GIL removal changes out of it (leaving them in a separate branch), which should make it much easier to update. -- Adam Olsen, aka Rhamphoryncus From greg.ewing at canterbury.ac.nz Wed Oct 15 01:53:25 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 15 Oct 2008 12:53:25 +1300 Subject: [Python-ideas] Idea: Lazy ... statement In-Reply-To: References: Message-ID: <48F530F5.6090309@canterbury.ac.nz> Jim Jewett wrote: > Since this can be (awkwardly) done now, I don't think the problem is > big enough for another keyword. Yes, it can be done, but not without confounding static analysis tools. That's the reason I'm suggesting a dedicated syntax. An alternative would be to establish a convention for expressing it that static tools could recognize, although it's hard to think of something that wouldn't look ugly. > def setdefault(self, k, lazy d=None): ... > > or even > > lazy def yikes(msg): I'm calling hypergeneralization on these -- they would be doing very different things from what I'm suggesting. 
-- Greg From jimjjewett at gmail.com Wed Oct 15 02:25:20 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 14 Oct 2008 20:25:20 -0400 Subject: [Python-ideas] Idea: Lazy ... statement In-Reply-To: <48F5039C.4050908@gmx.net> References: <48F5039C.4050908@gmx.net> Message-ID: On Tue, Oct 14, 2008 at 4:39 PM, Mathias Panzenb?ck wrote: > Jim Jewett schrieb: >> On Mon, Oct 13, 2008 at 4:12 AM, Greg Ewing wrote: >> Since this can be (awkwardly) done now, I don't think the problem is >> big enough for another keyword. On the other hand, a "lazy" keyword >> that worked in other contexts might have some merit. >> That would solve at least one major objection to setdefault-style functions. >> def setdefault(self, k, lazy d=None): ... > I don't understand what this lazy keyword should do? "Hey, this next thing that you were about to execute? Don't do it yet." Part of the problem with setdefault is that calculating the default can be expensive. profile.setdefault(user, askuser()) This should get the user's stored profile; if there is no stored setting, it should ask the user and save the result for future reference. Unfortunately, askuser() gets evaluated before setdefault is called, so the user is *always* asked, "just in case". profile.setdefault(user, lazy askuser()) would say not to bother the user *unless* the default were actually needed. (Just as the lazy import wouldn't bother setting up the other module unless/until it were actually needed.) lazyness can seem pretty important in some programming styles, but ... those styles don't seem to be such a good fit for python anyhow. Whether there are enough use cases in typical python ... I'm not sure. (But I'm pretty sure that import alone doesn't make it.) 
-jJ From prouleau001 at gmail.com Wed Oct 15 03:43:15 2008 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Tue, 14 Oct 2008 21:43:15 -0400 Subject: [Python-ideas] VS2005 project improvement In-Reply-To: <903070.98567.qm@web25802.mail.ukl.yahoo.com> References: <903070.98567.qm@web25802.mail.ukl.yahoo.com> Message-ID: <5c1d522d0810141843r47420a47sba8bf6aee87c097e@mail.gmail.com> On Tue, Oct 14, 2008 at 4:31 PM, Victor Martin Ulloa wrote: > On the other hand, I see the python 3.0rc1 solution has been saved using VS2008. I think that's bad, because VS2005 users won't be able to open the solution. Ideally, you should provide a python solution for each Visual Studio: > > PCBuild_VC6 > PCBuild_VC2002 > PCBuild_VC2003 > PCBuild_VC2005 > PCBuild_VC2008 > > or provide just the VC6 solution that can be easily converted by all the modern Visual Studio versions. > FWIW, for multiple visual studio projects since VS7, I create independent solutions for each version of Visual Studio for my own projects and use a name that embeds the name of the compiler. I use .vs6, .vs7, .vs8 and .vs9 to identify the version of Visual Studio. For example, the project.vs6.dsp, project.vs9.vcproj, solution.vs6.dsw, solution.vs7.sln, solution.vs9.sln. This means duplication but also freedom between projects/solutions for different version of the tools. Just my 2cts. -- Pierre Rouleau From bruce at leapyear.org Wed Oct 15 09:29:10 2008 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 15 Oct 2008 00:29:10 -0700 Subject: [Python-ideas] Idea: Lazy ... statement In-Reply-To: References: <48F5039C.4050908@gmx.net> Message-ID: The title of this thread is lazy... statement but it seems to me that expressions are the natural unit. If I want to do lazy evaluation today, I would use something like f(a, b, lambda: g(a,b)) where of course the g(a,b) is only evaluated when f wants to evaluate it. Of course f is responsible for explicitly evaluating that lambda. 
I think a lambda-like syntax like this: f(a, b, lazy: g(a,b)) would be easy to understand. Note that this puts the responsibility on the caller's side to say that the expression is lazy. That is, we don't do def f(a, b, lazy: d): or lazy: def(a, b, d): although this is allowed: def f(a, b, d=lazy: g()) There are several reasons I put the responsibility on the caller's side: (1) The caller is in the best position to know if evaluating a lazy expression is expensive enough that it's worth making lazy, and if making an expression lazy changes the order of evaluation in a bad way. (2) When we see a function call, we don't know for sure which function will be called so we'd have to compile both the inline and the lazy evaluation for each parameter. I'm not sure what this makes for the import case. --- Bruce From arnodel at googlemail.com Wed Oct 15 10:46:28 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 15 Oct 2008 09:46:28 +0100 Subject: [Python-ideas] For-loop variable scope: simultaneous possession and ingestion of cake In-Reply-To: <200810140725.11383.dillonco@comcast.net> References: <48E5F5F3.3040000@canterbury.ac.nz> <200810140003.03535.dillonco@comcast.net> <4B2FC5B1-792A-45C0-9174-B84142C510A6@googlemail.com> <200810140725.11383.dillonco@comcast.net> Message-ID: <9bfc700a0810150146m1d47e426pff64fde5da24de41@mail.gmail.com> 2008/10/14 Dillon Collins : > On Tuesday 14 October 2008, Arnaud Delobelle wrote: >> On 14 Oct 2008, at 05:03, Dillon Collins wrote: >> > On Monday 13 October 2008, Arnaud Delobelle wrote: >> >> It's inefficient because it works by deconstructing and >> >> reconstructing >> >> the function bytecode. >> > >> > That's not necessary. Just make a new closure for it. Here's some >> > code (I >> > was bored/curious). The function reclose_kwds is provided for >> > fun. Enjoy: >> >> But that doesn't work with global variables, does it?
Globals are the >> reason why I had to scan the bytecode > > Nope. However, I expect that globals would be even easier. If you want to > freeze all the variables, you can just replace func_globals with > func_globals.copy(). Otherwise, you can replace it with some proxy object > that would read from your dict and fall back to the real globals if > necessary. (Probably subclass dict with the __missing__ method.) I guess you're right. My version was just an adaptation of the 'bind' decorator that I mentioned above, where only *some* non local variables were 'frozen', so I had to change the bytecode for that. I tried just to adapt it but it was the wrong approach! Anyway here is another version, without using bytecode introspection or ctypes:

def new_closure(vals):
    args = ','.join('x%i' % i for i in range(len(vals)))
    f = eval("lambda %s:lambda:(%s)" % (args, args))
    return f(*vals).func_closure

def localize(f):
    f_globals = dict((n, f.func_globals[n]) for n in f.func_code.co_names)
    f_closure = (
        f.func_closure and
        new_closure([c.cell_contents for c in f.func_closure])
    )
    return type(f)(f.func_code, f_globals, f.func_name,
                   f.func_defaults, f_closure)

-- Arnaud From lists at cheimes.de Wed Oct 15 12:49:09 2008 From: lists at cheimes.de (Christian Heimes) Date: Wed, 15 Oct 2008 12:49:09 +0200 Subject: [Python-ideas] VS2005 project improvement In-Reply-To: <903070.98567.qm@web25802.mail.ukl.yahoo.com> References: <903070.98567.qm@web25802.mail.ukl.yahoo.com> Message-ID: Victor Martin Ulloa wrote: > or provide just the VC6 solution that can be easily converted by all the modern Visual Studio versions. I see project files for VS6, VC7.1 and VS8 in the directory http://svn.python.org/projects/python/trunk/PC/ :) The files are all more or less maintained. The PCbuild\ directory contains a little script vs9to8.py that converts a set of VS9 project files to VS8. Have fun!
Christian From george.sakkis at gmail.com Wed Oct 15 16:57:14 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 15 Oct 2008 10:57:14 -0400 Subject: [Python-ideas] Automatic total ordering Message-ID: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Now that 3.x fixes the arbitrary object comparison wart and drops (?) __cmp__, it seems it's a good time to do something with the missing rich comparators gotcha, e.g. given a class that defines __eq__ and __lt__, automatically provide the rest of the missing comparisons. Yes, it can be done with a custom metaclass or (in 2.6+) with a class decorator [1] but (a) 99% of the time that's what one expects so "explicit is better than implicit" doesn't count and (b) a builtin implementation might well be more efficient. There might be arguments why this would be a bad idea but I really can't think of any. George [1] http://www.voidspace.org.uk/python/weblog/arch_d7_2008_10_04.shtml From andre.roberge at gmail.com Wed Oct 15 16:58:44 2008 From: andre.roberge at gmail.com (Andre Roberge) Date: Wed, 15 Oct 2008 11:58:44 -0300 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <7528bcdd0810150758y40c3f1e3q7d3d6f991231966a@mail.gmail.com> On Wed, Oct 15, 2008 at 11:57 AM, George Sakkis wrote: > Now that 3.x fixes the arbitrary object comparison wart and drops (?) > __cmp__, it seems it's a good time to do something with the missing rich > comparators gotcha, e.g. given a class that defines __eq__ and __lt__, > automatically provide the rest missing comparisons.
Yes, it can be done with > a custom metaclass or (in 2.6+) with a class decorator [1] but (a) 99% of > the time that's what one expects so "explicit is better than implicit" > doesn't count and (b) a bulitin implementation might well be more efficient. > There might be arguments why this would be a bad idea but I really can't > think of any. > +1 André > George > > [1] http://www.voidspace.org.uk/python/weblog/arch_d7_2008_10_04.shtml > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From terry at jon.es Wed Oct 15 17:11:46 2008 From: terry at jon.es (Terry Jones) Date: Wed, 15 Oct 2008 17:11:46 +0200 Subject: [Python-ideas] Automatic total ordering In-Reply-To: Your message at 10:57:14 on Wednesday, 15 October 2008 References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <18678.2098.394308.964574@jon.es> >>>>> "George" == George Sakkis writes: George> Now that 3.x fixes the arbitrary object comparison wart and drops George> (?) __cmp__, it seems it's a good time to do something with the George> missing rich comparators gotcha, e.g. given a class that defines George> __eq__ and __lt__, automatically provide the rest missing George> comparisons. Note http://www.voidspace.org.uk/python/weblog/arch_d7_2008_10_04.shtml#e1018 Regards, Terry From terry at jon.es Wed Oct 15 17:12:28 2008 From: terry at jon.es (Terry Jones) Date: Wed, 15 Oct 2008 17:12:28 +0200 Subject: [Python-ideas] Automatic total ordering In-Reply-To: Your message at 10:57:14 on Wednesday, 15 October 2008 References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <18678.2140.780623.439911@jon.es> Oops..... didn't see your footnote URL. Sorry!
Terry From guido at python.org Wed Oct 15 19:55:02 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Oct 2008 10:55:02 -0700 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: Sure, but let's aim for 3.1. The goal for 3.0 is stability and getting it released! On Wed, Oct 15, 2008 at 7:57 AM, George Sakkis wrote: > Now that 3.x fixes the arbitrary object comparison wart and drops (?) > __cmp__, it seems it's a good time to do something with the missing rich > comparators gotcha, e.g. given a class that defines __eq__ and __lt__, > automatically provide the rest missing comparisons. Yes, it can be done with > a custom metaclass or (in 2.6+) with a class decorator [1] but (a) 99% of > the time that's what one expects so "explicit is better than implicit" > doesn't count and (b) a bulitin implementation might well be more efficient. > There might be arguments why this would be a bad idea but I really can't > think of any. > > George > > [1] http://www.voidspace.org.uk/python/weblog/arch_d7_2008_10_04.shtml -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Oct 15 20:03:40 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 15 Oct 2008 14:03:40 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: George Sakkis wrote: > Now that 3.x fixes the arbitrary object comparison wart and drops (?) > __cmp__, An entry for __cmp__ was in the 3.0c1 doc, which confused me. It is now gone in http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names Since rich comparisons are defined on object and are inherited by all classes, it would be difficult to make them not defined. 
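[The gotcha George describes is easy to reproduce under the 3.x comparison rules: a class defining only __eq__ and __lt__ gets ==, != (derived from __eq__), and — via the reflected call — > for free, but <= and >= are not derived automatically. A minimal illustration, not from the thread; the class name is invented:]

```python
class Version:
    """Defines only __eq__ and __lt__ (hypothetical example class)."""
    def __init__(self, n):
        self.n = n
    def __eq__(self, other):
        return self.n == other.n
    def __lt__(self, other):
        return self.n < other.n

a, b = Version(1), Version(2)
print(a < b)        # True: __lt__ is defined
print(a != b)       # True: __ne__ comes for free from __eq__
print(b > a)        # True: but only via the reflected call a.__lt__(b)
try:
    a <= b          # no __le__, and no automatic (__lt__ or __eq__) fallback
except TypeError as exc:
    print("unsupported:", exc)
```

The asymmetry — > works through reflection while <= raises TypeError — is exactly the "missing rich comparators" the thread wants to smooth over.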
From guido at python.org Wed Oct 15 20:11:27 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Oct 2008 11:11:27 -0700 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy wrote: > George Sakkis wrote: >> >> Now that 3.x fixes the arbitrary object comparison wart and drops (?) >> __cmp__, > > An entry for __cmp__ was in the 3.0c1 doc, which confused me. > It is now gone in > http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names > > Since rich comparisons are defined on object and are inherited by all > classes, it would be difficult to make them not defined. I should also note that part of George's proposal has already been implemented: if you define __eq__, you get a complementary __ne__ for free. However it doesn't work the other way around (defining __ne__ doesn't give you __eq__ for free), and there is no similar relationship for the ordering operators. The reason for the freebie is that it's *extremely* unlikely to want to define != as something other than the complement of == (the only use case is IEEE compliant NaNs); however it's pretty common to define non-total orderings (e.g. set inclusion). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dillonco at comcast.net Wed Oct 15 20:17:31 2008 From: dillonco at comcast.net (Dillon Collins) Date: Wed, 15 Oct 2008 14:17:31 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <200810151417.31320.dillonco@comcast.net> On Wednesday 15 October 2008, Terry Reedy wrote: > Since rich comparisons are defined on object and are inherited by all > classes, it would be difficult to make them not defined. 
Well, I believe that the suggestion comes down to having object's rich comparison operators try to use those of its subclass rather than just throwing an error. Something like:

class object():
    def __lt__(self, o): raise Error
    def __eq__(self, o): raise Error
    def __ne__(self, o):
        return not self.__eq__(o)
    def __le__(self, o):
        return self.__lt__(o) or self.__eq__(o)
    def __gt__(self, o):
        return not (self.__lt__(o) or self.__eq__(o))
    def __ge__(self, o):
        return not self.__lt__(o) and self.__eq__(o)

Of course, if it were to actually be implemented, it would make more sense if it could use any two non-complement ops (rather than lt and eq), but that would also make some trouble, I think. BTW, rather than the class decorator, you could just inherit (multiply?) the above class for total ordering as well. From george.sakkis at gmail.com Wed Oct 15 20:21:42 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 15 Oct 2008 14:21:42 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <91ad5bf80810151121i75d281c7j518888fe1f6512fc@mail.gmail.com> On Wed, Oct 15, 2008 at 1:55 PM, Guido van Rossum wrote: Sure, but let's aim for 3.1. > > The goal for 3.0 is stability and getting it released! > Definitely, that's why it's posted on python-ideas and not on the python-3k list. > > On Wed, Oct 15, 2008 at 7:57 AM, George Sakkis > wrote: > > Now that 3.x fixes the arbitrary object comparison wart and drops (?) > > __cmp__, it seems it's a good time to do something with the missing rich > > comparators gotcha, e.g. given a class that defines __eq__ and __lt__, > > automatically provide the rest missing comparisons.
Yes, it can be done > with > a custom metaclass or (in 2.6+) with a class decorator [1] but (a) 99% of > the time that's what one expects so "explicit is better than implicit" > doesn't count and (b) a bulitin implementation might well be more > efficient. > There might be arguments why this would be a bad idea but I really can't > think of any. > > > > George > > > > [1] http://www.voidspace.org.uk/python/weblog/arch_d7_2008_10_04.shtml > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/ > ) From george.sakkis at gmail.com Wed Oct 15 20:35:32 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 15 Oct 2008 14:35:32 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> On Wed, Oct 15, 2008 at 2:11 PM, Guido van Rossum wrote: On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy wrote: > > George Sakkis wrote: > >> > >> Now that 3.x fixes the arbitrary object comparison wart and drops (?) > >> __cmp__, > > > > An entry for __cmp__ was in the 3.0c1 doc, which confused me. > > It is now gone in > > > http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names > > > > Since rich comparisons are defined on object and are inherited by all > > classes, it would be difficult to make them not defined. > > I should also note that part of George's proposal has already been > implemented: if you define __eq__, you get a complementary __ne__ for > free. However it doesn't work the other way around (defining __ne__ > doesn't give you __eq__ for free), and there is no similar > relationship for the ordering operators. The reason for the freebie is
The reason for the freebie is > that it's *extremely* unlikely to want to define != as something other > than the complement of == (the only use case is IEEE compliant NaNs); > however it's pretty common to define non-total orderings (e.g. set > inclusion). Partial orderings are certainly used, but I believe they are far less common than total ones. Regardless, a partially ordered class has to explicitly define the supported methods with the desired semantics anyway; the proposed change wouldn't make this any harder. George -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Oct 15 20:54:27 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 15 Oct 2008 14:54:27 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <200810151417.31320.dillonco@comcast.net> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <200810151417.31320.dillonco@comcast.net> Message-ID: Dillon Collins wrote: > On Wednesday 15 October 2008, Terry Reedy wrote: >> Since rich comparisons are defined on object and are inherited by all >> classes, it would be difficult to make them not defined. > > Well, I believe that the suggestion comes down to having object's rich > comparison operators try to use those of it's subclass rather than just > throwing an error. > > Something like: > > class object(): > def __lt__(self, o): raise Error > def __eq__(self, o): raise Error > def __ne__(self, o): > return not self.__eq__(self, o) > def __le__(self, o): > return self.__lt__(self, o) or self.__eq__(self, o) > def __gt__(self, o): > return not (self.__lt__(self, o) or self.__eq__(self, o)) > def __ge__(self, o): > return not self.__lt__(self, o) and self.__eq__(self, o) > > Of course, if it were to actually be implemented, it would make more sense if > it could use any two non complement ops (rather than lt and eq), but that > would also make some trouble, I think. 
> > BTW, rather than the class decorator, you could just inherit (multiply?) the > above class for total ordering as well. I do not understand this response. In 3.0, the actual definitions are the C equivalent of class object(): def __eq__(self,other): return id(self) == id(other) def __ne__(self,other): return id(self) != id(other) def __lt__(self,other): return NotImplemented tjr From tjreedy at udel.edu Wed Oct 15 21:12:46 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 15 Oct 2008 15:12:46 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: Guido van Rossum wrote: > On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy wrote: >> George Sakkis wrote: >>> Now that 3.x fixes the arbitrary object comparison wart and drops (?) >>> __cmp__, >> An entry for __cmp__ was in the 3.0c1 doc, which confused me. >> It is now gone in >> http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names >> >> Since rich comparisons are defined on object and are inherited by all >> classes, it would be difficult to make them not defined. > > I should also note that part of George's proposal has already been > implemented: if you define __eq__, you get a complementary __ne__ for > free. However it doesn't work the other way around (defining __ne__ > doesn't give you __eq__ for free), and there is no similar > relationship for the ordering operators. The reason for the freebie is > that it's *extremely* unlikely to want to define != as something other > than the complement of == (the only use case is IEEE compliant NaNs); > however it's pretty common to define non-total orderings (e.g. set > inclusion). You previously said "Sure, but let's aim for 3.1." However, I could interpret the above as saying that we have already done as much as is sensible (except for changing the docs). 
Or are you merely saying that any other freebies must make sure to respect the possibility of non-total orderings and not accidentally convert such into a (non-consistent) 'total' ordering? There are several possible basis pairs of defined operations. A specification must list which one(s) would work. Terry From guido at python.org Wed Oct 15 21:43:51 2008 From: guido at python.org (Guido van Rossum) Date: Wed, 15 Oct 2008 12:43:51 -0700 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: On Wed, Oct 15, 2008 at 12:12 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy wrote: >>> >>> George Sakkis wrote: >>>> >>>> Now that 3.x fixes the arbitrary object comparison wart and drops (?) >>>> __cmp__, >>> >>> An entry for __cmp__ was in the 3.0c1 doc, which confused me. >>> It is now gone in >>> >>> http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names >>> >>> Since rich comparisons are defined on object and are inherited by all >>> classes, it would be difficult to make them not defined. >> >> I should also note that part of George's proposal has already been >> implemented: if you define __eq__, you get a complementary __ne__ for >> free. However it doesn't work the other way around (defining __ne__ >> doesn't give you __eq__ for free), and there is no similar >> relationship for the ordering operators. The reason for the freebie is >> that it's *extremely* unlikely to want to define != as something other >> than the complement of == (the only use case is IEEE compliant NaNs); >> however it's pretty common to define non-total orderings (e.g. set >> inclusion). > > You previously said "Sure, but let's aim for 3.1." However, I could > interpret the above as saying that we have already done as much as is > sensible (except for changing the docs). 
I think we've done as much as I am comfortable with doing *by default* (i.e. when inheriting from object). The rest should be provided via mix-ins. But even those mix-ins should wait until 3.1. > Or are you merely saying that any other freebies must make sure to respect > the possibility of non-total orderings and not accidentally convert such > into a (non-consistent) 'total' ordering? > > There are several possible basis pairs of defined operations. A > specification must list which one(s) would work. There could be several different mix-ins that implement a total ordering based on different basis pairs. (Or even a single basis operation -- *in principle* all you need is a '<' operation.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Wed Oct 15 22:19:00 2008 From: python at rcn.com (Raymond Hettinger) Date: Wed, 15 Oct 2008 13:19:00 -0700 Subject: [Python-ideas] Automatic total ordering References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: From: "Guido van Rossum" > I think we've done as much as I am comfortable with doing *by default* > (i.e. when inheriting from object). The rest should be provided via > mix-ins. But even those mix-ins should wait until 3.1. Rather than a mix-in, my preference is for a class decorator that is smart enough to propagate whatever underlying comparison is provided. That way, you can choose to define any of __lt__, __le__, __gt__, or __ge__ to get all the rest. We did something like this as a class exercise at PyUK. 
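Raymond's propagating decorator can be sketched as follows. This is an editorial illustration, not code from the thread; the name `total_ordering` and the exact derivations are assumptions here, though a decorator along these lines did later ship as `functools.total_ordering` in Python 2.7/3.2:

```python
def total_ordering(cls):
    """Fill in missing rich comparisons from whichever of __lt__, __le__,
    __gt__ or __ge__ the class defines, together with __eq__.

    Editorial sketch of Raymond's idea: a decorator smart enough to
    propagate whatever underlying comparison is provided.
    """
    # For each possible "root" operation, how to derive the others from it.
    # Every derivation calls only self's own operators, never other's,
    # which sidesteps the mutual-recursion trap discussed in the thread.
    convert = {
        '__lt__': {'__gt__': lambda s, o: not (s < o or s == o),
                   '__le__': lambda s, o: s < o or s == o,
                   '__ge__': lambda s, o: not s < o},
        '__le__': {'__ge__': lambda s, o: not s <= o or s == o,
                   '__lt__': lambda s, o: s <= o and not s == o,
                   '__gt__': lambda s, o: not s <= o},
        '__gt__': {'__lt__': lambda s, o: not (s > o or s == o),
                   '__ge__': lambda s, o: s > o or s == o,
                   '__le__': lambda s, o: not s > o},
        '__ge__': {'__le__': lambda s, o: not s >= o or s == o,
                   '__gt__': lambda s, o: s >= o and not s == o,
                   '__lt__': lambda s, o: not s >= o},
    }
    # The root is whichever comparison the class itself overrides.
    roots = [op for op in convert
             if getattr(cls, op, None) is not getattr(object, op, None)]
    if not roots:
        raise ValueError('must define at least one of <, <=, >, >=')
    for name, func in convert[roots[0]].items():
        if getattr(cls, name, None) is getattr(object, name, None):
            setattr(cls, name, func)
    return cls
```

Note the derivations assume a total ordering, which is exactly the contract such a decorator would advertise.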
Raymond From arnodel at googlemail.com Wed Oct 15 22:50:24 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 15 Oct 2008 21:50:24 +0100 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> Message-ID: <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> On 15 Oct 2008, at 19:35, George Sakkis wrote: > On Wed, Oct 15, 2008 at 2:11 PM, Guido van Rossum > wrote: > > On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy > wrote: > > George Sakkis wrote: > >> > >> Now that 3.x fixes the arbitrary object comparison wart and drops > (?) > >> __cmp__, > > > > An entry for __cmp__ was in the 3.0c1 doc, which confused me. > > It is now gone in > > http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names > > > > Since rich comparisons are defined on object and are inherited by > all > > classes, it would be difficult to make them not defined. > > I should also note that part of George's proposal has already been > implemented: if you define __eq__, you get a complementary __ne__ for > free. However it doesn't work the other way around (defining __ne__ > doesn't give you __eq__ for free), and there is no similar > relationship for the ordering operators. The reason for the freebie is > that it's *extremely* unlikely to want to define != as something other > than the complement of == (the only use case is IEEE compliant NaNs); > however it's pretty common to define non-total orderings (e.g. set > inclusion). > > Partial orderings are certainly used, but I believe they are far > less common than total ones. Regardless, a partially ordered class > has to explicitly define the supported methods with the desired > semantics anyway; the proposed change wouldn't make this any harder. I don't understand. 
In a mathematical ordering, * x > y means the same as y < x * x <= y means the same as x < y or x = y * x >= y means the same as x > y or x = y and this is irrespective of whether the ordering is partial or total. So, given __eq__ and __lt__, all the rest follows. E.g. def __gt__(self, other): return other.__lt__(self) etc... Where am I going wrong? -- Arnaud From qrczak at knm.org.pl Wed Oct 15 23:07:27 2008 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Wed, 15 Oct 2008 23:07:27 +0200 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> Message-ID: <3f4107910810151407p29f35e6biec4f88cf054fc533@mail.gmail.com> 2008/10/15 Arnaud Delobelle : >> Partial orderings are certainly used, but I believe they are far less >> common than total ones. Regardless, a partially ordered class has to >> explicitly define the supported methods with the desired semantics anyway; >> the proposed change wouldn't make this any harder. > > I don't understand. In a mathematical ordering, > > * x > y means the same as y < x > * x <= y means the same as x < y or x = y > * x >= y means the same as x > y or x = y > > and this is irrespective of whether the ordering is partial or total. But for a total ordering 'not y < x' is a more efficient definition of 'x <= y' than 'x < y or x == y'. -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From greg.ewing at canterbury.ac.nz Wed Oct 15 23:25:05 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 16 Oct 2008 10:25:05 +1300 Subject: [Python-ideas] Idea: Lazy ... 
statement In-Reply-To: References: <48F5039C.4050908@gmx.net> Message-ID: <48F65FB1.4050105@canterbury.ac.nz> Jim Jewett wrote: > Part of the problem with setdefault is that calculating the default > can be expensive. There's already a superior replacement for setdefault, i.e. use a defaultdict. -- Greg From george.sakkis at gmail.com Wed Oct 15 23:36:56 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 15 Oct 2008 17:36:56 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> Message-ID: <91ad5bf80810151436j48d91311yc4b59165d64b1153@mail.gmail.com> On Wed, Oct 15, 2008 at 4:50 PM, Arnaud Delobelle wrote: > > On 15 Oct 2008, at 19:35, George Sakkis wrote: > > On Wed, Oct 15, 2008 at 2:11 PM, Guido van Rossum >> wrote: >> >> On Wed, Oct 15, 2008 at 11:03 AM, Terry Reedy wrote: >> > George Sakkis wrote: >> >> >> >> Now that 3.x fixes the arbitrary object comparison wart and drops (?) >> >> __cmp__, >> > >> > An entry for __cmp__ was in the 3.0c1 doc, which confused me. >> > It is now gone in >> > >> http://docs.python.org/dev/3.0/reference/datamodel.html#special-method-names >> > >> > Since rich comparisons are defined on object and are inherited by all >> > classes, it would be difficult to make them not defined. >> >> I should also note that part of George's proposal has already been >> implemented: if you define __eq__, you get a complementary __ne__ for >> free. However it doesn't work the other way around (defining __ne__ >> doesn't give you __eq__ for free), and there is no similar >> relationship for the ordering operators. 
The reason for the freebie is >> that it's *extremely* unlikely to want to define != as something other >> than the complement of == (the only use case is IEEE compliant NaNs); >> however it's pretty common to define non-total orderings (e.g. set >> inclusion). >> >> Partial orderings are certainly used, but I believe they are far less >> common than total ones. Regardless, a partially ordered class has to >> explicitly define the supported methods with the desired semantics anyway; >> the proposed change wouldn't make this any harder. >> > > I don't understand. In a mathematical ordering, > > * x > y means the same as y < x > * x <= y means the same as x < y or x = y > * x >= y means the same as x > y or x = y > > and this is irrespective of whether the ordering is partial or total. > > So, given __eq__ and __lt__, all the rest follows. E.g. > > def __gt__(self, other): > return other.__lt__(self) > etc... > > Where am I going wrong? For total orderings, one can define all methods in terms of self.__eq__ and self.__lt__, i.e. avoid making any assumption about `other`, e.g: def __gt__(self, other): return not (self == other or self < other) Your __gt__ definition (which is also what Michael Foord does in his class decorator) may break mysteriously: class Thing(object): def __init__(self, val): self.val = val @total_ordering class Thing_lt(Thing): def __lt__(self, other): return self.val < other.val >>> t1 = Thing_lt(1) >>> t2 = Thing(2) >>> t1 < t2 True >>> t1 > t2 ... RuntimeError: maximum recursion depth exceeded George -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bmintern at gmail.com Thu Oct 16 00:05:50 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Wed, 15 Oct 2008 18:05:50 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810151436j48d91311yc4b59165d64b1153@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810151135w105fa32dn6404cb2a2ff339ec@mail.gmail.com> <34DB391A-5DAD-4C92-A778-DC36549666FE@googlemail.com> <91ad5bf80810151436j48d91311yc4b59165d64b1153@mail.gmail.com> Message-ID: <4c0fccce0810151505v502919b5sdec72885834dd28c@mail.gmail.com> It seems like a lot of people are missing the idea that you *only* need < (or >, <=, >=, with slightly different variations) for total ordering on two objects of the same class, so here it is: x < y x > y -> y < x x <= y --> not y < x x >= y --> not x < y x == y --> not x < y and not y < x x != y --> x < y or y < x At least, that's what I got from the original post, Brandon From jimjjewett at gmail.com Thu Oct 16 01:41:56 2008 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 15 Oct 2008 19:41:56 -0400 Subject: [Python-ideas] Idea: Lazy ... statement In-Reply-To: <48F65FB1.4050105@canterbury.ac.nz> References: <48F5039C.4050908@gmx.net> <48F65FB1.4050105@canterbury.ac.nz> Message-ID: On Wed, Oct 15, 2008 at 5:25 PM, Greg Ewing wrote: > Jim Jewett wrote: >> Part of the problem with setdefault is that calculating the default >> can be expensive. > There's already a superior replacement for setdefault, > i.e. use a defaultdict. Only if the default can be calculated by the dict itself. If some information from the calling site is needed, there still isn't a good answer. That is a rare case, but does overlap with the cases where calculation actually is expensive. 
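Jim's limitation can be made concrete with a short example (editorial; `expensive` is a hypothetical stand-in for a costly computation):

```python
from collections import defaultdict

def expensive(key):
    # Stand-in for a costly computation that needs call-site
    # information (here, just the key itself).
    return key.upper()

# defaultdict's factory takes no arguments, so it only helps when the
# default needs nothing from the call site:
d = defaultdict(list)
d['a'].append(1)

# setdefault evaluates its default argument eagerly, even when the key
# is already present -- the cost Jim is pointing at:
cache = {'x': 0}
cache.setdefault('x', expensive('x'))   # expensive() runs for nothing

# The usual workaround keeps the computation lazy at the call site:
if 'y' not in cache:
    cache['y'] = expensive('y')
```

A `dict` subclass overriding `__missing__` does at least see the key, but still cannot see arbitrary state at the call site.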
-jJ From tjreedy at udel.edu Thu Oct 16 06:07:01 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 16 Oct 2008 00:07:01 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: Guido van Rossum wrote: > On Wed, Oct 15, 2008 at 12:12 PM, Terry Reedy wrote: >> interpret the above as saying that we have already done as much as is >> sensible (except for changing the docs). > > I think we've done as much as I am comfortable with doing *by default* > (i.e. when inheriting from object). The rest should be provided via > mix-ins. But even those mix-ins should wait until 3.1. After posting and then thinking some more, I came to the same conclusion, that anything more should be by explicit request. Document the imported tool well and the results are the programmer's responsibility. >> There are several possible basis pairs of defined operations. A >> specification must list which one(s) would work. > > There could be several different mix-ins that implement a total > ordering based on different basis pairs. (Or even a single basis > operation -- *in principle* all you need is a '<' operation.) Raymond's idea of an intelligent (adaptive) decorator (or two?) sounds pretty good too. Terry From george.sakkis at gmail.com Thu Oct 16 07:38:35 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 16 Oct 2008 01:38:35 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> Message-ID: <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> On Thu, Oct 16, 2008 at 12:07 AM, Terry Reedy wrote: > Guido van Rossum wrote: > >> On Wed, Oct 15, 2008 at 12:12 PM, Terry Reedy wrote: >> > > interpret the above as saying that we have already done as much as is >>> sensible (except for changing the docs). 
>>> >> >> I think we've done as much as I am comfortable with doing *by default* >> (i.e. when inheriting from object). The rest should be provided via >> mix-ins. But even those mix-ins should wait until 3.1. >> > > After posting and then thinking some more, I came to the same conclusion, > that anything more should be by explicit request. Can you expand on how you reached this conclusion ? For one thing, total orderings are far more common, so that alone is a strong reason to have them by default, even if it made things harder for partial orderings. Even for those though, the current behavior is not necessarily better: class SetlikeThingie(object): def __init__(self, *items): self.items = set(items) def __eq__(self, other): return self.items == other.items def __ne__(self, other): return self.items != other.items def __lt__(self, other): return self!=other and self.items.issubset(other.items) >>> a = SetlikeThingie(1,2,3) >>> b = SetlikeThingie(2,3,4) >>> assert (a From jason.orendorff at gmail.com Thu Oct 16 21:02:12 2008 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 16 Oct 2008 14:02:12 -0500 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <200810151417.31320.dillonco@comcast.net> Message-ID: Terry Reedy wrote: > In 3.0, the actual definitions are the C equivalent of > > class object(): > def __eq__(self,other): return id(self) == id(other) > def __ne__(self,other): return id(self) != id(other) > def __lt__(self,other): return NotImplemented Not exactly. For example, object().__eq__(object()) ==> NotImplemented, not False; and __ne__ calls __eq__, at least sometimes. The best I can do is: def __eq__(self, other): if self is other: return True return NotImplemented def __ne__(self, other): # calls PyObject_RichCompare(self, other, Py_EQ)... eq = (self == other) # ...which is kinda like this if eq is NotImplemented: # is this even possible? 
return NotImplemented return not eq def __lt__(self, other): return NotImplemented This behavior makes sense to me, except for __ne__ calling PyObject_RichCompare, which seems like a bug. If I understand correctly, object.__ne__ should be more like this: def __ne__(self, other): eq = self.__eq__(other) if eq is NotImplemented: return NotImplemented return not eq (When I write something like `self.__eq__(other)`, here and below, what I really mean is something more like what half_richcompare does, avoiding the instance dict and returning NotImplemented if the method is not found.) The current behavior causes __eq__ to be called four times in cases where two seems like enough. Rather confusing. So I think the proposal is to change the other three methods to try using __lt__ and __eq__ in a similar way: def __le__(self, other): # Note: NotImplemented is truthy, so if either of these # returns NotImplemented, __le__ returns NotImplemented. return self.__lt__(other) or self.__eq__(other) def __gt__(self, other): # 'and' isn't quite as convenient for us here as 'or' # was above, so spell out what we want: lt = self.__lt__(other) if lt is NotImplemented: return NotImplemented if lt: return False eq = self.__eq__(other) if eq is NotImplemented: return NotImplemented return not eq def __ge__(self, other): lt = self.__lt__(other) if lt is NotImplemented: return NotImplemented return not lt These methods never call __eq__ without first calling __lt__. That's significant: if __lt__ always returns NotImplemented--the default--then these methods should always return NotImplemented too: we don't want to get a bogus True or False result based on a successful call to __eq__. It would also be nice to stop telling people that: x.__eq__(y) <==> x==y x.__ne__(y) <==> x!=y and so forth, in the docstrings and the language reference, as that's an awfully loose approximation of the truth. 
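Jason's proposed derivations can be collected into a runnable mixin for experimentation (an editorial reconstruction of his sketch; the class name is invented, and plain `self.__lt__(...)` calls stand in for the `half_richcompare`-style lookup he describes):

```python
class OrderingFromLtEq:
    """Derive <=, > and >= from __lt__ and __eq__, propagating
    NotImplemented so a default (unimplemented) __lt__ never yields a
    bogus True/False via a successful __eq__."""

    def __le__(self, other):
        lt = self.__lt__(other)
        if lt is NotImplemented:
            return NotImplemented
        if lt:
            return True
        # May itself be NotImplemented, which we pass through unchanged.
        return self.__eq__(other)

    def __gt__(self, other):
        lt = self.__lt__(other)
        if lt is NotImplemented:
            return NotImplemented
        if lt:
            return False
        eq = self.__eq__(other)
        if eq is NotImplemented:
            return NotImplemented
        return not eq

    def __ge__(self, other):
        lt = self.__lt__(other)
        if lt is NotImplemented:
            return NotImplemented
        return not lt
```

__eq__ is never consulted before __lt__ has succeeded, matching the ordering constraint Jason calls out.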
-j From greg.ewing at canterbury.ac.nz Fri Oct 17 00:25:33 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 17 Oct 2008 11:25:33 +1300 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> Message-ID: <48F7BF5D.8010205@canterbury.ac.nz> It still bothers me that there is no longer a way to provide a single method that performs a three-way comparison. Not only because total ordering is the most common case, but because it makes comparing sequences for ordering very inefficient -- you end up comparing everything twice, once for < and once for =. So I'd like to see this addressed somehow in any scheme to revamp the comparison system. One way would be to add a slot such as __compare__ that works like the old __cmp__ except that it can return four possible results -- less, equal, greater or not-equal. Comparison operations would first look for the corresponding individual method, and then fall back on calling __compare__. There would be a function compare(x, y) that first looks for a __compare__ method, then falls back on trying to find individual methods with which to work out the result. If we were to adopt something like this, it would obviate any need for direct translations between the comparison operations, and in fact any such direct translations might get in the way. So the bottom line is that I think such features should be kept in mixins or decorators for the time being, until this can all be thought through properly. 
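Greg's `__compare__` idea can already be prototyped as a mixin without any language support (editorial sketch; the four result values and all names are assumptions about his proposal, and a builtin version would presumably try the individual methods first):

```python
class ComparableMixin:
    """One four-way comparison method drives all six rich comparisons.

    Subclasses define __compare__ returning 'lt', 'eq', 'gt', or 'ne'
    (the last for unordered/not-equal operands).
    """
    def __compare__(self, other):
        raise NotImplementedError

    def __lt__(self, other): return self.__compare__(other) == 'lt'
    def __gt__(self, other): return self.__compare__(other) == 'gt'
    def __eq__(self, other): return self.__compare__(other) == 'eq'
    def __ne__(self, other): return self.__compare__(other) != 'eq'
    def __le__(self, other): return self.__compare__(other) in ('lt', 'eq')
    def __ge__(self, other): return self.__compare__(other) in ('gt', 'eq')

class Version(ComparableMixin):
    def __init__(self, parts):
        self.parts = parts
    def __compare__(self, other):
        # A single pass over the data decides everything -- Greg's point
        # about not scanning a sequence twice, once for < and once for ==.
        if self.parts == other.parts:
            return 'eq'
        return 'lt' if self.parts < other.parts else 'gt'
```

A module-level `compare(x, y)` falling back on the individual methods, as Greg describes, would sit naturally alongside this.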
-- Greg From dillonco at comcast.net Fri Oct 17 00:54:01 2008 From: dillonco at comcast.net (Dillon Collins) Date: Thu, 16 Oct 2008 18:54:01 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <48F7BF5D.8010205@canterbury.ac.nz> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> <48F7BF5D.8010205@canterbury.ac.nz> Message-ID: <200810161854.01667.dillonco@comcast.net> On Thursday 16 October 2008, Greg Ewing wrote: > It still bothers me that there is no longer a way to > provide a single method that performs a three-way > comparison. Not only because total ordering is the > most common case, but because it makes comparing > sequences for ordering very inefficient -- you end up > comparing everything twice, once for < and once for =. As a note, you can always implement <= as well, thereby reducing the overhead of the 'unimplemented' operations to simply negating their complement. I do this whenever == and < share non-trivial code. From george.sakkis at gmail.com Fri Oct 17 01:14:49 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 16 Oct 2008 19:14:49 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <200810161854.01667.dillonco@comcast.net> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> <48F7BF5D.8010205@canterbury.ac.nz> <200810161854.01667.dillonco@comcast.net> Message-ID: <91ad5bf80810161614g52b970a5u86d9bd887761baab@mail.gmail.com> On Thu, Oct 16, 2008 at 6:54 PM, Dillon Collins wrote: On Thursday 16 October 2008, Greg Ewing wrote: > > It still bothers me that there is no longer a way to > > provide a single method that performs a three-way > > comparison. 
Not only because total ordering is the > > most common case, but because it makes comparing > > sequences for ordering very inefficient -- you end up > > comparing everything twice, once for < and once for =. > > As a note, you can always implement <= as well, thereby reducing the > overhead > of the 'unimplemented' operations to simply negating their complement. I > do > this whenever == and < share non-trivial code. How does this help ? If the result of "<=" is True, you still have to differentiate between < and == to know whether to stop or proceed with the next element in the sequence. George -------------- next part -------------- An HTML attachment was scrubbed... URL: From dillonco at comcast.net Fri Oct 17 01:23:21 2008 From: dillonco at comcast.net (Dillon Collins) Date: Thu, 16 Oct 2008 19:23:21 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810161614g52b970a5u86d9bd887761baab@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <200810161854.01667.dillonco@comcast.net> <91ad5bf80810161614g52b970a5u86d9bd887761baab@mail.gmail.com> Message-ID: <200810161923.21462.dillonco@comcast.net> On Thursday 16 October 2008, George Sakkis wrote: > On Thu, Oct 16, 2008 at 6:54 PM, Dillon Collins > wrote: > > On Thursday 16 October 2008, Greg Ewing wrote: > > > It still bothers me that there is no longer a way to > > > provide a single method that performs a three-way > > > comparison. Not only because total ordering is the > > > most common case, but because it makes comparing > > > sequences for ordering very inefficient -- you end up > > > comparing everything twice, once for < and once for =. > > > > As a note, you can always implement <= as well, thereby reducing the > > overhead > > of the 'unimplemented' operations to simply negating their complement. I > > do > > this whenever == and < share non-trivial code. > > How does this help ? 
If the result of "<=" is True, you still have to > differentiate between < and == to know whether to stop or proceed with the > next element in the sequence. Very true. I spaced and was thinking more towards the general case, rather than list sorting specifically. From tjreedy at udel.edu Fri Oct 17 02:27:48 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 16 Oct 2008 20:27:48 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> Message-ID: George Sakkis wrote: > I think we've done as much as I am comfortable with doing *by > default* > (i.e. when inheriting from object). The rest should be provided via > mix-ins. But even those mix-ins should wait until 3.1. > > > After posting and then thinking some more, I came to the same > conclusion, that anything more should be by explicit request. > > > Can you expand on how you reached this conclusion ? Some related reasons/feelings: 3.0 perhaps completes a major revision of comparisons from default compare to default not compare and from __cmp__ based to 6 __xx__s based. But I suspect the full ramifications of this are yet to be seen. And there may or may not be refinements yet to be made. (Greg's point also.) I am not yet convinced that anything more automatic would be completely safe. It seems like possibly one bit of magic too much. If there are multiple sensible magics, better to let the programmer import and choose. It is hard to take things away once built in.
tjr From tjreedy at udel.edu Fri Oct 17 03:03:48 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 16 Oct 2008 21:03:48 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: Leif Walsh wrote: > On Tue, Oct 14, 2008 at 1:38 PM, Terry Reedy wrote: >> It does not strike me as practical for Python. > > Probably true, but let's see.... You persuaded me to rethink this... >> 1. It starts with copying data. Clients accessing a database server already >> do this. Threads accessing *shared* data normally do not. > > Agreed, but if I want them to, they should (and I should be able to > tell that to python concisely, and have python understand what that > means for concurrency). Suppose shared data consists of mutable collections. Let the program units that share the date follow this protocol (discipline) for editing members of a collection. old_id = id(member) edit_copy = collection.member # or collection[index/key] or whatever edit edit_copy done = False while not done: atomically: if id(edit_copy) == old_id: collection.member = copy # by reference copy, not data copy done = True if done: break recover(edit_copy) 'Atomically' could be by a very short term lock or perhaps a builtin C API function if such can run 'atomically' (I just don't know). I believe this will work if all editors edit at the same level of granularity. Certainly, mixing levels can easily lead to data loss. >> 2. Any edit (change) may be discarded in part or in whole. Human operators, >> informed of the rejection, must (and 'somehow' do) decide what to do. >> Wrapping every assignment to shared data with a pre-programmed >> RejectionException handler would usually be a huge burden on the programmer. > > It would be a huge burden, but perhaps it could be an option for the > especially ambitious programmer. Or the program might ask a human user, which was the use-case for this idea. 
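Terry's editing protocol can be sketched with a `threading.Lock` standing in for the "atomically" block (editorial sketch; all names are illustrative, and the id-based check relies on holding a reference to the old object so its id cannot be reused):

```python
import threading
from copy import deepcopy

_guard = threading.Lock()  # stands in for the 'atomically:' region

class Shared:
    """Holds a mutable collection shared between program units."""
    def __init__(self, value):
        self.value = value

def edit_optimistically(shared, key, edit, recover=None):
    """Copy a member, edit the copy, and publish it by reference only if
    nobody replaced the member in the meantime; otherwise recover/retry."""
    while True:
        current = shared.value[key]       # keep a reference: pins the id
        old_id = id(current)
        edit_copy = deepcopy(current)
        edit(edit_copy)
        with _guard:                      # the very short 'atomic' part
            if id(shared.value[key]) == old_id:
                shared.value[key] = edit_copy   # reference, not data, copy
                return edit_copy
        if recover is not None:
            recover(edit_copy)            # merge/inspect before retrying
```

As in Terry's outline, correctness depends on all editors working at the same level of granularity; mixing levels can silently lose edits.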
From george.sakkis at gmail.com Fri Oct 17 03:59:43 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Thu, 16 Oct 2008 21:59:43 -0400 Subject: [Python-ideas] Automatic total ordering In-Reply-To: References: <91ad5bf80810150757g35979ceel72bd0cd180871f70@mail.gmail.com> <91ad5bf80810152238p253dfa2ele518b445d2202c81@mail.gmail.com> Message-ID: <91ad5bf80810161859w21ac3806hea8ef7b47d5f4d75@mail.gmail.com> On Thu, Oct 16, 2008 at 8:27 PM, Terry Reedy wrote: George Sakkis wrote: > > I think we've done as much as I am comfortable with doing *by >> default* >> (i.e. when inheriting from object). The rest should be provided via >> mix-ins. But even those mix-ins should wait until 3.1. >> >> >> After posting and then thinking some more, I came to the same >> conclusion, that anything more should be by explicit request. >> >> Can you expand on how you reached this conclusion ? >> > > Some related reasons/feelings: > > 3.0 perhaps completes a major revision of comparisons from default compare > to default not compare and from __cmp__ based to 6 __xx__s based. But I > suspect the full ramifications of this are yet to be seen. And there may or > may not be refinements yet to be made. (Greg's point also.) > > I am not yet convinced that anything more automatic would be completely > safe. Neither is the current behavior, and I haven't seen a use case where the proposal makes things less safe than they already are. > It seems like possibly one bit of magic too much. Having total ordering just work by default is too magic? If there are multiple sensible magics, better to let the programmer import > and choose. > > It is hard to take things away once built in. That's what I take from the whole skepticism, and there's nothing wrong with becoming more conservative as the language matures.
Still, it's rather ironic that the main motivation of introducing rich comparisons in the first place was to support Numpy, a 3rd party package which most users will never need, while there is resistance to a feature that will benefit the majority. Let's focus then on putting out an explicit approach first (for 2.7 and 3.1) and make it automatic later if there are no side effects. My prediction is that this will follow a path similar to class decorators, which also didn't make it initially for 2.4 but were added eventually (yay!). George From pelotoescogorciao at yahoo.es Fri Oct 17 20:08:16 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Fri, 17 Oct 2008 18:08:16 +0000 (GMT) Subject: [Python-ideas] VS2005 project improvement Message-ID: <556485.64858.qm@web25803.mail.ukl.yahoo.com> >I see project files for VS6, VC7.1 and VS8 in the directory >http://svn.python.org/projects/python/trunk/PC/ :) But the Solution files (.SLN) in the PC folder are just for VS2008. From pelotoescogorciao at yahoo.es Fri Oct 17 20:09:39 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Fri, 17 Oct 2008 18:09:39 +0000 (GMT) Subject: [Python-ideas] VS2005 project improvement Message-ID: <670014.78659.qm@web25805.mail.ukl.yahoo.com> And btw, building static libraries is a nightmare :P You must add Debug_Static and Release_Static configurations too.
Regístrate ya - http://correo.yahoo.es From pelotoescogorciao at yahoo.es Fri Oct 17 20:12:26 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Fri, 17 Oct 2008 18:12:26 +0000 (GMT) Subject: [Python-ideas] User forums Message-ID: <612083.28503.qm@web25802.mail.ukl.yahoo.com> Why are we using a 90's mailing list? Why not a modern phpBB web forum? __________________________________________________ Correo Yahoo! Espacio para todos tus mensajes, antivirus y antispam ¡gratis! Regístrate ya - http://correo.yahoo.es From clp at rebertia.com Fri Oct 17 20:20:41 2008 From: clp at rebertia.com (Chris Rebert) Date: Fri, 17 Oct 2008 11:20:41 -0700 Subject: [Python-ideas] User forums In-Reply-To: <612083.28503.qm@web25802.mail.ukl.yahoo.com> References: <612083.28503.qm@web25802.mail.ukl.yahoo.com> Message-ID: <47c890dc0810171120i25123bc8hd21a9497dae4f3ad@mail.gmail.com> On Fri, Oct 17, 2008 at 11:12 AM, Victor Martin Ulloa wrote: > Why are we using a 90's mailing list? Why not a modern phpBB web forum? Because GNU Mailman is written in Python whereas phpBB is written in...that other language :P More seriously, mailing lists/newsgroups let everyone follow all the conversations that are going on, whereas in forums there's no way (that I've ever heard of) to "subscribe" to all the topics automatically. If it ain't broke, don't fix it. Cheers, Chris -- Follow the path of the Iguana... http://rebertia.com > > __________________________________________________ > Correo Yahoo! > Espacio para todos tus mensajes, antivirus y antispam ¡gratis!
> Regístrate ya - http://correo.yahoo.es > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From phd at phd.pp.ru Fri Oct 17 20:29:41 2008 From: phd at phd.pp.ru (Oleg Broytmann) Date: Fri, 17 Oct 2008 22:29:41 +0400 Subject: [Python-ideas] User forums In-Reply-To: <612083.28503.qm@web25802.mail.ukl.yahoo.com> References: <612083.28503.qm@web25802.mail.ukl.yahoo.com> Message-ID: <20081017182941.GA3525@phd.pp.ru> On Fri, Oct 17, 2008 at 06:12:26PM +0000, Victor Martin Ulloa wrote: > Why are we using a 90's mailing list? Why not a modern phpBB web forum? Sometimes... very often, actually... old things are better than new things. The advantages of a mailing list over a web forum:
-- a mailing list uses "push technology" - new messages arrive in my mailbox, and I can read them or archive them or do what I want; in a web forum I have to hunt for new messages, and don't have a way to archive selected messages;
-- I can read my mailbox with whatever program I like; for a web forum full of broken HTML, crippled CSS and stupid Javascript I have to use one of those bloated web browsers;
-- most mailing lists these days are processed by The Python Mailing List Manager (mailman); using PHP-based software for a Python-related web forum would give users the wrong signal.
Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From lists at cheimes.de Fri Oct 17 20:47:30 2008 From: lists at cheimes.de (Christian Heimes) Date: Fri, 17 Oct 2008 20:47:30 +0200 Subject: [Python-ideas] VS2005 project improvement In-Reply-To: <556485.64858.qm@web25803.mail.ukl.yahoo.com> References: <556485.64858.qm@web25803.mail.ukl.yahoo.com> Message-ID: Victor Martin Ulloa wrote: > But the Solution files (.SLN) in the PC folder are just for VS2008. A patch is welcome - for the other issue as well.
It's most probably sufficient to change the first line of the solution file. Christian From nad at acm.org Fri Oct 17 21:00:26 2008 From: nad at acm.org (Ned Deily) Date: Fri, 17 Oct 2008 12:00:26 -0700 Subject: [Python-ideas] User forums References: <612083.28503.qm@web25802.mail.ukl.yahoo.com> <20081017182941.GA3525@phd.pp.ru> Message-ID: In article <20081017182941.GA3525 at phd.pp.ru>, Oleg Broytmann wrote:
> On Fri, Oct 17, 2008 at 06:12:26PM +0000, Victor Martin Ulloa wrote:
>> Why are we using a 90's mailing list? Why not a modern phpBB web forum?
> Sometimes... very often, actually... old things are better than new things.
> The advantages of a mailing list over a web forum:
> -- a mailing list uses "push technology" - new messages arrive in my mailbox, and I can read them or archive them or do what I want; in a web forum I have to hunt for new messages, and don't have a way to archive selected messages;
> -- I can read my mailbox with whatever program I like; for a web forum full of broken HTML, crippled CSS and stupid Javascript I have to use one of those bloated web browsers;
> -- most mailing lists these days are processed by The Python Mailing List Manager (mailman); using PHP-based software for a Python-related web forum would give users the wrong signal.

... and there already exists a web forum interface (plus an NNTP newsreader interface and an RSS interface): -- Ned Deily, nad at acm.org From suraj at barkale.com Fri Oct 17 22:51:22 2008 From: suraj at barkale.com (Suraj Barkale) Date: Fri, 17 Oct 2008 20:51:22 +0000 (UTC) Subject: [Python-ideas] User forums References: <612083.28503.qm@web25802.mail.ukl.yahoo.com> Message-ID: Victor Martin Ulloa writes: > > Why are we using a 90's mailing list? Why not a modern phpBB web forum? > Most people pay more attention while replying to an email than replying to a forum post :) At least I don't see many emails containing just "LOL, me too".
On a serious note, as others have pointed out, email acts as a data exchange format so you can use any interface you like, e.g. for a forum-like interface you can look at http://groups.google.com/group/python-ideas/topics From leif.walsh at gmail.com Sat Oct 18 07:28:28 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 18 Oct 2008 01:28:28 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: On Thu, Oct 16, 2008 at 9:03 PM, Terry Reedy wrote:
> Suppose shared data consists of mutable collections. Let the program units that share the data follow this protocol (discipline) for editing members of a collection.
>
>     old_id = id(member)
>     edit_copy = collection.member # or collection[index/key] or whatever
>     edit edit_copy
>     done = False
>     while not done:
>         atomically:
>             if id(edit_copy) == old_id:
>                 collection.member = copy # by reference copy, not data copy
>                 done = True
>         if done: break
>         recover(edit_copy)
>
> 'Atomically' could be by a very short term lock or perhaps a builtin C API function if such can run 'atomically' (I just don't know).

It feels like, if a new syntax element 'atomically' is added, it would need to know what variable or collection is meant to be atomic. This poses a problem for objects with other references elsewhere, but I think (and I don't know the internal structures of python very well, so this may be way off the mark), the lock could be applied at the data level, not the pointer level, to alleviate this issue.

> I believe this will work if all editors edit at the same level of granularity. Certainly, mixing levels can easily lead to data loss.

Again, I think we can fix this by applying the lock to the data. For example (since we talked about this in a class I had today), think about the vfs in linux: Say we have one file with many hardlinks.
If we apply a lock to the dentry for the file, and then the user tries to modify our file from a different dentry, the lock won't function. If we apply the lock to the inode, however, it locks it against changes from all hardlinks. Hopefully python has a similar opportunity. > Or the program might ask a human user, which was the use-case for this idea. I don't think I understood the use case. Can you explain it? -- Cheers, Leif From leif.walsh at gmail.com Sat Oct 18 07:30:22 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 18 Oct 2008 01:30:22 -0400 Subject: [Python-ideas] User forums In-Reply-To: <612083.28503.qm@web25802.mail.ukl.yahoo.com> References: <612083.28503.qm@web25802.mail.ukl.yahoo.com> Message-ID: On Fri, Oct 17, 2008 at 2:12 PM, Victor Martin Ulloa wrote: > Why are we using a 90's mailing list? Why not a modern phpBB web forum? I like email that I can read with whatever program I want, and of which you can view the archives with whatever program you want. -- Cheers, Leif From mnordhoff at mattnordhoff.com Sat Oct 18 11:31:49 2008 From: mnordhoff at mattnordhoff.com (Matt Nordhoff) Date: Sat, 18 Oct 2008 09:31:49 +0000 Subject: [Python-ideas] Idea: Lazy import statement In-Reply-To: <48F3030A.6050107@canterbury.ac.nz> References: <48F3030A.6050107@canterbury.ac.nz> Message-ID: <48F9AD05.6080400@mattnordhoff.com> Greg Ewing wrote: > Problem: You have a package containing a large number of > classes, of which only a few are typically used by any > given application. > > If you put each class into its own submodule, then > client code is required to use a lot of tedious > 'from foo.thingy import Thingy' statements to import > the classes it wants to use. This also makes all the > submodule names part of the API and makes it hard to > rearrange the packaging without breaking code. > > If you try to flatten the namespace by importing all > the classes into the top level module, you end up > importing everything even if it won't be used. 
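Greg's flattened-namespace problem can in fact be approximated today with a proxy module. The class name LazyModule below is invented for illustration; modern CPython offers the same effect officially via importlib.util.LazyLoader (and, in 3.7+, module-level __getattr__ per PEP 562):

```python
import importlib
import types

class LazyModule(types.ModuleType):
    """Proxy that defers the real import until first attribute access.
    Invented for illustration; not a stdlib class."""
    def __init__(self, name):
        super().__init__(name)
        self._real = None

    def __getattr__(self, attr):
        # Only called for attributes not found normally, so this is
        # where the deferred import finally happens.
        if self._real is None:
            self._real = importlib.import_module(self.__name__)
        return getattr(self._real, attr)

json = LazyModule('json')      # nothing imported yet
print(json.dumps({'a': 1}))    # first attribute access triggers the import
```

Static analysis tools cannot see through such a proxy, of course, which is exactly the motivation for a dedicated syntax.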
>
> What's needed is a way of lazily importing them, so that the import won't actually happen unless the imported names are referenced.
>
> It's possible to hack something like that up now, but then tools such as py2app and py2exe, that try to find modules by statically examining the source looking for import statements, won't be able to accurately determine which modules are used. At best they'll think the whole package is used and incorporate all of it; at worst they'll miss it altogether.
>
> So I think it would be good to have a dedicated syntax for lazy imports, so the top-level foo package can say something like
>
>     from foo.thing lazily import Thing
>     from foo.stuff lazily import Stuff
>     ...
>
> Executing a lazy import statement adds an entry to a list of deferred imports attached to the module. Then, the first time the imported name is referenced, the import is performed and the name becomes an ordinary attribute thereafter.
>
> If py2exe et al are taught about lazy imports, they will then be able to determine exactly which submodules are used by an application and exclude the rest.

FWIW, PyPy can lazily compute objects: I don't know how easy it would be to tie that into imports, but I'm sure it would be possible. Coming from the Bazaar and Mercurial communities, I'd like lazy imports too, but I doubt it'll ever happen because it's just a strange performance hack. -- From tjreedy at udel.edu Sat Oct 18 20:00:58 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 18 Oct 2008 14:00:58 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: Leif Walsh wrote:
> For example (since we talked about this in a class I had today), think about the vfs in linux:

Out of my knowledge range.

>> Or the program might ask a human user, which was the use-case for this idea.
>
> I don't think I understood the use case. Can you explain it?
Near the bottom of the original article this thread is based on. Casual users requesting html forms, maybe submitting them, maybe not, with low collision rate, and ability to adjust to collisions. tjr From leif.walsh at gmail.com Sat Oct 18 20:50:40 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 18 Oct 2008 14:50:40 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: On Sat, Oct 18, 2008 at 2:00 PM, Terry Reedy wrote: > Out of my knowledge range. Well, don't worry, it's mostly still out of mine as well. Thinking about the problem more, I can't come up with a reasonable way to do the locking needed for that last bit of conflict resolution. Maybe someone else can. > Near the bottom of the original article this thread is based on. > Casual users requesting html forms, maybe submitting them, maybe not, with > low collision rate, and ability to adjust to collisions. I see that now, but wasn't the original post about removing the GIL? That seems to imply that the users would be different threads in a program, with high speed and possibly high collision rate. If we are talking about users communicating over http, this seems like something you'd write a program in python to do (like wikipedia says Rails does), and it doesn't seem to merit discussion in the development of python itself. -- Cheers, Leif From tjreedy at udel.edu Sun Oct 19 00:23:35 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 18 Oct 2008 18:23:35 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: Leif Walsh wrote: > I see that now, but wasn't the original post about removing the GIL? > That seems to imply that the users would be different threads in a > program, with high speed and possibly high collision rate. When I said 'impractical', I was thinking about this sort of situation.
'Optimistic Concurrency' is optimistic about not colliding ;-). Else the recovery overhead makes it worse than locking. > If we are talking about users communicating over http, this seems like > something you'd write a program in python to do (like wikipedia says > Rails does), and it doesn't seem to merit discussion in the > development of python itself. Servers have several threads, each communicating with and representing to the rest of the system one user. When those several threads all edit shared data, should they have to acquire GIL, local locks, or none at all? The point of the article is that 'no lock' is desirable when many locks would be left to expire unused and often also possible. tjr From leif.walsh at gmail.com Sun Oct 19 07:36:50 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Sun, 19 Oct 2008 01:36:50 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: On Sat, Oct 18, 2008 at 6:23 PM, Terry Reedy wrote: > When I said 'impractical', I was thinking about this sort of situation. > 'Optimistic Concurrency' is optimistic about not colliding ;-). > Else the recovery overhead makes it worse than locking. I see now. >> If we are talking about users communicating over http, this seems like >> something you'd write a program in python to do (like wikipedia says >> Rails does), and it doesn't seem to merit discussion in the >> development of python itself. > > Servers have several threads, each communicating with and representing to > the rest of the system one user. When those several threads all edit shared > data, should they have to acquire GIL, local locks, or none at all? The > point of the article is that 'no lock' is desirable when many locks would be > left to expire unused and often also possible. I'm still not convinced that this should be an implementation concern, rather than a language concern, but that's probably okay. 
-- Cheers, Leif From pelotoescogorciao at yahoo.es Sun Oct 19 22:43:26 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Sun, 19 Oct 2008 20:43:26 +0000 (GMT) Subject: [Python-ideas] VS2005 project improvement Message-ID: <89547.23361.qm@web25808.mail.ukl.yahoo.com> I see this code in the python3.0rc1 make_buildinfo project for VS2005:

    int main(int argc, char*argv[])
    {
        char command[500] = "cl.exe -c -D_WIN32 -DUSE_DL_EXPORT -D_WINDOWS -DWIN32 -D_WINDLL ";
        int do_unlink, result;
        if (argc != 2) {
            fprintf(stderr, "make_buildinfo $(ConfigurationName)\n");
            return EXIT_FAILURE;
        }
        if (strcmp(argv[1], "Release") == 0) {
            strcat_s(command, CMD_SIZE, "-MD ");
        }
        else if (strcmp(argv[1], "Debug") == 0) {
            strcat_s(command, CMD_SIZE, "-D_DEBUG -MDd ");
        }
        else if (strcmp(argv[1], "ReleaseItanium") == 0) {
            strcat_s(command, CMD_SIZE, "-MD /USECL:MS_ITANIUM ");
        }
        else if (strcmp(argv[1], "ReleaseAMD64") == 0) {
            strcat_s(command, CMD_SIZE, "-MD ");
            strcat_s(command, CMD_SIZE, "-MD /USECL:MS_OPTERON ");
        }
        else {
            fprintf(stderr, "unsupported configuration %s\n", argv[1]);
            return EXIT_FAILURE;
        }

I don't think hardcoding the configurations is a good idea. I went crazy trying to set up a static library (or a custom configuration) and it was difficult to figure out things like that... Please fix it, that's not a good way to do it. __________________________________________________ Correo Yahoo! Espacio para todos tus mensajes, antivirus y antispam ¡gratis!
Regístrate ya - http://correo.yahoo.es From collinw at gmail.com Sun Oct 19 23:25:31 2008 From: collinw at gmail.com (Collin Winter) Date: Sun, 19 Oct 2008 14:25:31 -0700 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18676.50712.900520.708841@montanaro-dyndns-org.local> References: <18676.50712.900520.708841@montanaro-dyndns-org.local> Message-ID: <43aa6ff70810191425t46a6315es51f7c75cf5b827e2@mail.gmail.com> On Tue, Oct 14, 2008 at 9:17 AM, wrote:
> Is optimistic concurrency
>
> http://en.wikipedia.org/wiki/Optimistic_concurrency_control
>
> a possible option for removing the GIL in Python?

Not really. Automatically retrying generic operations when there's a conflict more-or-less requires a pure functional language, since you want to absolutely minimize side-effects when you're replaying the failed operation. See the implementation of STM in Haskell's GHC compiler for more. Collin Winter From pelotoescogorciao at yahoo.es Mon Oct 20 00:28:10 2008 From: pelotoescogorciao at yahoo.es (Victor Martin Ulloa) Date: Sun, 19 Oct 2008 22:28:10 +0000 (GMT) Subject: [Python-ideas] Python's core needs to be more lightweight Message-ID: <617361.7381.qm@web25805.mail.ukl.yahoo.com> I noticed a minimalistic pythoncore compiled with VS2005 (eliminating COMDAT and unused references, optimized for size) exceeds 2 MB of .LIB (statically linked). I think that's a lot considering that Lua uses less than 100 KB. For small game consoles and devices with very small RAM and storage it would be very good to reduce this size. thx __________________________________________________ Correo Yahoo! Espacio para todos tus mensajes, antivirus y antispam ¡gratis!
Regístrate ya - http://correo.yahoo.es From skip at pobox.com Mon Oct 20 04:05:45 2008 From: skip at pobox.com (skip at pobox.com) Date: Sun, 19 Oct 2008 21:05:45 -0500 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: Message-ID: <18683.59257.716279.687450@montanaro-dyndns-org.local>

    Leif> I see that now, but wasn't the original post about removing the
    Leif> GIL? That seems to imply that the users would be different
    Leif> threads in a program, with high speed and possibly high collision
    Leif> rate.

Yes, it was. (I'm the OP.) I'm curious about the assertion of "possibly high collision rate". Do you have measurements to support that? If the collision rate is high enough you'd be replaying operations all the time. OTOH, if the actual collision rate is quite low the replay operation, even if it is onerous, can be tolerated. Someone else mentioned the problem with side effects in non-functional languages. That would indeed seem to be a problem with C (the level I see this operating at). I have no desire to add further load to the Python programmer. Programming with multiple threads is already hard enough. Clearly the GIL is too coarse a level of locking because it eliminates all possible parallelism.
Is it possible that some finer grained locking (but still coarser than complete free threading) can give you back some of the possible parallelism? Maybe require all mutable types to support a GIL of their own? Skip From santagada at gmail.com Mon Oct 20 05:16:48 2008 From: santagada at gmail.com (Leonardo Santagada) Date: Mon, 20 Oct 2008 01:16:48 -0200 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18683.59257.716279.687450@montanaro-dyndns-org.local> References: <18683.59257.716279.687450@montanaro-dyndns-org.local> Message-ID: <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> On Oct 20, 2008, at 12:05 AM, skip at pobox.com wrote:
> Clearly the GIL is too coarse a level of locking because it eliminates all possible parallelism. Is it possible that some finer grained locking (but still coarser than complete free threading) can give you back some of the possible parallelism? Maybe require all mutable types to support a GIL of their own?

This is the idea that they are planning to do in PyPy. Another idea that may have a chance of working is implementing Software Transactional Memory on the interpreter level and using it to implement threading... and maybe have an application-level STM (that is, one able to be used inside a python program) to make programs more easily parallel (so no need to use locking, semaphores and monitors). I think that trying to do all that in CPython is nearly impossible, as removing refcounting, moving to finer grained locking and testing it all in the current CPython would be a lot of work and take a lot of time, and then you'd have to resync with the modifications that are bound to happen during that time in CPython. Also keeping compatibility with the CPython external API would be very hard (but doable, there are even some people trying to do it for running numpy on ironpython). Just my 0,02 cents []'s -- Leonardo Santagada santagada at gmail.com From santagada at gmail.com Mon Oct 20 05:20:17 2008 From: santagada at gmail.com (Leonardo Santagada) Date: Mon, 20 Oct 2008 01:20:17 -0200 Subject: [Python-ideas] Python's core needs to be more lightweight In-Reply-To: <617361.7381.qm@web25805.mail.ukl.yahoo.com> References: <617361.7381.qm@web25805.mail.ukl.yahoo.com> Message-ID: On Oct 19, 2008, at 8:28 PM, Victor Martin Ulloa wrote:
> I noticed a minimalistic pythoncore compiled with VS2005 (eliminating COMDAT and unused references, optimized for size) exceeds 2 MB of .LIB (statically linked). I think that's a lot considering that Lua uses less than 100 KB.
>
> For small game consoles and devices with very small RAM and storage it would be very good to reduce this size.
Just the unicode tables are larger than 100 KB (in pypy the number seems to be 300 KB, dunno about CPython). There was recently a lot of work in pypy to discover what it is that takes up space in each part of the pypy executable, so maybe you could adapt most of those ideas to discover what is in each part of those 2 MB. Good Luck, []'s -- Leonardo Santagada santagada at gmail.com From bborcic at gmail.com Mon Oct 20 15:25:29 2008 From: bborcic at gmail.com (Boris Borcic) Date: Mon, 20 Oct 2008 15:25:29 +0200 Subject: [Python-ideas] idea: Template.match function In-Reply-To: <7517F8D5-B172-4FAE-9D8E-4C930BD057D8@strout.net> References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> <5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com> <7517F8D5-B172-4FAE-9D8E-4C930BD057D8@strout.net> Message-ID: Joe Strout wrote:
> The point is, we already have a very pretty Template class that does this operation in one direction; it ought to do it in the other direction too. The fact that it doesn't is surprising to a newbie

BTW, a similar comment goes for delim.join() vs delim.split() Cheers, BB From leif.walsh at gmail.com Mon Oct 20 16:00:57 2008 From: leif.walsh at gmail.com (Leif Walsh) Date: Mon, 20 Oct 2008 10:00:57 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18683.59257.716279.687450@montanaro-dyndns-org.local> References: <18683.59257.716279.687450@montanaro-dyndns-org.local> Message-ID: On Sun, Oct 19, 2008 at 10:05 PM, wrote:
> I'm curious about the assertion of "possibly high collision rate". Do you have measurements to support that?

Nope. :-)

> If the collision rate is high enough you'd be replaying operations all the time. OTOH, if the actual collision rate is quite low the replay operation, even if it is onerous, can be tolerated.

This is why I suggested that there should be some kind of mode switch available, so I can decide when it's worth it to switch.
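The read-replay cycle under discussion can be sketched at the application level. Everything below (VersionedCell, try_commit) is invented for illustration, and a short critical section stands in for whatever atomic commit primitive the interpreter would provide:

```python
import threading

class VersionedCell:
    """One shared slot with a version counter; a commit only succeeds if
    the version is unchanged since the value was read. Invented purely
    to illustrate the optimistic protocol, not a real API."""

    def __init__(self, value):
        self._lock = threading.Lock()  # held only for the brief commit
        self._value = value
        self._version = 0

    def read(self):
        with self._lock:
            return self._value, self._version

    def try_commit(self, new_value, expected_version):
        with self._lock:
            if self._version != expected_version:
                return False  # someone else won the race: caller replays
            self._value = new_value
            self._version += 1
            return True

def update(cell, edit):
    """Optimistic update: read, edit a copy, retry until the commit lands."""
    while True:
        value, version = cell.read()
        if cell.try_commit(edit(value), version):
            return

cell = VersionedCell(0)
update(cell, lambda v: v + 1)
```

If the collision rate is low, the loop almost never spins; if it is high, the replay cost dominates, which is exactly the trade-off being argued here.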
> Someone else mentioned the problem with side effects in > non-functional languages. That would indeed seem to be a problem with C > (the level I see this operating at.) I have no desire to add further load > to the Python programmer. Programming with multiple threads is already hard > enough. Yeah, side-effects make this nasty. -- Cheers, Leif From skip at pobox.com Mon Oct 20 16:31:02 2008 From: skip at pobox.com (skip at pobox.com) Date: Mon, 20 Oct 2008 09:31:02 -0500 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> References: <18683.59257.716279.687450@montanaro-dyndns-org.local> <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> Message-ID: <18684.38438.946009.647664@montanaro-dyndns-org.local> Leonardo> Another idea that maybe have a chance of working is Leonardo> implementing Software Transactional Memory on the interpreter Leonardo> level and use it to implement threading... That's the term I was looking for, but couldn't remember it. I saw an article on STM a couple months ago and thought that might work for Python. Aren't "optimistic concurrency" and "software transactional memory" very similar concepts? Skip From tjreedy at udel.edu Mon Oct 20 21:46:12 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 20 Oct 2008 15:46:12 -0400 Subject: [Python-ideas] idea: Template.match function In-Reply-To: References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net> <18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net> <5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com> <7517F8D5-B172-4FAE-9D8E-4C930BD057D8@strout.net> Message-ID: Boris Borcic wrote: > Joe Strout wrote: >> >> The point is, we already have a very pretty Template class that does >> this operation in one direction; it ought to do it in the other >> direction too. The fact that it doesn't is surprising to a newbie > > BTW, a similar comment goes for delim.join() vs delim.split() A phrase like 'similar comment' is sometimes hard to expand. 
Are you saying that in 3.x .split should produce an iterator instead of a list? Or that ''.split(s) should return list(s) instead of [''] as now (in 3.0 at least). From rhamph at gmail.com Mon Oct 20 22:12:12 2008 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 20 Oct 2008 14:12:12 -0600 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <18684.38438.946009.647664@montanaro-dyndns-org.local> References: <18683.59257.716279.687450@montanaro-dyndns-org.local> <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> <18684.38438.946009.647664@montanaro-dyndns-org.local> Message-ID: On Mon, Oct 20, 2008 at 8:31 AM, wrote:
> Leonardo> Another idea that maybe have a chance of working is
> Leonardo> implementing Software Transactional Memory on the interpreter
> Leonardo> level and use it to implement threading...
>
> That's the term I was looking for, but couldn't remember it. I saw an article on STM a couple months ago and thought that might work for Python. Aren't "optimistic concurrency" and "software transactional memory" very similar concepts?

They're essentially the same. Optimistic concurrency is one of the tools used by STM. To counter Terry's example, I believe it should look like this:

    transaction:
        edit collection.member  # or collection[index/key] or whatever

IOW, the collection container should automatically create a view when you touch it, adding it to the current transaction. When the transaction block completes, the language attempts to atomically commit all the views, restarting if there's a conflict. This has substantial problems if you mix a very long transaction with many short ones. Various ways of prioritizing restarted transactions are possible, at the risk of locking everybody out if the long transaction takes too long (or never completes). Also note that this doesn't require optimistic concurrency. It's just as valid to raise an exception when editing if another commit has invalidated your transaction.
In fact, since views are added lazily, this is more-or-less required. Stepping back a bit, there are two distinct problems to solve:
1) Making threading easier
2) Removing the GIL
#1 can be done by things like monitors and transactions (the two are complementary). #2 at its core is about refcounting, and transactions are irrelevant there - you need to switch to a tracing-based GC implementation. -- Adam Olsen, aka Rhamphoryncus From santagada at gmail.com Mon Oct 20 22:33:11 2008 From: santagada at gmail.com (Leonardo Santagada) Date: Mon, 20 Oct 2008 18:33:11 -0200 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18683.59257.716279.687450@montanaro-dyndns-org.local> <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> <18684.38438.946009.647664@montanaro-dyndns-org.local> Message-ID: <774A2CFA-9E7F-4EDA-863E-A306B89302AD@gmail.com> On Oct 20, 2008, at 6:12 PM, Adam Olsen wrote:
> Stepping back a bit, there are two distinct problems to solve:
> 1) Making threading easier
> 2) Removing the GIL
>
> #1 can be done by things like monitors and transactions (the two are complementary). #2 at its core is about refcounting, and transactions are irrelevant there - you need to switch to a tracing-based GC implementation

PyPy has a real GC and doesn't depend on refcounting already. So that is not the only thing needed to accomplish #2; you need to lock mutable data containers so concurrent threads don't leave them in a broken state. This is the part not ready on PyPy yet. But it is probably easier to fix on PyPy than on CPython.
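Adam's view-and-commit scheme can be sketched very loosely in pure Python - real STM needs interpreter support - with explicit read/write callbacks. The names TVar and atomically are borrowed from Haskell's STM purely for illustration:

```python
import threading

_commit_lock = threading.Lock()  # stands in for the interpreter's atomic commit

class TVar:
    """A transactional variable (name borrowed from Haskell's STM)."""
    def __init__(self, value):
        self.value = value
        self.version = 0

def atomically(transaction):
    """Run transaction(read, write); replay it if any variable it read
    was committed by someone else in the meantime."""
    while True:
        reads, writes = {}, {}

        def read(var):
            if var not in reads:
                reads[var] = var.version  # record the version we saw
            return writes.get(var, var.value)

        def write(var, value):
            read(var)                     # a write is also a read
            writes[var] = value

        result = transaction(read, write)
        with _commit_lock:                # try to commit every view at once
            if all(var.version == seen for var, seen in reads.items()):
                for var, value in writes.items():
                    var.value = value
                    var.version += 1
                return result
        # conflict: some TVar changed underneath us, so replay

a, b = TVar(10), TVar(0)

def transfer(read, write):
    write(a, read(a) - 3)
    write(b, read(b) + 3)

atomically(transfer)  # a.value == 7, b.value == 3 afterwards
```

The long-transaction starvation problem Adam mentions shows up here directly: a transaction that keeps losing the version check replays forever.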
[]'s -- Leonardo Santagada santagada at gmail.com From rhamph at gmail.com Mon Oct 20 22:58:25 2008 From: rhamph at gmail.com (Adam Olsen) Date: Mon, 20 Oct 2008 14:58:25 -0600 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: <774A2CFA-9E7F-4EDA-863E-A306B89302AD@gmail.com> References: <18683.59257.716279.687450@montanaro-dyndns-org.local> <9A3F34F2-F457-4A26-8185-B0EC667A4814@gmail.com> <18684.38438.946009.647664@montanaro-dyndns-org.local> <774A2CFA-9E7F-4EDA-863E-A306B89302AD@gmail.com> Message-ID: On Mon, Oct 20, 2008 at 2:33 PM, Leonardo Santagada wrote:
> On Oct 20, 2008, at 6:12 PM, Adam Olsen wrote:
>> Stepping back a bit, there's two distinct problems to solve:
>> 1) Making threading easier
>> 2) Removing the GIL
>>
>> #1 can be done by things like monitors and transactions (the two are complementary). #2 at its core is about refcounting, and transactions are irrelevant there - you need to switch to a tracing-based GC implementation
>
> PyPy has a real GC and doesn't depend on refcount already. So that is not the only thing needed to accomplish #2, you need to lock mutable data containers so concurrent threads don't leave them in a broken state. This is the part not ready on PyPy yet.
>
> But it is probably easier to fix on PyPy than on CPython.

Although #2 does depend on it, providing sane and safe semantics for the datastructures is the domain of #1. Keep in mind that very little in python actually needs implicit concurrent mutation (class and module dicts do). Implicit thread interaction is the fundamental problem of the current threading models, for all the reasons spelled out in the Zen.
-- Adam Olsen, aka Rhamphoryncus From dillonco at comcast.net Tue Oct 21 04:11:14 2008 From: dillonco at comcast.net (Dillon Collins) Date: Mon, 20 Oct 2008 22:11:14 -0400 Subject: [Python-ideas] Optimistic Concurrency In-Reply-To: References: <18684.38438.946009.647664@montanaro-dyndns-org.local> Message-ID: <200810202211.14185.dillonco@comcast.net> On Monday 20 October 2008, Adam Olsen wrote:
> Stepping back a bit, there's two distinct problems to solve:
> 1) Making threading easier
> 2) Removing the GIL

As I've not seen it mentioned in this thread, I feel that it's worth pointing out that 2.6's new multiprocessing module allows for programs to use more than one CPU, thus sidestepping the GIL: http://docs.python.org/dev/2.6/library/multiprocessing.html In light of this, I doubt we'll see the GIL removed from CPython pretty much ever. The GIL does have the benefit of speeding up CPython fairly significantly by removing lock overhead, and it simplifies the code. It can also be released by C libs for things that will take a long time (which are often best in C for that reason anyway). So, while removing the GIL is fun from an intellectual perspective, its practical value is now somewhat limited. Sure, multiprocessing isn't perfect, but it is the first thing to bypass the GIL that's been accepted ;).
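A minimal sketch of the multiprocessing route, written against today's API; the prime-counting workload is invented for illustration:

```python
from multiprocessing import Pool

def count_primes(bounds):
    """CPU-bound work that threads could not run in parallel under the GIL."""
    lo, hi = bounds
    return sum(1 for n in range(max(lo, 2), hi)
               if all(n % d for d in range(2, int(n ** 0.5) + 1)))

if __name__ == '__main__':  # required on platforms that spawn workers
    chunks = [(i, i + 25000) for i in range(0, 100000, 25000)]
    with Pool() as pool:    # one worker process per CPU by default
        total = sum(pool.map(count_primes, chunks))
    print(total)            # 9592 primes below 100000
```

Each chunk is pickled to a worker process, so this pays off only when the compute per chunk dwarfs the serialization cost.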
From rhamph at gmail.com  Tue Oct 21 04:43:14 2008
From: rhamph at gmail.com (Adam Olsen)
Date: Mon, 20 Oct 2008 20:43:14 -0600
Subject: [Python-ideas] Optimistic Concurrency
In-Reply-To: <200810202211.14185.dillonco@comcast.net>
References: <18684.38438.946009.647664@montanaro-dyndns-org.local>
	<200810202211.14185.dillonco@comcast.net>
Message-ID:

On Mon, Oct 20, 2008 at 8:11 PM, Dillon Collins wrote:
> On Monday 20 October 2008, Adam Olsen wrote:
>> Stepping back a bit, there are two distinct problems to solve:
>> 1) Making threading easier
>> 2) Removing the GIL
>
> As I've not seen it mentioned in this thread, I feel that it's worth
> pointing out that 2.6's new multiprocessing module allows programs to
> use more than one CPU, thus sidestepping the GIL:
> http://docs.python.org/dev/2.6/library/multiprocessing.html
>
> In light of this I doubt we'll see the GIL removed from CPython pretty
> much ever. The GIL does have the benefit of speeding up CPython fairly
> significantly by removing lock overhead, and it simplifies the code. It
> can also be released by C libs for things that will take a long time
> (which are often best in C for that reason anyway).
>
> So, while removing the GIL is fun from an intellectual perspective, its
> practical value is now somewhat limited. Sure, multiprocessing isn't
> perfect, but it is the first thing to bypass the GIL that's been
> accepted ;).

The GIL won't be removed from CPython, but that's because doing it
effectively would be a near-total rewrite.

multiprocessing is useful when there's a limited amount of shared data.
Problems that involve large amounts of shared data will still require
threading. You can gain scalability by moving the logic into C, but as
the number of CPUs increases so does the amount that must be in C.
Eventually you reach a point where using Python is counterproductive.
As far as I'm concerned we can already solve this: safethread defines
the semantics (#1) and other implementations like PyPy can handle the
scalability (#2). What we lack is a significant number of people that
*need* this solved.

--
Adam Olsen, aka Rhamphoryncus

From skip at pobox.com  Tue Oct 21 20:54:10 2008
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 21 Oct 2008 13:54:10 -0500
Subject: [Python-ideas] Optimistic Concurrency
In-Reply-To:
References:
Message-ID: <18686.9554.67087.428408@montanaro-dyndns-org.local>

    Adam> What we lack is a significant number of people that *need* this
    Adam> solved.

Most people *think* they need this solved though. ;-)

Skip

From bborcic at gmail.com  Wed Oct 22 12:45:09 2008
From: bborcic at gmail.com (Boris Borcic)
Date: Wed, 22 Oct 2008 12:45:09 +0200
Subject: [Python-ideas] idea: Template.match function
In-Reply-To:
References: <9E08028E-B227-4730-A9FC-D9F416DF9F9D@strout.net>
	<18DB2ED7-142B-4CA5-8DF6-5278EE98742C@strout.net>
	<5936F297-CF5D-46C6-A181-FB9E29C307B1@gmail.com>
	<7517F8D5-B172-4FAE-9D8E-4C930BD057D8@strout.net>
Message-ID:

Terry Reedy wrote:
> Boris Borcic wrote:
>> Joe Strout wrote:
>>>
>>> The point is, we already have a very pretty Template class that does
>>> this operation in one direction; it ought to do it in the other
>>> direction too. The fact that it doesn't is surprising to a newbie
>>
>> BTW, a similar comment goes for delim.join() vs delim.split()
>
> A phrase like 'similar comment' is sometimes hard to expand.
> Are you saying that in 3.x .split should produce an iterator instead of
> a list? Or that ''.split(s) should return list(s) instead of [''] as
> now (in 3.0 at least).

The latter, e.g. sep.join(sep.split(s)) == s. But somewhat
tongue-in-cheek.

More generally, I guess what I am saying is that sequence-of-chars <-->
string conversion is a particularly sore spot when someone tries to
think/learn about the operations in Python in a structuralist or
"mathematical" manner.
There are three quite distinct manners to infer an operation that
*should* convert back list(s) to s, but none work.

Cheers, BB

From alexandre at peadrop.com  Thu Oct 23 04:07:24 2008
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 22 Oct 2008 22:07:24 -0400
Subject: [Python-ideas] Idea: Lazy import statement
In-Reply-To: <48F3030A.6050107@canterbury.ac.nz>
References: <48F3030A.6050107@canterbury.ac.nz>
Message-ID:

On Mon, Oct 13, 2008 at 4:12 AM, Greg Ewing wrote:
> What's needed is a way of lazily importing them, so
> that the import won't actually happen unless the
> imported names are referenced.
>

I think Christian Heimes started some PEP about this. If I recall
correctly, supporting lazy imports was part of PEP 369 (Post Import
Hooks), but later removed from the PEP due to time constraints (I
think).

--
Alexandre

From greg.ewing at canterbury.ac.nz  Fri Oct 31 06:41:56 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 31 Oct 2008 18:41:56 +1300
Subject: [Python-ideas] Recognise __init__.so
Message-ID: <490A9AA4.2060900@canterbury.ac.nz>

We've just been having a discussion on the Cython list about
implementing the __init__ file of a package as an extension module.

It turns out that just putting an __init__.so file in a directory
isn't enough to make Python recognise it as a package.

However, if there is both an __init__.py and an __init__.so, it's
recognised as a package, and the __init__.so gets loaded. So it's
possible, but only by exploiting some undocumented and probably
accidental behaviour.

So how about officially recognising an __init__.so file as a package
main file?
--
Greg

From guido at python.org  Fri Oct 31 16:06:21 2008
From: guido at python.org (Guido van Rossum)
Date: Fri, 31 Oct 2008 08:06:21 -0700
Subject: [Python-ideas] Recognise __init__.so
In-Reply-To: <490A9AA4.2060900@canterbury.ac.nz>
References: <490A9AA4.2060900@canterbury.ac.nz>
Message-ID:

This was a shortcut when we couldn't think of a reason why people would
ever have an __init__.so. Feel free to submit a patch, but keep in mind
that there are probably a bunch of different places in tools (not
necessarily under our control) that also make this check. Why not have
an __init__.py that imports _initialize.so or something?

On Thu, Oct 30, 2008 at 10:41 PM, Greg Ewing wrote:
> We've just been having a discussion on the Cython list
> about implementing the __init__ file of a package as
> an extension module.
>
> It turns out that just putting an __init__.so file
> in a directory isn't enough to make Python recognise
> it as a package.
>
> However, if there is both an __init__.py and an
> __init__.so, it's recognised as a package, and the
> __init__.so gets loaded. So it's possible, but only
> by exploiting some undocumented and probably accidental
> behaviour.
>
> So how about officially recognising an __init__.so
> file as a package main file?
>
> --
> Greg
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
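Guido's suggested workaround can be sketched end to end. This is a
hypothetical illustration, not the Cython approach under discussion:
the package name `mypkg`, the module name `_initialize`, and the
attribute `answer` are all invented, and a pure-Python `_initialize.py`
stands in for the compiled `_initialize.so`, since the import machinery
resolves the star-import the same way for both.

```python
import os
import sys
import tempfile

# Build the package layout on disk:
#   mypkg/__init__.py     -> re-exports everything from _initialize
#   mypkg/_initialize.py  -> stand-in for the extension module
tmp = tempfile.mkdtemp()
pkg = os.path.join(tmp, "mypkg")
os.mkdir(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from mypkg._initialize import *\n")
with open(os.path.join(pkg, "_initialize.py"), "w") as f:
    f.write("answer = 42\n")

# The __init__.py makes the directory a package in the documented way,
# while the real initialization logic lives in the (would-be compiled)
# submodule.
sys.path.insert(0, tmp)
import mypkg
print(mypkg.answer)  # 42
```

The cost relative to recognising __init__.so directly is one extra
stub file and one extra import, which is why the shortcut still feels
worth proposing.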