[Python-ideas] Explicit variable capture list

Steven D'Aprano steve at pearwood.info
Mon Jan 25 18:21:36 EST 2016


Excellent summary, thank you, but I want to take exception to something 
you wrote. I fear that you have inadvertently derailed the thread into a 
considerably narrower focus than it should have.

On Fri, Jan 22, 2016 at 08:50:52PM -0800, Andrew Barnert wrote:

> What the thread is ultimately looking for is a solution to the 
> "closures capturing loop variables" problem. This problem has been in 
> the official programming FAQ[1] for decades, as "Why do lambdas 
> defined in a loop with different values all return the same result"?

The issue is not loop variables, or rather, it's not *only* loop 
variables, and so any solution which focuses on fixing loop variables is 
only half a solution. If we look back at Haael's original post, his 
example captures *three* variables, not one, and there is no suggestion 
that they are necessarily loop variables.

It's nice that since we have lambda and list comps we can 
occasionally write closures in a one-liner loop like so:

>     powers = [lambda x: x**i for i in range(10)]
> 
> This gives you ten functions that all return x**9, which is probably 
> not what you wanted.

but in my opinion, that's really a toy example suitable only for 
demonstrating the nature of the issue and the difference between early 
and late binding. Outside of such toys, we often find ourselves closing 
over at least one variable which is derived from the loop variable, but 
not the loop variable itself:

# Still a toy, but perhaps a bit more of a realistic toy.
searchers = []
for provider in search_provider:
    key = API_KEYS[provider]
    url = SEARCH_URLS[provider]
    def lookup(*terms):
        terms = "/q=" + "+".join(escape(t) for t in terms)
        u = url + ("key=%s" % key) + terms
        return fetch(u) or []
    searchers.append(lookup)
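
To make the effect concrete, here is a minimal runnable sketch of the same loop. The provider names, keys, URLs and the fetch() stand-in are all made up for illustration, and escape() is left out; none of them come from a real API.

```python
# Hypothetical stand-ins for the names used in the example above.
API_KEYS = {"alpha": "key-a", "beta": "key-b"}
SEARCH_URLS = {"alpha": "http://alpha.example/?",
               "beta": "http://beta.example/?"}

def fetch(u):
    # Stand-in: return the URL instead of performing a request.
    return [u]

searchers = []
for provider in ["alpha", "beta"]:
    key = API_KEYS[provider]
    url = SEARCH_URLS[provider]
    def lookup(*terms):
        query = "/q=" + "+".join(terms)
        u = url + ("key=%s" % key) + query
        return fetch(u) or []
    searchers.append(lookup)

# key and url are looked up when lookup() is called, which is after the
# loop has finished, so every function sees the last iteration's values.
results = [search("spam", "eggs") for search in searchers]
```

Both entries in results should name the "beta" provider, even though provider itself is never referenced inside lookup().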



> The OP proposed that we should add some syntax, borrowed from C++, to 
> function definitions that specifies that some things get captured by 
> value.
[...]

Regardless of the syntax chosen, this has a few things to recommend it:

- It's completely explicit. If you want a value captured, you 
have to say so explicitly, otherwise you will get the normal variable 
lookup behaviour that Python uses now.

- It's general. We can capture locals, nonlocals, globals or builtins, 
not just loop variables.

- It allows us to avoid the "default argument" idiom, in cases where we 
really don't want the argument, we just want to capture the value. There 
are a lot of functions which have their parameter list polluted by 
extraneous arguments that should never be used by the caller simply 
because that's the only way to get early binding/value capturing.
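
For reference, a small sketch of the "default argument" idiom mentioned in the last point, and of how the extra parameter leaks into the function's signature. The names funcs, values and overridden are only illustrative.

```python
funcs = []
for i in range(3):
    # i=i evaluates i now and stores it as the default of a parameter
    # that callers are never meant to supply.
    funcs.append(lambda x, i=i: x**i)

# Early binding works: each function remembers its own i.
values = [f(2) for f in funcs]

# But the parameter is part of the public signature, so a caller can
# replace the "captured" value, by accident or otherwise.
overridden = funcs[0](2, 5)
```

Here values holds 2**0, 2**1 and 2**2, while the last call computes 2**5 because the second argument overrides the captured i.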



> Finally, Terry suggested a completely different solution to the 
> problem: don't change closures; change for loops. Make them create a 
> new variable each time through the loop, instead of reusing the same 
> variable. When the variable isn't captured, this would make no 
> difference, but when it is, closures from different iterations would 
> capture different variables (and therefore different cells).

It was actually Greg, not Terry.

I strongly dislike this suggestion (sorry Greg), and I am concerned that 
the thread seems to have been derailed into treating loop variables as 
special enough to break the rules. It does nothing to solve the general 
problem of capturing values. It doesn't work for my "searchers" example 
above, or even the toy example here:

funcs = []
for i in range(10):
    n = i**2
    funcs.append(lambda x: x + n)


This example can be easily re-written to close over the loop variable 
directly, but that's not the point. The point is that we frequently need to 
capture more than just the loop variable. Coming up with a solution that 
only solves the issue for loop variables isn't enough, and it is a 
mistake to think that this is about "closures capturing loop variables".
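
A runnable sketch of the toy example, followed by one way to get early binding today using a factory function. The helper make_adder is a hypothetical name introduced here; it is not part of any proposal in the thread.

```python
funcs = []
for i in range(10):
    n = i**2
    funcs.append(lambda x: x + n)

# n is looked up at call time, so every function sees its final value.
late = [f(0) for f in funcs]

def make_adder(n):
    # Each call creates a new scope with its own n, so the returned
    # function keeps the value that was passed in.
    return lambda x: x + n

funcs2 = []
for i in range(10):
    n = i**2
    funcs2.append(make_adder(n))

early = [f(0) for f in funcs2]
```

In the first loop the variable that matters is n, not i, so giving i a fresh binding per iteration would leave late unchanged.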

I won't speak for other languages, but in Python, where loops don't 
introduce a new scope, "closures capturing loop variables" shouldn't 
even be seen as a separate problem from the more general issue of 
capturing values early rather than late. It's just a common, easily 
stumbled across, manifestation of the same.
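
A short sketch of both halves of that claim: the loop variable lives in the enclosing scope, and late binding shows up with no loop in sight. The function names are illustrative only.

```python
# A for loop does not create a scope: i is still bound afterwards.
for i in range(3):
    pass
after_loop = i

def make():
    n = 1
    def show():
        return n
    # Rebinding n after show() is defined still affects what it returns,
    # because the closure refers to the variable, not to its value.
    n = 2
    return show

seen = make()()
```

after_loop is the last value the loop assigned, and seen is the value n had when show() was called rather than when it was defined.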


> For 
> backward-compatibility reasons, this might have to be optional, which 
> means new syntax; he proposed "for new i in range(10):".

I would not like to see "new" become a keyword. I have a lot of code 
using new (and old) as a variable.



-- 
Steve

