[Python-ideas] PEP 572: Statement-Local Name Bindings, take three!

Steven D'Aprano steve at pearwood.info
Sat Mar 24 10:44:34 EDT 2018


On Sat, Mar 24, 2018 at 07:12:49PM +1000, Nick Coghlan wrote:

> > I think that needs justification by more than just "it makes the
> > implementation easier".
>
> Introducing the new scoping behaviour doesn't make the implementation
> easier, it makes it harder.
[...]

Perhaps I had misunderstood something Chris had said.


> At a user experience level, the aim of the scoping limitation is
> essentially to help improve "code snippet portability".
> 
> Consider the following piece of code:
> 
>     squares = [x**2 for x in iterable]
> 
> In Python 2.x, you not only have to check whether or not you're already
> using "squares" for something, you also need to check whether or not you're
> using "x", since the iteration variable leaks.

I hear you, and I understand that some people had problems with leakage, 
but in my own experience, this was not a problem I ever had. On the 
contrary, it was occasionally useful (what was the last value x took 
before the comprehension finished?).

The change to Python 3 non-leaking behaviour has solved no problem for 
me but taken away something which was nearly always harmless and very 
occasionally useful. So I don't find this to be an especially compelling 
argument.
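
To make the behaviour under discussion concrete, here is a minimal
sketch, assuming Python 3 semantics. The function names and the
variable "elem" are made up for illustration and appear nowhere in
the PEP. The first function checks whether the comprehension variable
is visible afterwards; the second shows the explicit loop needed to
recover the "last value" that the Python 2 leak used to give for free.

```python
def comprehension_variable_leaks():
    """Report whether a comprehension's loop variable is visible afterwards."""
    squares = [elem ** 2 for elem in range(4)]
    try:
        elem  # only bound inside the comprehension's own scope in Python 3
    except NameError:
        return False
    return True


def squares_and_last(iterable):
    """Build the squares with an explicit loop so the last item survives."""
    last = None  # stays None if the iterable is empty
    squares = []
    for last in iterable:
        squares.append(last ** 2)
    return squares, last
```

On Python 3 the first function returns False, and
squares_and_last(range(4)) returns the list of squares together with
3, the final value the loop variable took.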

But at least comprehensions are intended to be almost entirely 
self-contained, so it's not actively harmful. But I can't say the same 
for additional sub-function scopes.


> For PEP 572, the most directly comparable example is code like this:
> 
>     # Any previous binding of "m" is lost completely on the next line
>     m = re.match(...)
>     if m:
>         print(m.groups(0))
> 
> In order to re-use that snippet, you need to double-check the surrounding
> code and make sure that you're not overwriting an "m" variable already used
> somewhere else in the current scope.

Yes. So what? I'm going to be doing that regardless of whether the 
interpreter places this use of m in its own scope or not. The scope as 
seen by the interpreter is not important. If all we cared about was 
avoiding name collisions, we could solve that by using 128-bit secret 
keys as variables:

    var_81c199e61e9f90fd023508aee3265ad9

We don't need multiple scopes to avoid name collisions, we just need to 
make sure they're all unique :-)

But of course readability counts, and we write code to be read by 
people, not for the convenience of the interpreter.

For that reason, whenever I paste a code snippet, I'm going to check the 
name and make a conscious decision whether to keep it or change it. 
Doing that means I have to check whether "m" is already in use, 
regardless of whether or not the interpreter will keep the two (or 
more!) "m" variables separate. So this supposed benefit is really no 
benefit at all: I am still going to check "m" to see if it clashes.

To the extent that this proposal to add sub-function scoping encourages 
people to do copy-paste coding without even renaming variables to 
something appropriate for the function they're pasted into, I think this 
will strongly hurt readability in the long run.


> With PEP 572, you don't even need to look, since visibility of the "m" in
> the following snippet is automatically limited to the statement itself:
> 
>     if (re.match(...) as m):
>         print(m.groups(0))
>     # Any previous binding of "m" is visible again here, and hence a common
>     # source of bugs is avoided :)

Is this really a "common source of bugs"?

Do you really mean to suggest that we should be able to copy and paste a 
code snippet into the middle of a function without checking how it 
integrates with the surrounding code? Because that's what it seems that 
you are saying. And not only that we should be able to do so, but that 
it is important enough that we should add a feature to encourage it?

If people make a habit of pasting snippets of code into their functions 
without giving any thought to how it fits in with the rest of the 
function, then any resulting bugs are caused by carelessness and 
slap-dash technique, not the scoping rules of the language.

The last thing I want to read is a function where the same name is used 
for two or three or a dozen different things, because the author 
happened to copy code snippets from elsewhere and didn't bother renaming 
things to be more appropriate. Never mind whether the interpreter can 
keep track of which is which, I'm worried about *my* ability to keep 
track of which is which.

I might be cynical about the professionalism and skills of the average 
programmer, but even I find it hard to believe that most people would 
actually do that. But since we're (surely?) going to be taking time to 
integrate the snippet with the rest of the function, the benefit of not 
having to check for duplicate variable names evaporates.

We (hopefully!) will be checking for duplicates regardless of whether 
they are scoped to a single statement or not, because we don't want to 
read and maintain a function with the same name "x" representing a dozen 
different things at different times.

I'm not opposed to re-using variable names for two different purposes 
within a single function. But when I do it, I do it because I made a 
conscious decision that:

(1) the name is appropriate for both purposes; and

(2) re-using the name does not lead to confusion or make the function 
hard to read.

I don't re-use names because I've copied some snippet and can't be 
bothered to change the names. And I don't think we should be adding a 
feature to enable and justify that sort of poor practice.

Comprehensions have their own scope, and that's at least harmless, if 
not beneficial, because they are self-contained single expressions. But 
this would add separate scopes to blocks:

def function():
    x = 1
    if (spam as x):
        ...
        while (ham as x):
            ...
    # much later, deep in the function
    # possibly after some or all of those blocks have ended
    ...
                process(x)  # which x is this?

This would be three different variables all with the same name "x". To 
track the current value of x I have to track each of the x variables 
and which is currently in scope.
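
The "(spam as x)" syntax above is only proposed, so it cannot be run.
As a rough sketch of the contrast using Python 3 as it exists today
(the function names are hypothetical, and a comprehension stands in
for a statement-local scope as the closest existing analogue):

```python
def single_scope():
    """Today: every assignment rebinds the one function-level x."""
    x = 1
    if x:
        x = "spam"          # same x as above
        while x != "ham":
            x = "ham"       # still the same x
    return x                # only one variable to keep track of


def shadowed_scope():
    """A comprehension's x is a separate variable that shadows the outer x."""
    x = 1
    inner = [x for x in ("spam", "ham")]  # this x lives in its own scope
    return x, inner         # the outer x is untouched
```

In single_scope() the reader follows one x through a sequence of
values. In shadowed_scope() there are two variables called x, and the
reader has to know which one is in scope at each point; sub-function
scoping would multiply that for every block that rebinds the name.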

I don't think we need sub-function scoping. I think the complexity it 
adds outweighs whatever benefit it gives.



-- 
Steve

