[Python-ideas] A comprehension scope issue in PEP 572

Tim Peters tim.peters at gmail.com
Wed May 9 12:18:07 EDT 2018


...

[Guido]
>> We should probably define what happens when you write [p := p for p in
>> range(10)]. I propose that this overwrites the loop control variable rather
>> than creating a second p in the containing scope -- either way it's probably
>> a typo anyway.

[Jacco van Dorp <j.van.dorp at deonet.nl>]
> My naive assumption would be both.

Since this is all about scope, while I'm not 100% sure of what Guido
meant, I assumed he was saying "p can only have one scope in the
synthetic function:  local or non-local, not both, and local is what I
propose".  For example, let's flesh out his example a bit more:

    p = 42
    [p := p for p in range(10) if p == 3]
    print(p) # 42?  3?  9?

If `p` is local to the listcomp, it must print 42.  If `p` is
not-local, it must print 9.  If it's some weird mixture of both, 3
makes most sense (the only time `p := p` is executed is when the `for`
target `p` is 3).
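To make the first two outcomes concrete, here is a sketch that spells out the synthetic function by hand. The wrapper names `local_reading` and `nonlocal_reading` are invented for illustration, and an enclosing function stands in for the containing scope so that `nonlocal` is legal. It is only a model of the two scoping choices, not a claim about how any implementation compiles the listcomp:

```python
def local_reading():
    """Model: every `p` inside the listcomp is local to the synthetic function."""
    p = 42

    def implicitfunc():
        templist = []
        for p in range(10):        # `p` is a local of implicitfunc
            if p == 3:
                p = p              # rebinds the same local; no outside effect
                templist.append(p)
        return templist

    implicitfunc()
    return p                       # the outer `p` was never touched


def nonlocal_reading():
    """Model: every `p` inside the listcomp refers to the containing scope's `p`."""
    p = 42

    def implicitfunc():
        nonlocal p
        templist = []
        for p in range(10):        # the loop itself rebinds the outer `p`
            if p == 3:
                p = p
                templist.append(p)
        return templist

    implicitfunc()
    return p                       # last value the loop assigned


print(local_reading())     # 42
print(nonlocal_reading())  # 9
```

Neither function can return 3: with a single `p` per scope, the loop keeps rebinding it after the one iteration where the `if` is true.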

> If it's just the insertion of a nonlocal statement like Tim suggested,

Then all occurrences of `p` in the listcomp are not-local, and the
example above prints 9.

> wouldn't the comprehension blow up to:
>
> def implicitfunc():
>   nonlocal p
>   templist = []
>   for p in range(10):
>     p = p
>     templist.append(p)
>   return templist
>
> ?

Yes.


> If it were [q := p for p in range(10)], it would be:
>
> def implicitfunc():
>   nonlocal q
>   templist = []
>   for p in range(10):
>     q = p
>     templist.append(q)
>   return templist

There's no question about that one, because `q` isn't _also_ used as a
`for` target.  There are two "rules" here:

1. A name appearing as a `for` target is local.  That's already the case.

2. All other names (including a name appearing as a binding-expression
    target) are not local.

Clearer?  If a name appears as both, which rule applies?  "Both" is
likely the worst possible answer, since it's incoherent ;-)  If a name
appears as both a `for` target and as a binding-expression target,
that particular way of phrasing "the rules" suggests #1 (it's local,
period) is the more natural choice.  And, whether Guido consciously
knows it or not, that's why he suggested it ;-)
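When the two rules don't collide, as in the `q` example, they can be checked by hand. The sketch below assumes an enclosing function (the name `containing_scope` is made up here) that plays the role of the scope containing the listcomp; `p` follows rule 1 and `q` follows rule 2:

```python
def containing_scope():
    p = 42      # an outer `p`, to show the `for` target does not leak
    q = None    # `nonlocal q` needs an existing binding in this hand expansion

    def implicitfunc():
        nonlocal q                 # rule 2: binding-expression target is not local
        templist = []
        for p in range(10):        # rule 1: `for` target is local
            q = p
            templist.append(q)
        return templist

    result = implicitfunc()
    return result, p, q


result, p_after, q_after = containing_scope()
print(result)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(p_after)   # 42
print(q_after)   # 9
```

Each name has exactly one scope inside `implicitfunc`, which is why this case raises no question.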


> Why would it need to be treated differently ?

Because it's incoherent.  It's impossible to make the example above
print 3 _merely_ by fiddling the scope of `p`.  Under the covers, two
distinct variables would need to be created, both of which are named
`p` as far as the user can see.  For my extension of Guido's example:

def implicitfunc():
    nonlocal p
    templist = []
    for hidden_loop_p in range(10):
        if hidden_loop_p == 3:
            p = hidden_loop_p
            templist.append(hidden_loop_p)
    return templist
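Wrapped in an enclosing function so that `nonlocal` has something to bind to (the wrapper name `mixed_reading` is invented for this sketch), that expansion does run, and it does leave 3 behind:

```python
def mixed_reading():
    """Model of the 'both' reading: two distinct variables that the user
    would see spelled as the single name `p`."""
    p = 42

    def implicitfunc():
        nonlocal p
        templist = []
        for hidden_loop_p in range(10):   # the local "p" of the `for` target
            if hidden_loop_p == 3:
                p = hidden_loop_p         # the non-local "p" of `p := p`
                templist.append(hidden_loop_p)
        return templist

    result = implicitfunc()
    return result, p


result, p_after = mixed_reading()
print(result)    # [3]
print(p_after)   # 3
```

The only way to get there is the renaming: `hidden_loop_p` has no spelling in the original listcomp.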


[Tim]
>> A compile-time error would be fine by me too.  Creating two meanings
>> for `p` is nuts - pick one in case of conflict.  I suggested before
>> that the first person with a real use case for this silliness should
>> get the meaning their use case needs, but nobody bit, so "it's local
>> then" is fine.

> x = x is legal. Why wouldn't p := p be ?

It's easy to make it "legal":  just say `p is local, period` or `p is
not local, period`.  The former will confuse people who think "but
names appearing as binding-expression targets are not local", and the
latter will confuse people who think "but names appearing as `for`
targets are local".

Why bother?  In the absence of an actual use case (still conspicuous
by absence), I'd be happiest refusing to compile such pathological
code.  Else `p is local, period` is the best pointless choice ;-)
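For anyone who wants to see which way a given interpreter went, a small probe along these lines would tell. It is only a sketch: the helper name `probe` is made up, the example runs at module level rather than inside a function, and an interpreter that cannot parse `:=` at all will also report "rejected", for the unrelated reason that it predates binding expressions:

```python
def probe(source):
    """Return 'rejected' if `source` fails to compile; otherwise run it
    and return the final value of the module-level name `p`."""
    try:
        code = compile(source, "<probe>", "exec")
    except SyntaxError:
        return "rejected"
    namespace = {}
    exec(code, namespace)
    return namespace["p"]


SOURCE = "p = 42\n[p := p for p in range(10) if p == 3]\n"

# 'rejected' means a compile-time error; 42, 9 or 3 would mean one of
# the three readings discussed above.
outcome = probe(SOURCE)
print(outcome)
```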

