[Python-ideas] A comprehension scope issue in PEP 572

Tim Peters tim.peters at gmail.com
Mon May 14 03:18:17 EDT 2018


[Tim]
>> """
>> An assignment expression binds the target, except in a function F
>> synthesized to implement a list comprehension or generator expression
>> (see XXX).  In the latter case[1], the target is bound in the block
>> containing F, and errors may be detected:  If the target also appears
>> as an identifier target of a `for` loop header in F, a `SyntaxError`
>> exception is raised.  If the block containing F is a class block, a
>> `SyntaxError` exception is raised.
>>
>> Footnote:
>> [1] The intent is that runtime binding of the target occurs as if the
>> binding were performed in the block containing F.  Because that
>> necessarily makes the target not local in F, it's an error if the
>> target also appears in a `for` loop header, which is a local binding
>> for the same target.  If the containing block is a class block, F has
>> no access to that block's scope, so it doesn't make sense to consider
>> the containing block.  The target is bound in the containing block,
>> where it inherits that block's `global` or `nonlocal` declaration if
>> one exists, else establishes that the target is local to that block.
>> """
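
As a rough check of the two error cases and the binding rule in the text above, the sketch below probes them with `compile()`.  It assumes an interpreter that already implements assignment expressions with these rules (CPython 3.8 or later is the assumption here); the helper `is_syntax_error` and the sample sources are made up for illustration.

```python
def is_syntax_error(source):
    """Return True if compiling `source` raises SyntaxError."""
    try:
        compile(source, "<probe>", "exec")
    except SyntaxError:
        return True
    return False

# The target also appears as a `for` target in the comprehension.
rebinds_for_target = is_syntax_error("[(i := 0) for i in range(3)]")

# The block containing the comprehension is a class block.
in_class_block = is_syntax_error(
    "class C:\n    [(x := j) for j in range(3)]\n"
)

# Ordinary function block: accepted, and the target is bound in f's
# block, so it is still visible after the comprehension finishes.
def f():
    [(last := j) for j in range(3)]
    return last
```

Under that assumption both probes should report a `SyntaxError`, while `f()` can read `last` because the binding happened in the block containing the comprehension.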

[Nick]
> This is getting pretty close to being precise enough to be at least
> potentially implementable (thanks!), but there are still two cases that
> would need to be covered:
>
> - what happens inside a lambda expression?

Section 4 of the Reference Manual doesn't contain the word "lambda",
because there's no need to.  "lambda" is just another way to create a
function, and the behavior of functions is already specified.

If you disagree, you mean something by "lambda expression" other than
what I take it to mean, and the best way to illustrate what you do
mean would be via giving a concrete example.  As far as I'm concerned,

    ... (lambda...:  expression) ...

is exactly the same as

    def _hidden_unique_name(...):
        return expression
    ... (_hidden_unique_name) ....

Even at class scope ;-)

For example,

    def f():
        g = lambda n:  [(n := n+1) for i in range(1)]
        return g(10)

is the same as:

    def f():
        def g(n):
            return [(n := n+1) for i in range(1)]
        return g(10)

When the listcomp synthesizes a function, g's code block will
immediately contain it.  The text at the top says `n` is bound in the
containing block - which is g's.  `n` is already local to `g`, so that
part is just redundant in this case.  The synthetic function will take
10 (via its nonlocal cell), add 1, and again via the cell rebind g's
`n` to 11.  The synthetic function returns [11] and the rebound `n`
vanishes with its execution frame.
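
One way to check the claim is to run both spellings side by side.  This is a minimal sketch assuming an interpreter that implements assignment expressions as described in the text at the top; the names `f_lambda` and `f_def` are invented here.

```python
def f_lambda():
    # The listcomp's synthetic function is nested in the lambda's block,
    # so `n` is rebound in the lambda's own scope.
    g = lambda n: [(n := n + 1) for i in range(1)]
    return g(10)

def f_def():
    # Same thing spelled with a named nested function.
    def g(n):
        return [(n := n + 1) for i in range(1)]
    return g(10)

results = (f_lambda(), f_def())
```

Both calls should produce `[11]`, which is the point: nothing here depends on the function having been created by `lambda`.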

But that's all about how functions behave; "lambda" is just incidental.


> - what happens inside another comprehension or generator expression?

I don't know what you're asking about, so I'll just copy part of a
different reply:

"""
Where are the docs explaining how nested comprehensions work today?  I
haven't seen any.  If any such exist, I'd bet nothing about them needs
to be changed.  If none such exist, I don't see a need to write any
just for this PEP.  How do nested expressions of any kind work?  Same
thing.

The only thing the suggestion changes is the scope of assignment
expression targets in synthetic functions created to implement
comprehensions.  That has nothing at all to do with the possibility of
nesting, or with the structure of nesting.  Why do you think it does -
or might?
"""

Note that a day or two ago I posted a complete expansion of what

    list(i + sum((i := i+1) for j in range(i)) + i
          for i in range(5))

would do.  There the inner genexp rebinds the outer genexp's local
for-target.  Atrocious.  Here's how you can do the same:

Replace `(i := i+1)` with `(LATER)`.

Generate the complete expansion for how the assignment-expression-free
derived statement is implemented, presumably following the detailed
docs that don't appear to exist ;-) for how that's done.

In the synthetic function created for the inner genexp, add

    nonlocal i

at the top and replace

    yield (LATER)

with

    yield (i := i+1)

Now I didn't replace anything with "LATER", because I never thought
adding a new binary operator had anything to do with this process to
begin with ;-)  All that's needed is to add cruft _after_ it's done to
establish the intended scopes for assignment expression targets.

If you ask how the inner genexp "knows" that it needs to access the
outer genexp's `i`, it doesn't directly.  It's simply that the
synthetic inner genexp function is nested inside the synthetic outer
genexp function, and the text at top says that the inner genexp binds
`i` in its containing block - which is the block for the synthetic
outer genexp.

If the functions synthesized for nested comprehensions don't _already_
nest in this way, then they'd already screw up merely accessing outer
comprehension names from within inner comprehensions.
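
The recipe above can be sketched by hand with ordinary nested generator functions, so it doesn't depend on a compiler accepting the one-line form.  The names `outer_genexp` and `inner_genexp` are hypothetical stand-ins for the synthetic functions, and this is only one plausible shape for the expansion, not a statement of what any implementation generates.

```python
def outer_genexp(outer_iterable):
    for i in outer_iterable:
        def inner_genexp(inner_iterable):
            # The added cruft: make the target resolve to the containing
            # block, which is the outer synthetic function.
            nonlocal i
            for j in inner_iterable:
                yield (i := i + 1)
        # The inner genexp's outermost iterable, range(i), is evaluated
        # here in the outer function, before the inner genexp runs.
        yield i + sum(inner_genexp(range(i))) + i

result = list(outer_genexp(range(5)))
# result == [0, 5, 13, 24, 38]
```

The leftmost `i` in each sum is read before the inner generator runs and the rightmost one after, so they differ whenever the inner loop executes at least once.  Rebinding `i` doesn't disturb the outer loop, because the `for` statement rebinds `i` from its iterator on every pass.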


Is there a reason to suspect that there's anything inherently unique
to that specific example?  I did a bunch of these "by hand", _trying_
to create problems, but didn't manage to.  The only times I got even
slightly flustered were when brain fog temporarily blocked my ability
to see how to generate the nested messes, _entirely independent_ of
the fact that they happened to contain assignment expressions.

As I also posted about earlier, the real problems I've seen were in
corner cases _without_ the suggested change, stemming from the fact
that different pieces of a comprehension execute in different scopes.  That
can have ugly consequences when an assignment expression appears in
the outermost for's iterable, and its target is also in the body of
the comprehension.  That doesn't even require nesting to become
incoherent.

    [y for _ in range(y := 42)]
    [y for y in range(y := 42)]

With the suggestion, a binding expression target resolves to the same
scope regardless of where it appears in the genexp/listcomp, so that
class of head-scratcher vanishes ("same scope as in the containing
block" implies that all instances of the target resolve to the same
scope, which, happily enough, is resolved in the very block the
outermost iterable _is_ executed in; so in the first example above
both instances of `y` are resolved in the block containing the
listcomp, and the second example is a compile-time error for the
coherent reason given in the text at the top).
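
The scope split behind those corner cases (the outermost iterable is evaluated in the containing block, everything else in the synthetic function) can be observed without any assignment expression at all.  A minimal sketch, using a class block because that's where the split is easiest to see; the class names and `limit` are made up for the demonstration.

```python
class Demo:
    limit = 3
    # The outermost iterable is evaluated in the class block itself,
    # so `limit` is visible here.
    from_iterable = [0 for _ in range(limit)]

try:
    class Broken:
        limit = 3
        # The body runs in the comprehension's own scope, which can't
        # see names defined in the class block.
        from_body = [limit for _ in range(1)]
except NameError as exc:
    body_error = type(exc).__name__
else:
    body_error = None
```

The same name resolves in one position and fails in the other.  That is the kind of inconsistency the suggestion avoids for binding expression targets, by resolving every instance of the target in a single scope.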

