List comprehension/genexp inconsistency.

Ian Kelly ian.g.kelly at gmail.com
Tue Mar 20 18:50:03 EDT 2012


On Tue, Mar 20, 2012 at 3:16 PM, Dennis Lee Bieber
<wlfraed at ix.netcom.com> wrote:
> On Tue, 20 Mar 2012 16:23:22 -0400, "J. Cliff Dyer"
> <jcd at sdf.lonestar.org> declaimed the following in
> gmane.comp.python.general:
>
>>
>> When trying to create a class with a dual-loop generator expression in a
>> class definition, there is a strange scoping issue where the inner
>> variable is not found, (but the outer loop variable is found), while a
>> list comprehension has no problem finding both variables.
>>
>        Read http://www.python.org/dev/peps/pep-0289/ -- in particular, look
> for the word "leak"

No, this has nothing to do with the loop variable leaking.  It appears
to be caused by the variables and the generator expression being
inside a class block.  I think that it's related to the reason that
this doesn't work:

class Foo(object):
    x = 42
    def foo():
        print(x)
    foo()

In this case, x is not a local variable of foo, nor is it a global.
In order for foo to access x, it would have to be a closure -- but
Python can't make it a closure in this case, because the variable it
accesses is (or rather, will become) a class attribute, not a local
variable of a function that can be stored in a cell.  Instead, the
compiler just makes it a global reference in the hope that such a
global will actually be defined when the code is run.
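
Here is a minimal sketch of that behavior, assuming Python 3 and that
no global named x exists.  The helper class_body_lookup_error and the
class Bar are names I made up for illustration; the default-argument
trick in Bar is one common workaround, not the only one.

```python
def class_body_lookup_error():
    """Return the NameError message from the failing pattern, or None."""
    try:
        class Foo(object):
            x = 42
            def foo():
                return x   # compiled as a global lookup, not a closure
            foo()          # called while the class body is still running
    except NameError as exc:
        return str(exc)
    return None


class Bar(object):
    x = 42
    # The default value is evaluated in the class body, where x is visible,
    # so foo gets its own local x instead of looking for a global.
    def foo(x=x):
        return x
    result = foo()


print(class_body_lookup_error())
print(Bar.result)
```

The first print shows the NameError message for x; the second shows
that the default argument captured the class-level value.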

For that reason, what surprises me about Cliff's example is that a
generator expression works at all in that context.  It seems to work
as long as it contains only one loop, but not if it contains two.  To
find out why, I tried disassembling one:

>>> import dis
>>> class Foo(object):
...     x = 42
...     y = 12
...     g = (a+b for a in range(x) for b in range(y))
...
>>> dis.dis(Foo.g.gi_code)
  4           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                34 (to 40)
              6 STORE_FAST               1 (a)
              9 LOAD_GLOBAL              0 (range)
             12 LOAD_GLOBAL              1 (y)
             15 CALL_FUNCTION            1
             18 GET_ITER
        >>   19 FOR_ITER                15 (to 37)
             22 STORE_FAST               2 (b)
             25 LOAD_FAST                1 (a)
             28 LOAD_FAST                2 (b)
             31 BINARY_ADD
             32 YIELD_VALUE
             33 POP_TOP
             34 JUMP_ABSOLUTE           19
        >>   37 JUMP_ABSOLUTE            3
        >>   40 LOAD_CONST               0 (None)
             43 RETURN_VALUE

So that explains it.  Notice that "x" is never actually accessed in
that disassembly; only "y" is.  It turns out that the first iterator
[range(x)] is actually created before the generator ever starts
executing, and is passed in as an anonymous local variable (the ".0"
loaded at offset 0) -- so it's created in the class scope, not in
the generator scope.  The second iterator, however, is recreated on
every iteration of the first iterator, so it can't be pre-built in
that manner.  It does get created in the generator scope, and when
that happens the LOAD_GLOBAL of "y" fails with a NameError, just like
the function example above.
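
To make the difference concrete, here is a rough sketch, assuming
Python 3 on CPython and no global named y.  The class names and the
helper two_loop_error are hypothetical, and the lambda wrapper in
Workaround is just one way to hand both values to the generator.
Peeking at gi_code.co_names relies on a CPython implementation detail.

```python
class OneLoop(object):
    x = 3
    # range(x) is the outermost iterable, so it is evaluated right here
    # in the class body and handed to the generator.
    g = (a for a in range(x))


class Workaround(object):
    x = 2
    y = 3
    # The lambda's arguments are evaluated in the class body; inside the
    # lambda, x and y are ordinary function locals that the generator
    # can close over.
    g = (lambda x, y: (a + b for a in range(x) for b in range(y)))(x, y)


def two_loop_error():
    """Return (names referenced by the generator code, error message)."""
    class TwoLoops(object):
        x = 2
        y = 3
        # Creating the generator succeeds; range(y) is only evaluated
        # once iteration starts, inside the generator's own scope.
        g = (a + b for a in range(x) for b in range(y))

    names = TwoLoops.g.gi_code.co_names
    try:
        list(TwoLoops.g)
    except NameError as exc:
        return names, str(exc)
    return names, None


print(two_loop_error())
```

Iterating OneLoop.g and Workaround.g works normally, while the
two-loop version only fails once you start consuming it.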

Cheers,
Ian


