generator / iterator mystery

Peter Otten __peter__ at web.de
Sun Mar 13 18:31:35 EDT 2011


Dave Abrahams wrote:

>>>> list(chain(  *(((x,n) for n in range(3)) for x in 'abc')  ))
> [('c', 0), ('c', 1), ('c', 2), ('c', 0), ('c', 1), ('c', 2), ('c', 0),
> ('c', 1), ('c', 2)]
> 
> Huh?  Can anyone explain why the last result is different?
> (This is with Python 2.6)

The *-operator is not lazy, so the outer generator will be exhausted before 
anything is passed to the chain() function. You can see what will be passed 
with

>>> generators = [((x, n) for n in range(3)) for x in "abc"]

x is defined in the enclosing scope, and at this point its value is

>>> x 
'c'

i.e. what was assigned to it in the last iteration of the list 
comprehension. Because of Python's late binding, all generators in the list 
see this value:

>>> next(generators[0])
('c', 0)

>>> [next(g) for g in generators]
[('c', 1), ('c', 0), ('c', 0)]
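
Both effects, the eager * and the late binding, can be reproduced in a 
small script. This is only a sketch, assuming Python 3 (where the behaviour 
is the same); outer(), show() and log are names made up for the 
illustration and do not appear in the original code.

```python
from itertools import chain

log = []  # records the order in which things happen


def outer():
    # Hypothetical stand-in for the outer generator expression.
    for x in "abc":
        log.append("outer yields for x=%r" % x)
        # The inner generator looks x up only when it is iterated.
        yield ((x, n) for n in range(2))


def show(*iterables):
    # By the time this body runs, the * in the call below has already
    # consumed outer() completely, so x has reached its final value.
    log.append("call receives %d iterables" % len(iterables))
    return list(chain(*iterables))


result = show(*outer())

for line in log:
    print(line)
print(result)
```

The three "outer yields" entries are logged before the "call receives" 
entry, and every tuple in result carries 'c'.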

Note that, unlike list comps in 2.x, generator expressions do not expose 
their iterating vars and are therefore a bit harder to inspect:

>>> del x
>>> generators = list((((x, n) for n in range(3)) for x in "abc"))
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined

...but the principle is the same:

>>> g = generators[0]
[snip a few dir(...) calls]
>>> g.gi_frame.f_locals
{'.0': <listiterator object at 0x7f936516dc50>, 'x': 'c'}
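
For readers on Python 3, a comparable inspection can be sketched with 
inspect.getgeneratorlocals(), which I assume is available (it was added in 
3.3). The variable names below are illustrative only, and the exact 
contents of the mapping may differ between versions.

```python
import inspect

# Build the inner generators eagerly, as list(...) did above.
generators = list(((x, n) for n in range(3)) for x in "abc")

g = generators[0]
first = next(g)  # start the generator so that its frame is live

# Mapping of the generator's live local names to their current values.
g_locals = inspect.getgeneratorlocals(g)
print(first)
print(g_locals["x"])
```

Even the first generator in the list reports the final value of x.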

One way to address the problem is to make a separate closure for each of the 
inner generators, which is what you achieved with your enum3() function; 
there the inner generator sees the local x of the current enum3() call. 
Another fix is to use chain.from_iterable(...) instead of chain(*...):

>>> list(chain.from_iterable(((x, n) for n in range(3)) for x in "abc"))
[('a', 0), ('a', 1), ('a', 2), ('b', 0), ('b', 1), ('b', 2), ('c', 0),
('c', 1), ('c', 2)]

Here the outer generator proceeds to the next x only when the inner 
generator is exhausted.
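
The two fixes can be compared side by side. The sketch below assumes 
Python 3; pairs_for() is a hypothetical helper standing in for the 
separate-closure approach and is not the enum3() function from the 
original post.

```python
from itertools import chain


def pairs_for(x):
    # Hypothetical helper: each call has its own local x, so the
    # generator it returns is bound to that call's value.
    return ((x, n) for n in range(3))


# Fix 1: one closure per inner generator; the eager * is now harmless.
via_closure = list(chain(*(pairs_for(x) for x in "abc")))

# Fix 2: lazy chaining; each inner generator is drained before the
# outer generator moves on to the next x.
via_lazy = list(
    chain.from_iterable(((x, n) for n in range(3)) for x in "abc")
)

expected = [(x, n) for x in "abc" for n in range(3)]
print(via_closure == expected, via_lazy == expected)
```

The first variant stays correct even if the inner generators are collected 
up front; the second relies on them being consumed in order.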

Peter



