Big time WTF with generators - bug?

Peter Otten __peter__ at web.de
Wed Feb 13 04:31:39 EST 2008


James Stroud wrote:

groupby() is "all you can eat", but "no doggy bag".

> def serialize(table, keyer=_keyer,
>                       selector=_selector,
>                       keyfunc=_keyfunc,
>                       series_keyfunc=_series_keyfunc):
>    keyed = izip(imap(keyer, table), table)
>    filtered = ifilter(selector, keyed)
>    serialized = groupby(filtered, series_keyfunc)
>    serieses = []
>    for s_name, series in serialized:
>      grouped = groupby(series, keyfunc)
>      regrouped = ((k, (v[1] for v in g)) for (k,g) in grouped)
>      serieses.append((s_name, regrouped))

You are trying to store a group for later consumption here, but a group is
only usable until the outer groupby() moves on to the next one.

>    for s in serieses:
>      yield s

That doesn't work:

>>> groups = [g for k, g in groupby(range(10), lambda x: x//3)]
>>> for g in groups:
...     print list(g)
...
[]
[]
[]
[9]

You cannot work around that while keeping the groups lazy, because what
invalidates a group is the next call to groups.next():

>>> groups = groupby(range(10), lambda x: x//3)
>>> g = groups.next()[1]
>>> g.next()
0
>>> groups.next()
(1, <itertools._grouper object at 0x2b3bd1f300f0>)
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
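
Since advancing the outer iterator is what invalidates a group, the usual way
out is to copy each group into a list while groupby() is still positioned on
it. The following is only a minimal sketch, assuming the groups fit in memory.
grouped_lists(), serialize_eager() and the sample rows are hypothetical names
and data made up for illustration; serialize_eager() mirrors the two-level
loop of the quoted serialize() but leaves out its keyer/selector steps. It is
written to run unchanged on Python 2.6+ and Python 3.

```python
from itertools import groupby

def grouped_lists(iterable, keyfunc):
    """Return [(key, [items]), ...], copying each group before groupby() advances."""
    return [(key, list(group)) for key, group in groupby(iterable, keyfunc)]

def serialize_eager(rows, series_keyfunc, keyfunc):
    """Two-level grouping; every inner group is copied into a list
    while the outer groupby() still points at its series."""
    result = []
    for s_name, series in groupby(rows, series_keyfunc):
        # Consume the series completely before the outer loop advances.
        regrouped = [(k, list(g)) for k, g in groupby(series, keyfunc)]
        result.append((s_name, regrouped))
    return result

single = grouped_lists(range(10), lambda x: x // 3)
# single == [(0, [0, 1, 2]), (1, [3, 4, 5]), (2, [6, 7, 8]), (3, [9])]

# Hypothetical sample data: (series, key, value) tuples, already sorted.
rows = [("a", 1, "x"), ("a", 1, "y"), ("a", 2, "z"), ("b", 1, "w")]
nested = serialize_eager(rows, lambda r: r[0], lambda r: r[1])
```

The price is that nothing is lazy any more: every group is held in memory at
once, which is the "doggy bag" that groupby() itself does not provide.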

Perhaps Python should throw an out-of-band exception for an invalid group
instead of yielding bogus data.
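
As a rough sketch of what such an exception could look like, here is a
hypothetical wrapper around groupby(). Neither strict_groupby() nor
StaleGroupError exists in itertools; both are made-up names, and this is one
possible design under the assumption that a simple generation counter is
enough to detect stale groups. It runs on Python 2.6+ and Python 3.

```python
from itertools import groupby

class StaleGroupError(RuntimeError):
    """Hypothetical error: a group was read after groupby() had moved on."""

def strict_groupby(iterable, keyfunc=None):
    """Like groupby(), but a stale group raises instead of yielding bogus data."""
    current = 0  # generation number of the group that is currently valid

    def guarded(group, generation):
        while True:
            # Check before every read, so a stale group fails loudly.
            if generation != current:
                raise StaleGroupError("group used after groupby() advanced")
            try:
                item = next(group)
            except StopIteration:
                return
            yield item

    for key, group in groupby(iterable, keyfunc):
        current += 1
        yield key, guarded(group, current)
    current += 1  # once the outer iterator is exhausted, no group is valid

# Storing groups for later now fails visibly.
stored = [g for k, g in strict_groupby(range(10), lambda x: x // 3)]
try:
    list(stored[0])
    stale_detected = False
except StaleGroupError:
    stale_detected = True

# Consuming each group inside the loop still works as with plain groupby().
in_loop = [(k, list(g)) for k, g in strict_groupby(range(10), lambda x: x // 3)]
```

Groups consumed inside the loop behave as before; only a group read after the
outer iterator has advanced (or finished) triggers the error.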

Peter



