Split iterator into multiple streams

Ian ian.g.kelly at gmail.com
Sat Nov 6 05:24:13 EDT 2010


On Nov 6, 2:52 am, Steven D'Aprano
<st... at REMOVE-THIS-cybersource.com.au> wrote:
> My first attempt was this:
>
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         iterators.append((t[i] for t in iterator))
>     return tuple(iterators)
>
> But it doesn't work, as all the iterators see the same values:

That happens because i is not looked up until the generator actually
runs. By then the loop has finished, so every generator sees only the
final value of i rather than the value it had when that generator was
created.  This is a common problem with generator expressions that are
not consumed immediately.
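A minimal sketch of the effect, separate from the split() problem (the
names late_bound, early_bound and column are made up for illustration):

```python
def late_bound(rows, n):
    # Each generator expression looks up i only when it is iterated,
    # which happens after the loop has already finished.
    gens = []
    for i in range(n):
        gens.append((t[i] for t in rows))
    return [list(g) for g in gens]


def column(rows, i):
    # i is now a parameter, so each call gets its own binding.
    return (t[i] for t in rows)


def early_bound(rows, n):
    gens = []
    for i in range(n):
        gens.append(column(rows, i))
    return [list(g) for g in gens]


rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
print(late_bound(rows, 3))   # every list is the last column
print(early_bound(rows, 3))  # one list per column
```

Passing i through a function call (or a default argument) fixes its
value at the point the generator is created.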

> I tried changing the t[i] to use operator.itemgetter instead, but no
> luck. Finally I got this:
>
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         f = lambda it, i=i: (t[i] for t in it)
>         iterators.append(f(iterator))
>     return tuple(iterators)
>
> which seems to work:
>
> >>> data = [(1,2,3), (4,5,6), (7,8,9)]
> >>> a, b, c = split(data, 3)
> >>> list(a), list(b), list(c)
>
> ([1, 4, 7], [2, 5, 8], [3, 6, 9])
>
> Is this the right approach, or have I missed something obvious?

That avoids the generator problem, but in this case you could get the
same result a bit more straightforwardly by using itertools.imap
instead:

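import itertools
import operator

def split(iterable, n):
    iterators = []
    for i, iterator in enumerate(itertools.tee(iterable, n)):
        # itemgetter(i) is built right away, so each stream keeps its own i.
        getter = operator.itemgetter(i)
        iterators.append(itertools.imap(getter, iterator))
    return tuple(iterators)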

>>> map(list, split(data, 3))
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
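
The imap version is Python 2 code. As a rough sketch for Python 3,
where itertools.imap is gone and the built-in map is already lazy, the
same idea might look like this (the name copies is my own):

```python
import itertools
import operator


def split(iterable, n):
    # One independent copy of the input per output stream.
    copies = itertools.tee(iterable, n)
    # tuple() consumes the generator expression immediately, so each
    # itemgetter(i) is created with the i for its own stream.
    return tuple(map(operator.itemgetter(i), it)
                 for i, it in enumerate(copies))


data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
a, b, c = split(data, 3)
print(list(a), list(b), list(c))
```

Keep in mind that tee buffers items until every copy has consumed
them, so reading one stream to the end before touching the others
holds the whole input in memory.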

Cheers,
Ian


