Slicing iterables in sub-generators without losing elements

Thomas Bach thbach at students.uni-mainz.de
Sat Sep 29 12:14:38 EDT 2012


Hi,

say we have the following:

>>> data = [('foo', 1), ('foo', 2), ('bar', 3), ('bar', 2)]

is there a way to code a function iter_in_blocks such that

>>> result = [ list(block) for block in iter_in_blocks(data) ]

evaluates to

>>> result = [ [('foo', 1), ('foo', 2)], [('bar', 3), ('bar', 2)] ]

by _only_ _iterating_ over the list (caching all the elements sharing
the same first element doesn't count)?

I came up with the following

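import itertools

def iter_in_blocks(iterable):
    my_iter = iter(iterable)
    while True:
        try:
            first = next(my_iter)
        except StopIteration:
            return  # input exhausted, no more blocks
        pred = lambda entry: entry[0] == first[0]
        def block_iter():
            yield first
            # takewhile stops at the first entry failing pred, but consumes it
            for entry in itertools.takewhile(pred, my_iter):
                yield entry
        yield block_iter()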

which does not work, because itertools.takewhile consumes the first
entry that does not fulfil pred, so that entry is lost to the next block.
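
To make the loss visible, here is a minimal sketch that runs takewhile by
hand on the data from above. The names block and leftover are only used
for this illustration:

```python
import itertools

data = [('foo', 1), ('foo', 2), ('bar', 3), ('bar', 2)]
my_iter = iter(data)

first = next(my_iter)
# takewhile has to pull ('bar', 3) from my_iter to see that the key changed.
same_key = itertools.takewhile(lambda entry: entry[0] == first[0], my_iter)
block = [first] + list(same_key)

# ('bar', 3) was consumed by takewhile and is not handed back to my_iter.
leftover = list(my_iter)

print(block)     # [('foo', 1), ('foo', 2)]
print(leftover)  # [('bar', 2)]
```

So the second block would start at ('bar', 2) instead of ('bar', 3).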

My current intuition is that the problem is not solvable without
using, e.g., a global to pass something back from block_iter to
iter_in_blocks. Any other suggestions?

Regards,
	Thomas Bach.


