[Python-ideas] zip_strict() or similar in itertools ?

Peter Otten __peter__ at web.de
Thu Apr 4 14:24:54 CEST 2013


Wolfgang Maier wrote:

> Dear all,
> the itertools documentation has the grouper() recipe, which returns
> consecutive tuples of a specified length n from an iterable. To do this,
> it uses zip_longest(). While this is an elegant and fast solution, my
> problem is that I sometimes don't want my tuples to be filled with a
> fillvalue (which happens if len(iterable) % n != 0), but I would prefer an
> error instead. This is important, for example, when iterating over the
> contents of a file and you want to make sure that it's not truncated.
> I was wondering whether itertools, in addition to the built-in zip() and
> zip_longest(), shouldn't provide something like zip_strict(), which would
> raise an Error, if its arguments aren't of equal length.
> zip_strict() could then be used in an alternative grouper() recipe.
> 
> By the way, right now, I am using the following workaround for this
> problem:
> 
> def iblock(iterable, bsize, strict=False):
>     """Return consecutive lists of bsize items from an iterable.
> 
>     If strict is True, raises a ValueError if the size of the last block
>     in iterable is smaller than bsize. If strict is False, it returns the
>     truncated list instead."""
>     
>     it=iter(iterable)
>     i=[it]*(bsize-1)
>     while True:
>         try:
>             result=[next(it)]
>         except StopIteration:
>             # iterator exhausted, end the generator
>             break
>         for e in i:
>             try:
>                 result.append(next(e))
>             except StopIteration:
>                 # iterator exhausted after returning at least one item,
>                 # but before returning bsize items
>                 if strict:
>                     raise ValueError(
>                         "only %d value(s) left in iterator, expected %d"
>                         % (len(result), bsize))
>                 else:
>                     pass
>         yield result
> 
> , which works well, but is about 3-4 times slower than the grouper()
> recipe. If you have alternative, faster solutions that I wasn't thinking
> of, I'd be very interested to hear about them.
> 
> Best,
> Wolfgang

A simple approach is

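from itertools import zip_longest

def strict_grouper(items, size, strict):
    fillvalue = object()  # unique sentinel, cannot collide with real items
    args = [iter(items)]*size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    try:
        prev = next(chunks)
    except StopIteration:
        # empty input: nothing to yield
        return

    # stay one chunk behind so that only the last chunk needs checking
    for chunk in chunks:
        yield prev
        prev = chunk

    if prev[-1] is fillvalue:
        if strict:
            raise ValueError("last chunk is incomplete")
        else:
            # drop the padding from the short last chunk
            prev = prev[:prev.index(fillvalue)]

    yield prev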


If that's fast enough it might be a candidate for the recipes section.

A partial solution I wrote a while ago is

http://code.activestate.com/recipes/497006-zip_exc-a-lazy-zip-that-ensures-that-all-iterables/
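
For illustration, here is a minimal sketch of what a zip_strict() could look
like. It is not the code from the recipe linked above; it reuses the
zip_longest()-plus-sentinel idea from strict_grouper(). The name zip_strict
is taken from the original question, grouper_strict is a hypothetical name
chosen for this example, and the error message is an assumption.

```python
from itertools import zip_longest


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the iterables differ in length."""
    sentinel = object()  # unique marker that no input can contain
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # a sentinel in the tuple means at least one iterable ran out early
        if any(item is sentinel for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


def grouper_strict(iterable, n):
    """Yield n-tuples from iterable; raise ValueError on a short last group."""
    # n references to the same iterator, as in the itertools grouper() recipe
    args = [iter(iterable)] * n
    return zip_strict(*args)
```

Like strict_grouper(), this is lazy: all complete tuples are yielded first,
and the ValueError only appears when the short tuple is reached. The any()
check runs in Python for every tuple, so I would expect it to be slower than
the lag-by-one approach above, which only inspects the last chunk.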



