[Python-ideas] Proposal: itertools.batch

Blagoj Petrushev b.petrushev at gmail.com
Tue Apr 9 22:21:03 CEST 2013


Hello,

I have an idea for a new function in the itertools module. I've been
using this pattern quite a lot, so maybe someone else would think it
is useful as well.

The purpose is to split an iterable into batches of a fixed size (the
last one may be shorter), with each yielded batch being an iterator as
well.

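def batch(iterable, batch_size):
    iterable = iter(iterable)  # next() needs an iterator, not just an iterable
    exhausted = False
    batch_range = range(batch_size)
    while not exhausted:
        def current():
            nonlocal exhausted
            for _ in batch_range:
                try:
                    yield next(iterable)
                except StopIteration:
                    # Source is empty: flag it for the outer loop and stop.
                    exhausted = True
                    return
        yield current()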

There are problems with this implementation:
- the use of try/except is overkill (the exception is raised only
once, so maybe it's not that scary)
- it goes on forever if the batches are not actually consumed
- it yields an additional empty iterator if the original iterable's
length is an exact multiple of batch_size.
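
One possible way around the first and third problems, as a rough
sketch only: pull the first item of each batch up front and chain it
with an islice of the rest. The name batch_lazy is hypothetical, and
the sketch assumes each batch is consumed before the next one is
requested.

```python
import itertools

def batch_lazy(iterable, batch_size):
    # Hypothetical helper, not part of itertools.
    iterator = iter(iterable)
    for first in iterator:
        # Getting `first` proves the batch is non-empty, so no empty
        # trailing batch is produced and no try/except is needed.
        rest = itertools.islice(iterator, batch_size - 1)
        yield itertools.chain([first], rest)

# Each batch is consumed (via list) before the next one is requested.
result = [list(b) for b in batch_lazy(range(7), 3)]
print(result)  # [[0, 1, 2], [3, 4, 5], [6]]
```

If a batch is skipped without being consumed, the next one starts
wherever the source currently is, so the batches no longer line up.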

Here is a simplified version which yields batches as tuples (this is
the variation I use in practice):

import itertools

def batch_tuples(iterable, batch_size):
    iterator = iter(iterable)  # so islice resumes instead of restarting on a list
    while True:
        batch_ = tuple(itertools.islice(iterator, batch_size))
        if not batch_:
            break
        yield batch_
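
For illustration, a minimal usage sketch of the tuple variant. It
assumes the input is wrapped with iter() first, so that a plain list
can be passed directly; the sample data is made up.

```python
import itertools

def batch_tuples(iterable, batch_size):
    # Assumption: wrap in iter() so islice resumes where it left off.
    iterator = iter(iterable)
    while True:
        batch_ = tuple(itertools.islice(iterator, batch_size))
        if not batch_:
            break
        yield batch_

items = ["a", "b", "c", "d", "e", "f", "g"]  # made-up sample data
batches = list(batch_tuples(items, 3))
print(batches)  # [('a', 'b', 'c'), ('d', 'e', 'f'), ('g',)]
```

The last batch is simply shorter when the length is not a multiple of
batch_size, and no trailing empty tuple is produced.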

This is my first proposal, and first email on this list, so take it
easy on me :)

petrushev
---------
https://github.com/petrushev


