[Python-ideas] Batching/grouping function for itertools

Mathias Panzenböck grosser.meister.morti at gmx.net
Sun Dec 8 18:49:19 CET 2013


On 12/08/2013 05:44 AM, Amber Yust wrote:
> After seeing yet another person asking how to do this on #python (and having needed to do it in the past myself), I'm
> wondering why itertools doesn't have a function to break an iterator up into N-sized chunks.
>
> Existing possible solutions include both the "clever" but somewhat unreadable...
>
>      batched_iter = zip(*[iter(input_iter)]*n)
>
> ...and the long-form...
>
>      def batch(input_iter, n):
>          input_iter = iter(input_iter)
>          while True:
>              yield [input_iter.next() for _ in range(n)]
>

This function silently drops the trailing items whenever the length of the input is not a multiple of n: the StopIteration raised inside the list comprehension escapes the generator in the middle of a partial batch, and the caller just sees the generator end. For example (Python 2, since the code uses .next()):
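
	>>> list(batch(range(7), 3))
	[[0, 1, 2], [3, 4, 5]]

The final element, 6, is consumed but never yielded. Fix: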

	def batch(it, n):
		it = iter(it)
		while True:
			chunk = []
			for _ in range(n):
				try:
					# next() instead of .next() so this also works on Python 3
					chunk.append(next(it))
				except StopIteration:
					# emit the final, possibly shorter, chunk before stopping
					if chunk:
						yield chunk
					return
			yield chunk
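
With this version the final, shorter chunk is kept instead of being thrown away:

	>>> list(batch(range(7), 3))
	[[0, 1, 2], [3, 4, 5], [6]]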

> There doesn't seem, however, to be one clear "right" way to do this. Every time I come up against this task, I go back
> to itertools expecting one of the grouping functions there to cover it, but they don't.
>
> It seems like it would be a natural fit for itertools, and it would simplify things like processing of file formats that
> use a consistent number of lines per entry, et cetera.
>
> ~Amber
>
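
For what it's worth, the same behaviour can be had more compactly with itertools.islice; something like this should do the job:

	import itertools

	def batch(it, n):
		it = iter(it)
		while True:
			# islice() stops early once the iterator runs dry,
			# so the last chunk may hold fewer than n items
			chunk = list(itertools.islice(it, n))
			if not chunk:
				# nothing left at all
				return
			yield chunk

list(islice(it, n)) comes back empty exactly when the iterator is exhausted, which is what the loop uses as its stopping condition.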


