Why chunks is not part of the python standard lib?

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed May 1 05:00:04 EDT 2013


On 1 May 2013 08:10, Mark Lawrence <breamoreboy at yahoo.co.uk> wrote:
> On 01/05/2013 07:26, Ricardo Azpeitia Pimentel wrote:
>>
>> After reading How do you split a list into evenly sized chunks in
>> Python?
>>
>> <http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python>
>>
>> and seeing this kind of mistake happening
>> https://code.djangoproject.com/ticket/18972 all the time.
>>
>> Why is there no |chunks| function in itertools?
>>
>> |grouper| from
>> http://docs.python.org/2/library/itertools.html#recipes doesn't have the
>> same behavior as |chunks|.
>>
>> Example:
>>
>> chunks([1, 2, 3, 4, 5], 3)
>> # Should return [[1, 2, 3], [4, 5]] or the iterator equivalent.
>>
>> Original Post on StackOverflow:
>>
>> http://stackoverflow.com/questions/16313008/why-chunks-is-not-part-of-the-python-standard-lib
>>
>
> Asked and answered a trillion times.  There's no consensus on how chunks
> should behave.

I'm not sure that's a valid argument against it since a chunks
function could just do a different thing depending on the arguments
given.

The issue is around how to deal with the last chunk if it isn't the
same length as the others and I can only think of 4 reasonable
responses:

1) Yield a shorter chunk
2) Extend the chunk with fill values
3) Raise an error
4) Ignore the last chunk

Cases 2 and 4 can already be achieved with existing builtins and
itertools primitives, e.g.:
2) izip_longest(fillvalue=fillvalue, *[iter(iterable)] * n)
4) zip(*[iter(iterable)] * n)
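
To make the difference between those two idioms concrete, here is a
small sketch applying both to the five-item example from the original
question. It assumes Python 3, where izip_longest is spelled
zip_longest; the helper names fill_chunks and drop_chunks are only
illustrative and not part of itertools.

```python
from itertools import zip_longest  # named izip_longest in Python 2


def fill_chunks(iterable, n, fillvalue=None):
    # Case 2: the list holds n references to ONE shared iterator, so each
    # output tuple consumes n consecutive items; the final tuple is padded
    # with fillvalue when the input runs out.
    return zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)


def drop_chunks(iterable, n):
    # Case 4: zip stops as soon as the shared iterator is exhausted, so a
    # trailing partial chunk is silently discarded.
    return zip(*[iter(iterable)] * n)


print(list(fill_chunks([1, 2, 3, 4, 5], 3)))  # [(1, 2, 3), (4, 5, None)]
print(list(drop_chunks([1, 2, 3, 4, 5], 3)))  # [(1, 2, 3)]
```

Note that both idioms produce tuples rather than lists, and neither one
gives the [[1, 2, 3], [4, 5]] result asked for in the original question.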

However I have only ever had use cases for 1 and 3 and these are not
currently possible without something additional (e.g. a generator
function).

In any case a chunks function can simply take arguments to give all 4
behaviours:

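from itertools import islice

def chunks(iterable, chunksize, uneven='return_short', fillvalue=None):
   # loop through yielding all even chunks (assumes chunksize >= 1)
   iterator = iter(iterable)
   while True:
      chunk = list(islice(iterator, chunksize))
      if len(chunk) < chunksize:
         break
      yield chunk
   # and then deal with any leftover items in a short final chunk
   if not chunk:
      return
   if uneven == 'return_short':
      yield chunk
   elif uneven == 'raise':
      raise ValueError('No items left')
   elif uneven == 'fill':
      yield chunk + [fillvalue] * (chunksize - len(chunk))
   elif uneven == 'ignore':
      pass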


Oscar


