Why chunks is not part of the python standard lib?

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Thu May 2 10:02:06 EDT 2013


Oscar Benjamin <oscar.j.benjamin <at> gmail.com> writes:

> 
> On 2 May 2013 13:55, Chris Angelico <rosuav <at> gmail.com> wrote:
> > On Thu, May 2, 2013 at 10:52 PM, Oscar Benjamin
> > <oscar.j.benjamin <at> gmail.com> wrote:
> >> They are all easy to write as generator functions but to me the point
> >> of itertools is that you can do things more efficiently than a
> >> generator function. Otherwise code that uses a combination of
> >> itertools primitives is usually harder to understand than an
> >> equivalent generator function so I'd probably avoid using itertools.
> >
> > Aren't most of the itertools primitives written in Python anyway? If
> > your code is harder to understand, just write the generator function!
> 
> The documentation describes them by showing equivalent generator
> functions and there may be a pure Python version of the module but if
> you look here then you can see which are builtin for CPython:
> 
> http://hg.python.org/cpython/file/c3656dca65e7/Modules/itertoolsmodule.c#l4070
> 
> The list covers all of the documented itertools functions (actually I
> now realise that they're mostly types not functions).
> 
> Oscar
> 

Maybe it helps to agree on terminology here.
It looks like Oscar is using *itertools primitives* for things like
zip_longest() (I'm referring to Python 3 here, where izip_longest() has lost
its leading i and the former izip() has been replaced by the built-in zip()).
The docs call these "a core set of fast, memory efficient tools that are
useful by themselves or in combination" and list them in the docs'
subsection .1 ("Itertools functions"). I think these are written in C (and
explained by giving equivalent Python code).
Then there are the *recipes* (subsection .2), which consist of Python code
that combines and uses the *primitives*. Among those is the *grouper recipe*,
which is a chunks() that pads the last chunk with a fillvalue. Obviously,
you could write an analogous *grouper* that drops a truncated last
chunk by using zip instead of zip_longest.
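
To make that concrete, here is a rough sketch of the two variants. grouper()
follows the recipe from the docs as far as I remember it; the name
grouper_truncating() is my own invention for illustration and is not in the
docs.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into chunks of length n, padding the last chunk."""
    # n references to the *same* iterator, so every output tuple
    # consumes n consecutive items from the input.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def grouper_truncating(iterable, n):
    """Hypothetical variant: drop a truncated last chunk."""
    args = [iter(iterable)] * n
    # zip() stops at the first exhausted argument, so an incomplete
    # final chunk is silently discarded.
    return zip(*args)


print(list(grouper('ABCDEFG', 3, 'x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
print(list(grouper_truncating('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F')]
```

Both functions only set up the arguments in Python; the iterator they return
is the C-implemented zip_longest or zip object.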
Now, Oscar's and my point is that by adding two more *primitives*, call them
zip_strict and zip_relaxed or whatever you like, written in C and behaving
analogously to zip_longest, you could write analogous *grouper recipes* that
raise an error or simply yield the shorter chunk unpadded when the last chunk
is truncated. Those *recipes* would still be Python code of course, but they
would return an iterator implemented in C.
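
Just to illustrate the intended behaviour, here is a pure-Python sketch of
what such primitives might do. zip_strict and zip_relaxed do not exist in
itertools; the names, the choice of ValueError, and the exact semantics are
only my assumptions, and the real point would be to have them in C.

```python
def zip_strict(*iterables):
    """Hypothetical: like zip(), but raise if the inputs run out unevenly."""
    iterators = [iter(it) for it in iterables]
    sentinel = object()
    while True:
        items = [next(it, sentinel) for it in iterators]
        exhausted = [item is sentinel for item in items]
        if all(exhausted):
            return  # every input ended in the same round
        if any(exhausted):
            raise ValueError("iterables have different lengths")
        yield tuple(items)


def zip_relaxed(*iterables):
    """Hypothetical: like zip_longest(), but without padding."""
    iterators = [iter(it) for it in iterables]
    sentinel = object()
    while True:
        items = [next(it, sentinel) for it in iterators]
        present = tuple(item for item in items if item is not sentinel)
        if not present:
            return  # nothing left in any input
        yield present  # may be shorter than len(iterables)


def grouper_strict(iterable, n):
    """Chunks of length n; error if the last chunk is truncated."""
    return zip_strict(*[iter(iterable)] * n)


def grouper_relaxed(iterable, n):
    """Chunks of length n; the last chunk may be shorter."""
    return zip_relaxed(*[iter(iterable)] * n)


print(list(grouper_relaxed('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

With these, list(grouper_strict('ABCDEFG', 3)) raises ValueError once it
reaches the incomplete last chunk, while 'ABCDEF' goes through cleanly.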





