getting n items at a time from a generator

NickC ncoghlan at gmail.com
Tue Jan 1 00:06:53 EST 2008


On Dec 27 2007, 11:31 pm, Kugutsumen <kugutsu... at gmail.com> wrote:
> On Dec 27, 7:24 pm, Terry Jones <te... at jon.es> wrote:
>
> > >>>>> "Kugutsumen" == Kugutsumen  <kugutsu... at gmail.com> writes:
>
> > Kugutsumen> On Dec 27, 7:07 pm, Paul Hankin <paul.han... at gmail.com> wrote:
>
> > >> On Dec 27, 11:34 am, Kugutsumen <kugutsu... at gmail.com> wrote:
>
> > >> > I am relatively new to the Python language and I am afraid to be missing
> > >> > some clever construct or built-in way equivalent to my 'chunk'
> > >> > generator below.
>
> > Kugutsumen> Thanks, I am going to take a look at itertools.  I prefer the
> > Kugutsumen> list version since I need to buffer that chunk in memory at
> > Kugutsumen> this point.
>
> > Also consider this solution from O'Reilly's Python Cookbook (2nd Ed.) p705
>
> >     def chop(iterable, length=2):
> >         return izip(*(iter(iterable),) * length)
>
> > Terry
> > [snip code]
>
> > Try this instead:
>
> > import itertools
>
> > def chunk(iterator, size):
> >     # I prefer the argument order to be the reverse of yours.
> >     while True:
> >         chunk = list(itertools.islice(iterator, size))
> >         if chunk: yield chunk
> >         else: break
>
> Steven, I really like your version since I've managed to understand it
> in one pass.
> Paul's version works but is too obscure to read for me :)
>
> Thanks a lot again.

To work with an arbitrary iterable rather than only an iterator, it
needs an extra line at the start that converts the argument with
iter(). Otherwise, passing a list makes islice start over from the
first item on every pass through the loop, so the generator never
finishes. It may also be better to pad the final chunk so that it is
the same length as the others. If you don't want that (because you
need to know where the data really ends), leave out the parts relating
to the padding object.

import itertools

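def chunk(iterable, size, pad=None):
    # Get a single iterator so each islice call resumes where the last ended.
    iterator = iter(iterable)
    padding = [pad]
    while True:
        chunk = list(itertools.islice(iterator, size))
        if chunk:
            # Pad a short final chunk out to the full size.
            yield chunk + (padding*(size-len(chunk)))
        else:
            # An empty chunk means the iterator is exhausted.
            break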
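
As a rough illustration of the two points above, here is a minimal
sketch that runs on its own. The chunk function is restated in
equivalent form so the example is self-contained; chunk_unpadded and
the sample data are hypothetical names chosen for the example, not
part of the earlier posts.

```python
import itertools

def chunk(iterable, size, pad=None):
    # Restated in equivalent form so this example is self-contained.
    iterator = iter(iterable)
    while True:
        block = list(itertools.islice(iterator, size))
        if not block:
            break
        yield block + [pad] * (size - len(block))

def chunk_unpadded(iterable, size):
    # Hypothetical name: the same loop with the padding parts left out.
    iterator = iter(iterable)
    while True:
        block = list(itertools.islice(iterator, size))
        if not block:
            break
        yield block

data = [1, 2, 3, 4, 5, 6, 7]
padded = list(chunk(data, 3, pad=0))
unpadded = list(chunk_unpadded(data, 3))
print(padded)    # [[1, 2, 3], [4, 5, 6], [7, 0, 0]]
print(unpadded)  # [[1, 2, 3], [4, 5, 6], [7]]

# Why iter() matters: islice on a list builds a fresh iterator each
# time, so two calls in a row return the same leading items.
first = list(itertools.islice(data, 3))
second = list(itertools.islice(data, 3))
print(first == second)  # True
```

The unpadded form keeps the short final chunk, which is the one to use
if the caller needs to see exactly where the data ends.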

Cheers,
Nick.


