getting n items at a time from a generator

Kugutsumen kugutsumen at gmail.com
Thu Dec 27 06:34:57 EST 2007


I am relatively new to the Python language and I am afraid I am missing
some clever construct or built-in equivalent to my 'chunk' generator
below.

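def chunk(size, items):
    """Yield successive lists of up to `size` items from the iterator `items`."""
    chunk = []
    count = 0
    while True:
        try:
            item = items.next()
            count += 1
        except StopIteration:
            # Source exhausted: yield the final partial chunk, but skip it
            # when it is empty (total length is a multiple of size).
            if chunk:
                yield chunk
            break
        chunk.append(item)
        if not (count % size):
            # Chunk is full: hand it out and start a new one.
            yield chunk
            chunk = []
            count = 0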

>>> t = (i for i in range(30))
>>> c = chunk(7, t)
>>> for i in c:
...     print i
...
[0, 1, 2, 3, 4, 5, 6]
[7, 8, 9, 10, 11, 12, 13]
[14, 15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
[28, 29]

In my real-world project, I have over 250 million items that are too
big to fit in memory; they are processed and later used to update
records in a database. To minimize disk I/O, I found it was more
efficient to process them in batches or "chunks" of 50,000 or so. Hence
the 'chunk' generator above.

Is this the proper way to do this?
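
One standard-library construct that may fit here is itertools.islice,
which pulls at most a given number of items from an iterator. The
following is only a minimal sketch, assuming Python 3; the name
`chunked` is hypothetical and not part of the original post. It takes
the same (size, items) arguments as 'chunk', accepts any iterable, and
never yields an empty batch.

```python
from itertools import islice


def chunked(size, items):
    """Yield successive lists of up to `size` items from any iterable.

    Hypothetical helper for illustration; not from the original post.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)  # accept lists, generators, file objects, ...
    while True:
        batch = list(islice(iterator, size))  # pull at most `size` items
        if not batch:  # source exhausted: stop without an empty batch
            return
        yield batch


# Same input as the interactive session: 30 items in chunks of 7.
batches = list(chunked(7, (i for i in range(30))))
for batch in batches:
    print(batch)
```

Only one batch is built at a time inside the generator, so a loop that
consumes it directly (rather than collecting the batches with list() as
the demo does) should keep memory use bounded by the batch size.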



