Fast pythonic way to process a huge integer list

Peter Otten __peter__ at web.de
Thu Jan 7 05:21:03 EST 2016


high5storage at gmail.com wrote:

> I have a list of 163,840 integers. What is a fast & pythonic way to
> process this list in 1,280 chunks of 128 integers?

What kind of processing do you have in mind?
If it is about number crunching, use a numpy.array, which can also easily
change its shape:

>>> import numpy
>>> a = numpy.array(range(12))
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
>>> a.shape = (3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
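
For the numbers in question that is numpy.arange(163840) reshaped to
(1280, 128), and the per-chunk work becomes a single vectorized call.
A minimal sketch, assuming the processing is something reducible like a
per-chunk sum (your actual process() is unknown):

import numpy

a = numpy.arange(163840).reshape(1280, 128)  # 1280 chunks of 128 ints
chunk_sums = a.sum(axis=1)  # one result per chunk, computed in C
print(chunk_sums.shape)     # (1280,)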

If it's really only(!) under a million integers, plain slicing is also good:

items = [1, 2, ...]
CHUNKSIZE = 128

for i in range(0, len(items), CHUNKSIZE):
    process(items[i:i + CHUNKSIZE])
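
If you need this in more than one place you can wrap the loop in a small
generator. A sketch, with process() standing in for whatever you do per
chunk, as elsewhere in this post:

def chunks(items, chunksize):
    """Yield successive chunksize-sized slices of a list."""
    for i in range(0, len(items), chunksize):
        yield items[i:i + chunksize]

items = list(range(163840))
for chunk in chunks(items, 128):
    process(chunk)  # each chunk is a plain list of at most 128 ints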

If the "list" is really huge (your system starts swapping memory), you can
go completely lazy:

from itertools import chain, islice

def chunked(items, chunksize):
    items = iter(items)
    for first in items:
        chunk = chain((first,), islice(items, chunksize-1))
        yield chunk
        for dummy in chunk:  # drain whatever the caller left unconsumed,
                             # so the next chunk starts at the right item
            pass

def produce_items(file):
    for line in file:
        yield int(line)

CHUNKSIZE = 128  # this could also be "huge" 
                 # without affecting memory footprint

with open("somefile") as file:
    for chunk in chunked(produce_items(file), CHUNKSIZE):
        process(chunk)
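
A quick sanity check with a small range; each chunk is itself an iterator,
and because chunked() drains whatever the consumer leaves behind, taking
only part of a chunk still keeps the chunking aligned:

>>> [sum(chunk) for chunk in chunked(range(10), 4)]
[6, 22, 17]
>>> for chunk in chunked(range(10), 4):
...     print(next(chunk))  # take only the first item of each chunk
...
0
4
8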



