chunking a long string?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Nov 8 19:54:13 EST 2013


On Fri, 08 Nov 2013 12:48:12 -0500, Roy Smith wrote:

> I have a long string (several Mbytes).  I want to iterate over it in
> manageable chunks (say, 1 kbyte each).  For (a small) example, if I
> started with "this is a very long string", and I wanted 10 character
> chunks, I should get:
> 
> "this is a "
> "very long "
> "string"
> 
> This seems like something itertools would do, but I don't see anything. 
> Is there something, or do I just need to loop and slice (and worry about
> getting all the edge conditions right) myself?

What edge conditions? Should be trivially easy to loop and slice:

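def grouper(string, size):
    # Yield consecutive slices of at most `size` characters.
    i = 0
    while i < len(string):  # strict "<" avoids an empty trailing chunk
        yield string[i:i+size]
        i += size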
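
For what it's worth, here is a minimal sketch of the same loop-and-slice idea written with range() and a step, assuming Python 3. The name chunks is only illustrative. Because range() stops before len(string), the only edge case (a trailing empty chunk) does not arise, and the last chunk is simply shorter when the length is not a multiple of the size.

```python
def chunks(string, size):
    """Yield successive slices of `string`, each at most `size` characters long."""
    # range() never reaches len(string), so no empty final chunk is produced.
    for start in range(0, len(string), size):
        yield string[start:start + size]


pieces = list(chunks("this is a very long string", 10))
print(pieces)  # ['this is a ', 'very long ', 'string']
```

An empty input yields no chunks at all, which is probably what you want when iterating.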


But if you prefer, there is a recipe in the itertools documentation to 
solve this problem for you:

http://docs.python.org/2/library/itertools.html#recipes

It's short enough to reproduce here.

from itertools import izip_longest
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

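(''.join(chunk) for chunk in grouper(your_string, 10, ''))

ought to give you the results you want. Note that grouper itself yields
tuples of single characters rather than strings, hence the join.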
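
On Python 3, izip_longest is spelled zip_longest. A minimal sketch of the recipe under that assumption follows, with each group joined back into a string. The wrapper name string_chunks is hypothetical and not part of the itertools documentation.

```python
from itertools import zip_longest  # Python 3 spelling of izip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # The same iterator repeated n times, so each tuple consumes n items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def string_chunks(string, size):
    """Hypothetical wrapper: join each tuple of characters back into a string."""
    # Padding with '' means the join leaves the final chunk short, not padded.
    for group in grouper(string, size, ''):
        yield ''.join(group)


print(list(string_chunks("this is a very long string", 10)))
# ['this is a ', 'very long ', 'string']
```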


I expect (but haven't tested) that for strings, the slice version will be 
faster.
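
If you want to check that expectation yourself, something along these lines should do. It is only a sketch, assuming Python 3, with made-up function names and an arbitrary 1,000,000-character test string; the numbers will vary by machine and Python version, so none are quoted here.

```python
import timeit
from itertools import zip_longest


def slice_chunks(string, size):
    # Loop-and-slice version.
    for start in range(0, len(string), size):
        yield string[start:start + size]


def grouper_chunks(string, size):
    # itertools-recipe version, joining each tuple back into a string.
    args = [iter(string)] * size
    for group in zip_longest(*args, fillvalue=''):
        yield ''.join(group)


data = "x" * 1_000_000  # arbitrary test input

# Both approaches should produce identical chunks before we compare speed.
assert list(slice_chunks(data, 1024)) == list(grouper_chunks(data, 1024))

slice_time = timeit.timeit(lambda: list(slice_chunks(data, 1024)), number=5)
grouper_time = timeit.timeit(lambda: list(grouper_chunks(data, 1024)), number=5)

print("slice:   %.4f s" % slice_time)
print("grouper: %.4f s" % grouper_time)
```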


-- 
Steven


