How to iterate the input over a particular size?

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Tue Dec 29 01:35:41 EST 2009


On Tue, 29 Dec 2009 00:49:50 -0500, John Posner wrote:

> On Sun, 27 Dec 2009 09:44:17 -0500, joy99 <subhakolkata1234 at gmail.com>
> wrote:
> 
>> Dear Group,
>>
>> I am encountering a small question.
>>
>> Suppose, I write the following code,
>>
>> input_string=raw_input("PRINT A STRING:")
>> string_to_word=input_string.split()
>> len_word_list=len(string_to_word)
>> if len_word_list>9:
>>              rest_words=string_to_word[9:]
>>              len_rest_word=len(rest_words)
>>              if len_rest_word>9:
>>                       remaining_words=rest_words[9:]
>>
>>
> Here's an issue that has not, I think, been addressed in this thread.
> The OP's problem is:
> 
> 1. Start with an indefinitely long string.
> 
> 2. Convert the string to a list, splitting on whitespace.
> 
> 3. Repeatedly return subslices of the list, until the list is exhausted.
> 
> This thread has presented one-chunk-at-a-time (e.g. generator/itertools)
> approaches to Step #3, but what about Step #2? I've looked in the Python
> documentation, and I've done some Googling, but I haven't found a
> generator version of the string function split(). Am I missing
> something?

"Indefinitely long" doesn't mean you can't use split.
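
As a rough sketch of that point: split the whole string eagerly, then walk the resulting list nine words at a time. The name chunks_of and the sample text are made up for illustration; nine is the slice size from the original question.

```python
def chunks_of(words, size=9):
    """Yield successive slices of `words`, each at most `size` items long."""
    for start in range(0, len(words), size):
        yield words[start:start + size]

# Hypothetical sample input; any string would do.
text = "one two three four five six seven eight nine ten eleven"
for chunk in chunks_of(text.split()):
    print(chunk)
```

Only the slicing is lazy here; split() still builds the full word list up front.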

But if you want a lazy splitter, here's a version which should do what 
you want:


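import string

def lazy_split(text):
    """Yield whitespace-separated words from text, one at a time."""
    accumulator = []
    for c in text:
        if c in string.whitespace:
            # End of a word: emit it, unless we are inside a run of whitespace.
            if accumulator:
                yield ''.join(accumulator)
                accumulator = []
        else:
            accumulator.append(c)
    # Emit the final word if the text does not end in whitespace.
    if accumulator:
        yield ''.join(accumulator)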


Other alternatives are to use a regex to find runs of whitespace 
characters, then yield everything else; or to use the itertools.groupby 
function.
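
For what it's worth, here is roughly how those two alternatives might look. These are untuned sketches, and the function names are invented for illustration. Note that the regex version matches the runs of non-whitespace directly rather than finding the whitespace and yielding what lies between, which amounts to the same thing with less bookkeeping.

```python
import re
from itertools import groupby

def lazy_split_re(text):
    """Yield each run of non-whitespace characters, found lazily by regex."""
    for match in re.finditer(r'\S+', text):
        yield match.group()

def lazy_split_groupby(text):
    """Group consecutive characters by whether they are whitespace."""
    for is_space, run in groupby(text, key=str.isspace):
        # Keep only the groups made of non-whitespace characters.
        if not is_space:
            yield ''.join(run)
```

Both still need the whole string in memory; what they avoid is building the complete list of words before the first one is handed back.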




-- 
Steven


