How to read from a file to an arbitrary delimiter efficiently?

Steven D'Aprano steve at pearwood.info
Sat Feb 27 04:49:43 EST 2016


On Thu, 25 Feb 2016 06:30 pm, Chris Angelico wrote:

> On Thu, Feb 25, 2016 at 5:50 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>>
>> # Yield delimiter-terminated chunks of bytes/characters from an open file.
>> def chunkiter(f, delim):
>>     empty = f.read(0)  # '' in text mode, b'' in binary mode
>>     buffer = []
>>     b = f.read(1)
>>     while b:
>>         buffer.append(b)
>>         if b in delim:
>>             yield empty.join(buffer)
>>             buffer = []
>>         b = f.read(1)
>>     if buffer:
>>         yield empty.join(buffer)
> 
> How bad is it if you over-read? 

Pretty bad :-)

Ideally, I'd rather not over-read at all. I'd like the user to be able to
swap from "read N bytes" to "read to the next delimiter" (and possibly
even "read the next line") without losing anything.
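
For binary files opened through io.BufferedReader, one way to avoid
over-reading might be peek(), which returns already-buffered bytes without
advancing the file position. The following is only a sketch under that
assumption: read_until is a hypothetical helper name, delims is assumed to
be a bytes object of single-byte delimiters, and it does not apply to
text-mode files, which have no peek().

```python
import io

def read_until(f, delims):
    """Read from f up to and including the first byte found in delims.

    Assumes f is an io.BufferedReader and delims is a bytes object of
    single-byte delimiters. Returns b'' at end of file.
    """
    parts = []
    while True:
        window = f.peek(1)          # look at buffered bytes, consume nothing
        if not window:              # empty result means end of file
            break
        # Iterating over bytes yields ints, which bytes.find() accepts.
        hits = [i for i in (window.find(d) for d in delims) if i != -1]
        if hits:
            # Consume only up to and including the earliest delimiter.
            parts.append(f.read(min(hits) + 1))
            break
        # No delimiter in view: consume exactly what was peeked, then retry.
        parts.append(f.read(len(window)))
    return b''.join(parts)
```

Because only the bytes up to the delimiter are consumed, a following
f.read(n) or f.readline() on the same file object continues from the byte
right after it.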


If there's absolutely no other way to speed this up by at least a factor of
ten, I'll consider reading into a buffer and losing the ability to mix
different kinds of reads.
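
As a sketch of that fallback: if the read-ahead buffer lives in a wrapper
object, the different kinds of reads can still be mixed as long as every
read goes through the wrapper rather than the underlying file. DelimReader,
read_until and chunksize are hypothetical names invented for this
illustration, not an existing API.

```python
import io

class DelimReader:
    """Hypothetical wrapper that owns the read-ahead buffer."""

    def __init__(self, f, chunksize=8192):
        self._f = f
        self._chunksize = chunksize
        self._buf = f.read(0)       # '' or b'', matching the file's mode

    def _fill(self):
        # Pull one more chunk from the file; return False at end of file.
        data = self._f.read(self._chunksize)
        self._buf += data
        return bool(data)

    def read(self, n):
        # Return up to n items, refilling the buffer as needed.
        while len(self._buf) < n and self._fill():
            pass
        out, self._buf = self._buf[:n], self._buf[n:]
        return out

    def read_until(self, delims):
        # Return everything up to and including the first delimiter,
        # or whatever is left if the file ends first.
        start = 0
        while True:
            hits = [i for i in (self._buf.find(d, start) for d in delims)
                    if i != -1]
            if hits:
                return self.read(min(hits) + 1)
            start = len(self._buf)  # skip the part already searched
            if not self._fill():
                return self.read(start)
```

The cost is the one described above: the underlying file's own position
runs ahead of what the caller has seen, so calling f.read() or f.tell()
directly on the wrapped file no longer gives meaningful results.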




-- 
Steven



