Lazy file.readlines()?

meow meowing at banet.net
Wed Sep 15 21:07:32 EDT 1999


In nnmh:python, William Tanksley <wtanksle at dolphin.openprojects.net> wrote:

> I wonder -- suppose the official Python were to replace the current
> readlines() with this one (or with something similar).  Would it be
> possible to preserve all the current behavior (except for the memory
> usage, of course)?

It's tough.  I've been trying to do that in my own attempt, and the
performance hit of keeping the seek positions synchronized makes it
not worth the trouble (IMHO) to support writing, or to let more than
one object read from the same descriptor.
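
To illustrate where that cost comes from, here is a minimal sketch (not
the QIO code; the class name SyncedLineReader and its layout are made up
for this example, and it assumes a seekable POSIX-style descriptor).  A
reader that buffers ahead has to check the descriptor's offset before
every line and put it back afterwards, so each readline() pays for at
least two extra lseek() calls:

```python
import os


class SyncedLineReader:
    """Hypothetical buffered reader that tolerates a shared descriptor."""

    def __init__(self, fd, bufsize=8192):
        self.fd = fd
        self.bufsize = bufsize
        self.buf = b""
        # Logical position: offset of the first unread byte.
        self.pos = os.lseek(fd, 0, os.SEEK_CUR)

    def readline(self):
        # Extra syscall 1: did someone else move the descriptor?
        here = os.lseek(self.fd, 0, os.SEEK_CUR)
        if here != self.pos:
            # Our read-ahead is stale; throw it away and resync.
            self.buf = b""
            self.pos = here
        while True:
            i = self.buf.find(b"\n")
            if i >= 0:
                line, self.buf = self.buf[:i + 1], self.buf[i + 1:]
                break
            # Refill from just past what is already buffered.
            os.lseek(self.fd, self.pos + len(self.buf), os.SEEK_SET)
            chunk = os.read(self.fd, self.bufsize)
            if not chunk:
                # EOF: return whatever is left (possibly empty).
                line, self.buf = self.buf, b""
                break
            self.buf += chunk
        self.pos += len(line)
        # Extra syscall 2: leave the descriptor at the logical position
        # so other readers of the same descriptor see a sane offset.
        os.lseek(self.fd, self.pos, os.SEEK_SET)
        return line
```

Two such objects on one descriptor can then take turns reading lines
without skipping any, but the read-ahead gets discarded whenever they
alternate, which is most of what buffering was supposed to buy.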

If a read-only object is good enough, a cleaned-up version of the QIO
hack thingy I mentioned a few days ago is now uploaded.  I think I've
got its methods, attributes and exceptions down to a faithful
reproduction of Python's native file object now (except for writing,
of course).  readinto() is absent for now since it's undocumented.  It
can go in if someone can supply some Python k0dez showing what buffer
objects are good for =)

Also added some convenience features like user-specified line
terminators, automatic chopping, and a way to read SMTP-like sockets
wicked fast.
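
For a rough idea of what lazy reading with a user-specified terminator
and automatic chopping means, here is a small sketch.  It is not QIO's
interface: the function name iter_lines and its parameters are invented
for illustration, and it assumes a Python with generators and a file
object opened in binary mode.

```python
def iter_lines(f, terminator=b"\n", chop=False, bufsize=8192):
    """Yield lines from f one at a time instead of building a full list.

    terminator: byte string that ends a line (e.g. b"\r\n" for SMTP-like data)
    chop:       if true, strip the terminator from each yielded line
    """
    buf = b""
    while True:
        chunk = f.read(bufsize)
        if not chunk:
            break
        buf += chunk
        start = 0
        while True:
            end = buf.find(terminator, start)
            if end < 0:
                break
            stop = end if chop else end + len(terminator)
            yield buf[start:stop]
            start = end + len(terminator)
        # Keep the unterminated tail; a terminator split across two
        # chunks is found once the next chunk has been appended.
        buf = buf[start:]
    if buf:
        # Final line with no terminator at end of file.
        yield buf
```

Only one buffer's worth of data plus the current partial line is held
in memory at a time, which is the whole point of a lazy readlines().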

After adding the requisite bloat, vanilla qio.readline() is still
about 5-8 times as fast as native Python looping through big text
files (where stdio is glibc's).  readlines() and read() are not big
wins (and in pathological cases they're slower), but they're there for
completeness.

Details at <URL:http://members.xoom.com/meowing/python/>

Meow.



