mixing for x in file: and file.readline
Russell E. Owen
owen at astro.washington.edu
Fri Sep 12 13:57:47 EDT 2003
In article <mailman.1063364898.12243.python-list at python.org>,
Oren Tirosh <oren-py-l at hishome.net> wrote:
>On Thu, Sep 11, 2003 at 01:54:53PM -0700, Russell E. Owen wrote:
>> At one time, mixing for x in file and readline was dangerous. For
>> example:
>>
>> for line in file:
>>     # read some lines from a file, then break
>> nextline = file.readline() # bad
>>
>> would not do what a naive user might expect because the file iterator
>> buffered data and readline did not read from that buffer. Hence the call
>> to readline might unexpectedly skip some lines...
(Oren points out that it's still a problem in Python 2.3 and after some
interesting and gory detail goes on to say...)
>Really fixing it amounts to reimplementing the entire I/O layer of
>Python with a different strategy and thoroughly testing on multiple
>platforms.
>
>It's possible to hide the problem in most cases by making read and
>readline use the iteration readahead buffer if it's attached to the file
>object and stdio if it isn't. I don't think it's a good idea. It will
>require some hairy code and seems susceptible to subtle bugs and
>corner cases.
I agree that fixing read would probably be too messy to justify.
But it seems to me that a simple reimplementation of readline() would
work fine:
    def readline(self):
        try:
            return self.next()
        except StopIteration:
            return ""
That's basically the way I ended up working around the problem (but I
didn't try to modify any classes). I do see two issues with that fix:
- existing code (if any) that mixes readline and read would be harmed
- it may not be efficient enough (even implemented in C)
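To make the workaround concrete, here is a minimal sketch of the same idea as a standalone wrapper instead of a change to the file class. It uses Python 3 spelling (__next__ and next() rather than self.next()), and the name LineReader is made up for illustration. The point is only that readline() and the for loop draw from the same iterator, so no lines can be skipped:

```python
import io


class LineReader:
    """Hypothetical wrapper: readline() pulls from the same iterator as the for loop."""

    def __init__(self, lines):
        # Any iterable of lines will do, e.g. an open text file.
        self._lines = iter(lines)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._lines)

    def readline(self):
        # Same contract as file.readline(): an empty string signals end of file.
        try:
            return next(self)
        except StopIteration:
            return ""


reader = LineReader(io.StringIO("a\nb\nc\n"))
for line in reader:
    break  # stop after the first line
nextline = reader.readline()  # continues where the loop stopped
```

Note that this sketch offers no read() at all, which sidesteps the first issue above rather than solving it.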
>Another alternative is to make read and readline fail noisily after
>iteration starts (unless cleared by seek())
If readline cannot be fixed, this might be worth doing since I think
it's a common thing to want to mix readline and iteration. If read is
the only issue, I suspect adding a warning to the documentation for file
method "read" would suffice.
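For what it's worth, the "fail noisily" behavior Oren describes could be sketched roughly as follows. This is only an illustration in Python 3 spelling, not how the real file object works; the class name GuardedFile, the choice of ValueError, and the wording of the message are all assumptions of mine:

```python
import io


class GuardedFile:
    """Hypothetical wrapper: read()/readline() raise once iteration has started."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._iterating = False

    def __iter__(self):
        return self

    def __next__(self):
        # Remember that iteration has begun, then delegate to the real file.
        self._iterating = True
        return next(self._file)

    def _check(self, name):
        if self._iterating:
            raise ValueError(
                "%s() called after iteration started; call seek() first" % name
            )

    def read(self, size=-1):
        self._check("read")
        return self._file.read(size)

    def readline(self):
        self._check("readline")
        return self._file.readline()

    def seek(self, offset, whence=0):
        # An explicit seek clears the flag, as suggested in the quoted text.
        self._iterating = False
        return self._file.seek(offset, whence)


guarded = GuardedFile(io.StringIO("a\nb\nc\n"))
first = next(guarded)
try:
    guarded.readline()
    raised = False
except ValueError:
    raised = True
guarded.seek(0)
after_seek = guarded.readline()
```

With this, the mistake shows up as an immediate exception instead of silently skipped lines, at the cost of breaking any code that currently mixes the two styles and happens to work.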
I'm wondering where the problem is discussed in the manual. I'm pretty
sure I saw it recently, but when I read about file methods I saw nothing
about it.
-- Russell