Warning about "for line in file:"

Brian Kelley bkelley at wi.mit.edu
Fri Feb 15 15:47:26 EST 2002


Neil Schemenauer wrote:

> Russell E. Owen wrote:
> 
>>The readline or xreadline file methods work fine, of course.
>>
> 
> Why "of course"?  iter(file) does the same thing as file.xreadlines().
> Have you tested xreadlines?
> 
>   Neil
> 
> 


I had the same problem with xreadlines, but "for line in file" is MUCH 
less explicit and leads to errors like this.

file = open(...)

count = 0
for line in file:
     if count > 10: break
     print line
     count = count + 1

for line in file:
     print line

This doesn't work like I would expect: the second loop does not simply 
continue where the first one stopped.  It is essentially doing the following:

file = open(...)

count = 0
for line in file.xreadlines():
     if count > 10: break
     print line
     count = count + 1

for line in file.xreadlines():
     print line

So what is REALLY happening is that you are creating two separate 
iterators in the above examples.  Writing "for line in file" instead of 
"for line in file.xreadlines()" simply hides and confuses this.

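The problem with spawning multiple iterators is that there is a read 
cache going on behind the scenes, and file.xreadlines() doesn't rewind 
the file to the starting point.  Lines the first iterator has already 
cached but not yet handed out are never seen by the second one.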
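
To make the read-cache problem concrete, here is a minimal sketch in 
present-day Python.  ReadAheadLines is a hypothetical stand-in for an 
xreadlines-style iterator, not the real implementation, and the batch 
size of 4 is an assumption chosen only to keep the demo small.

```python
import io


class ReadAheadLines:
    """Hypothetical read-ahead iterator: pulls lines from the stream in
    batches and hands them out one at a time from a private cache."""

    def __init__(self, stream, batch=4):
        self.stream = stream
        self.batch = batch
        self.cache = []

    def __iter__(self):
        return self

    def __next__(self):
        if not self.cache:
            # Refill the private cache; this advances the shared stream.
            for _ in range(self.batch):
                line = self.stream.readline()
                if not line:
                    break
                self.cache.append(line)
            if not self.cache:
                raise StopIteration
        return self.cache.pop(0)


stream = io.StringIO("".join("line %d\n" % i for i in range(10)))

first = ReadAheadLines(stream)
seen_first = [next(first), next(first)]   # hands out lines 0-1, caches 0-3

second = ReadAheadLines(stream)           # new iterator, same stream
seen_second = list(second)                # starts at line 4

print(seen_first)
print(seen_second)   # lines 2 and 3 are stuck in the first iterator's cache
```

The second iterator picks up at the stream's current position, which is 
past everything the first one cached, so the cached lines are lost.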

The behavior that I would expect is exemplified in the following correct 
example that actually prints out all lines of a file.

file = open(...)

count = 0
iterator = iter(file)
for line in iterator:
     print line
     count = count + 1
     # break only after printing, so no line is read and then dropped
     if count > 10: break

for line in iterator:
     print line
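
The same shared-iterator idea can be sketched with itertools.islice, 
assuming a Python version that ships the itertools module.  The 
io.StringIO object and the line counts are stand-ins for a real file, 
used only so the sketch runs on its own.

```python
import io
from itertools import islice

# Stand-in for an open file with six lines.
stream = io.StringIO("".join("line %d\n" % i for i in range(6)))

lines = iter(stream)             # one iterator, created once and reused

head = list(islice(lines, 3))    # takes exactly the first three lines
rest = list(lines)               # continues from the fourth line

print(head)
print(rest)
```

Because both steps draw from the same iterator, and islice takes exactly 
the requested number of items, every line shows up in one list or the 
other.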

So from my point of view, unless you are reading the whole file in a 
single loop, don't use "for line in file".

Brian Kelley
Whitehead Institute for Biomedical Research




