Basic file operation questions

Steven Bethard steven.bethard at gmail.com
Thu Feb 3 16:10:08 EST 2005


Caleb Hattingh wrote:
> Peter
> 
>> Yes, you can even write
>>
>> f = open("data.txt")
>> for line in f:
>>     # do stuff with line
>> f.close()
>>
>> This has the additional benefit of not slurping in the entire file at  
>> once.
> 
> Is there disk access on every iteration?   I'm guessing yes?  It 
> shouldn't  be an issue in the vast majority of cases, but I'm naturally 
> curious :)

Short answer:
No, it's buffered.

Long answer:
This buffer is actually what causes the problems in interactions between 
uses of the next method and readline, seek, etc:

py> f = file('temp.txt')
py> for line in f:
... 	print line,
... 	break
...
line 1
py> f.read()
''
py> for line in f:
... 	print line,
...
line 2
line 3

Using the iteration protocol (specifically, when file.next is called) 
causes the file object to read part of the file into a buffer for the 
iterator.  The read method doesn't use that buffer.  It sees that 
(because the file is so small) the underlying file position is already 
at the end of the file, so it returns '' to signal that the entire file 
has been read, even though we have not finished iterating.  The 
iterator, however, still holds the buffered data and can complete its 
iteration.

The moral of the story is that, in general, you should only use the file 
as an iterator after you are done calling read, readline, etc. unless 
you want to keep track of the file position and do an appropriate 
file.seek() call after each use of the iterator.
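
As a rough illustration of both options, here is a minimal sketch.  It 
is written for a current Python rather than the interpreter used in the 
session above, and the sample file and the helper names (make_sample, 
reads_then_iterate, iterate_then_seek) are made up for this example. 
The file is opened in binary mode so that the hand-counted position is 
a plain byte offset.

```python
import os
import tempfile


def make_sample():
    """Write a hypothetical three-line file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(b'line 1\nline 2\nline 3\n')
    return path


def reads_then_iterate(path):
    """Option 1: finish the readline() calls, then start iterating."""
    f = open(path, 'rb')
    try:
        first = f.readline()           # explicit read comes first
        rest = [line for line in f]    # iteration only afterwards
    finally:
        f.close()
    return first, rest


def iterate_then_seek(path):
    """Option 2: track the position by hand and seek after iterating."""
    f = open(path, 'rb')
    try:
        pos = 0
        for line in f:
            pos += len(line)   # bytes the loop has actually consumed
            break
        f.seek(pos)            # put the file position where the loop stopped
        remainder = f.read()   # read() now continues from that point
    finally:
        f.close()
    return line, remainder


path = make_sample()
first, rest = reads_then_iterate(path)
line, remainder = iterate_then_seek(path)
os.remove(path)
```

In the second function the seek is what keeps read() from starting at 
wherever the iterator's read-ahead left the underlying file position.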

Steve


