Basic file operation questions
Steven Bethard
steven.bethard at gmail.com
Thu Feb 3 16:10:08 EST 2005
Caleb Hattingh wrote:
> Peter
>
>> Yes, you can even write
>>
>> f = open("data.txt")
>> for line in f:
>>     # do stuff with line
>> f.close()
>>
>> This has the additional benefit of not slurping in the entire file at
>> once.
>
> Is there disk access on every iteration? I'm guessing yes? It
> shouldn't be an issue in the vast majority of cases, but I'm naturally
> curious :)
Short answer:
No, it's buffered.
Long answer:
This buffer is actually what causes the problems in interactions between
uses of the next method and readline, seek, etc:
py> f = file('temp.txt')
py> for line in f:
...     print line,
...     break
...
line 1
py> f.read()
''
py> for line in f:
...     print line,
...
line 2
line 3
Using the iteration protocol (specifically, when file.next is called)
causes the file object to read part of the file into a buffer for the
iterator. The read method doesn't use that buffer; it sees that
(because the file is so small) the underlying file position is already
at the end of the file, so it returns '' to signal that the entire file
has been read, even though we have not finished iterating. The iterator,
however, which has access to the buffer, can still complete its iteration.
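
For contrast, here is a rough sketch of a loop that never calls file.next and so never fills the iterator's buffer. It drives the loop with readline through the two-argument form of the iter builtin, which means a later read simply picks up where readline stopped. The file name and contents are made up for illustration, and open is used in place of file so the snippet also runs on later Python versions.

```python
import os
import tempfile

# Hypothetical three-line file, mirroring the temp.txt used above.
path = os.path.join(tempfile.mkdtemp(), 'temp.txt')
with open(path, 'w') as out:
    out.write('line 1\nline 2\nline 3\n')

f = open(path)
# iter(callable, sentinel) calls f.readline() until it returns ''.
# file.next is never called, so no iterator read-ahead buffer is filled.
for line in iter(f.readline, ''):
    first = line
    break
# read() continues from where readline() stopped.
rest = f.read()
f.close()
```

Here first holds only the first line and rest holds the remaining two, so nothing is lost between the loop and the read call.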
The moral of the story is that, in general, you should only use the file
as an iterator after you are done calling read, readline, etc., unless
you want to keep track of the file position yourself and do an
appropriate file.seek() call after each use of the iterator.
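
If you do want to mix the two, one way to do the bookkeeping might look like the sketch below. It is only an illustration under my own assumptions: the file name and contents are invented, and the file is opened in binary mode so that the length of each line is an exact byte count. The position is tracked by hand, adding up the lines the loop actually consumed, and the file is then seeked back to that point before read is called.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'temp.txt')
with open(path, 'wb') as out:
    out.write(b'line 1\nline 2\nline 3\n')

# Binary mode, so len(line) is an exact byte count on every platform.
f = open(path, 'rb')
pos = f.tell()          # position before the iterator is used
for line in f:
    pos += len(line)    # bytes actually consumed by the loop so far
    break
f.seek(pos)             # realign the file with what the loop consumed
rest = f.read()
f.close()
```

After the seek, read returns the two lines the loop had not yet handed out, rather than ''.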
Steve