Simple looping question...

David Allen mda at idatar.com
Mon Apr 2 23:14:30 EDT 2001


In article <mailman.986258884.3396.python-list at python.org>, "Ken Seehof"
<kens at sightreader.com> wrote:
>>>> f = open(r'd:\qp\var\publish\index.html')
>>>> z = f.readlines()
>>>> type(z)
> <type 'list'>
> 
> It would appear that readlines() returns a list.  Therefore the entire file
> is read in to create that list before returning. On the other hand, I have
> written a generator class that does what you are saying.  In order for
> readlines() to be smart, it would have to return a generator instead of a
> list.

I didn't mean that case.  It's clear that
foo = file.readlines()
creates a list and puts it in foo.  I was talking
about the case of:

for line in file.readlines():
  print line

There, Python has two choices.  It could either
create a list holding the entire file and loop
through it, assigning each element to line on each
pass.  Or it could skip allocating a list at all
and treat that statement as the equivalent of a
long series of file.readline() calls.
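
Spelled out, the two strategies look roughly like the sketch below.
It assumes a current Python 3 interpreter rather than the Python of
this thread, and the function names count_via_list and
count_via_readline are made up for illustration.  Neither function
is meant to show what the interpreter does internally; they only
make the two alternatives concrete.

```python
import io


def count_via_list(f):
    """Strategy 1: build the full list of lines first, then walk it."""
    lines = f.readlines()        # every line of the file is held here
    count = 0
    for line in lines:
        count += 1
    return count


def count_via_readline(f):
    """Strategy 2: no list up front, just repeated readline() calls."""
    count = 0
    while True:
        line = f.readline()
        if not line:             # an empty string signals end of file
            break
        count += 1
    return count


# io.StringIO stands in for a real file so the sketch is self-contained.
sample = "first\nsecond\nthird\n"
print(count_via_list(io.StringIO(sample)))
print(count_via_readline(io.StringIO(sample)))
```

Both functions see the same lines in the same order; the only
difference is whether all of them exist in memory at once.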

What I meant was that I think the second option is
how Python handles it when file.readlines() is the
thing being iterated over in a for loop.

I don't have code to prove that - it's just that
I've used this type of construct:

for line in file.readlines():
  # Do something with line

in several programs that get fed 80 MB data files,
and I've never noticed Python's memory usage
reaching 80 MB, which is what you would expect
if the interpreter actually allocated a list to
keep the whole file in memory.
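
One way to test this instead of watching the process from outside
is to measure peak allocation for each style of loop.  The sketch
below assumes Python 3.4 or later, since it relies on the standard
tracemalloc module, which did not exist when this thread was
written.  It reports memory allocated through Python's own
allocator, not the process size the operating system shows.  The
helper names (peak_bytes, via_readlines, via_readline,
make_test_file) and the roughly 2 MB test file are hypothetical
stand-ins, not the 80 MB files mentioned above.

```python
import os
import tempfile
import tracemalloc


def peak_bytes(loop, path):
    """Run loop(f) on the file at path; return peak traced bytes."""
    tracemalloc.start()
    try:
        with open(path) as f:
            loop(f)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()       # also clears traces for the next run
    return peak


def via_readlines(f):
    for line in f.readlines():
        pass                     # do something with line


def via_readline(f):
    while True:
        line = f.readline()
        if not line:             # empty string signals end of file
            break


def make_test_file(n_lines=20000, width=100):
    """Write a throwaway text file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as out:
        for _ in range(n_lines):
            out.write("x" * width + "\n")
    return path


path = make_test_file()
try:
    print("readlines():", peak_bytes(via_readlines, path), "bytes at peak")
    print("readline(): ", peak_bytes(via_readline, path), "bytes at peak")
finally:
    os.remove(path)
```

Comparing the two printed numbers on a file of known size shows
directly whether the readlines() loop holds the whole file at once.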

-- 
David Allen
http://opop.nols.com/
----------------------------------------
Genius may have its limitations, but stupidity is not thus 
handicapped. 
- Elbert Hubbard


