Difference between readlines() and iterating on a file object?

Christopher T King squirrel at WPI.EDU
Fri Aug 13 10:58:08 EDT 2004


On Fri, 13 Aug 2004, Richard wrote:

> Can anyone tell me what the difference is between
> 
> for line in file.readlines( ):
> 
> and
> 
> for line in file:
> 
> where file is a file object returned from an open( ) call?

The first form slurps every line in the file into a list, and then goes 
through each item in the list in turn.

The second form skips the middleman, and simply goes through each line of 
the file in turn (no interim list is created).  In this context, file is 
acting as its own iterator.  Because the full list is never built, this 
form uses far less memory on large files and is usually at least as fast, 
which makes it generally preferable to .readlines().
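
A minimal sketch of the two forms side by side, assuming Python 3 and a 
throwaway temporary file (the file contents and variable names are made up 
for illustration):

```python
import os
import tempfile

# Write a small sample file (hypothetical contents, for illustration only).
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Form 1: readlines() builds the complete list before the loop starts.
with open(path) as f:
    all_lines = f.readlines()
    from_list = [line for line in all_lines]

# Form 2: loop over the file object itself, one line at a time.
with open(path) as f:
    # A file object is its own iterator, so iter(f) hands back f.
    is_own_iterator = iter(f) is f
    from_iter = [line for line in f]

os.remove(path)
```

Both loops see the same lines in the same order; the only difference is 
that the first one holds the whole file in memory as a list first.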

> I thought that they did the same thing, but the code I am using it in has
> this line called more than once on the same file object and the second time
> it is run gives different results for each.

Assuming you don't prematurely exit the for loop or access the file in 
another manner while looping, both forms should give identical results.  
Otherwise...

> What is the difference in implementation?

Because the first form slurps everything in at once, every call to it
after the first (with no intervening seek()) will return an empty list, 
whether the for loop was stopped prematurely or not.
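
A short sketch of that behaviour, assuming Python 3; io.StringIO stands in 
here for a file returned by open(), and the sample text is made up:

```python
import io

# io.StringIO is used as a stand-in for a real file object (assumption).
f = io.StringIO("one\ntwo\nthree\n")

seen = []
for line in f.readlines():   # the whole file is read before the loop runs
    seen.append(line)
    break                    # leaving early does not "give back" any lines

second = f.readlines()       # file position is already at end of file

f.seek(0)                    # rewind to the start
third = f.readlines()        # now the lines are available again
```

Here second is an empty list even though the loop only looked at one 
line, and third contains all three lines again after the seek(0).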

On the other hand, since the second form only reads one line at a time
(using file.next()), if the for loop is stopped prematurely (e.g. via 
break), subsequent invocations will pick up right where the previous one 
left off.
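
The resuming behaviour can be sketched like this, again assuming Python 3 
and using io.StringIO as a stand-in for a file from open() (the contents 
and the break condition are made up for illustration):

```python
import io

# io.StringIO is used as a stand-in for a real file object (assumption).
f = io.StringIO("one\ntwo\nthree\nfour\n")

first_pass = []
for line in f:
    first_pass.append(line)
    if line.startswith("two"):
        break                # stop early; the rest of the file is unread

# A second loop over the same object continues from the current position.
second_pass = [line for line in f]
```

The first loop collects "one" and "two"; the second loop collects "three" 
and "four" rather than starting over from the top.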

Hope this helps.



