iterating over lines in a file

Donn Cave donn at u.washington.edu
Fri Jul 21 12:44:20 EDT 2000


Quoth David Bolen <db3l at fitlinxx.com>:
| cjc26 at nospam.cornell.edu (Cliff Crawford) writes:
|> readlines() has an optional argument which specifies the approximate
|> number of bytes to read in at a time, rather than the entire file.
|> So something like
|> 
|> for line in file.readlines(8192):
|>     # process line
|> 
|> would only use about 8k of memory.
|
| And only fully process files less than 8K in size.  The call to
| file.readlines(8192) returns the list of lines contained within the
| first 8K of the file (approximately), and that's all the 'for' is
| going to iterate over.  You have to repeatedly call file.readlines()
| again to keep reading the file, which puts you pretty much back in the
| original readline() mode, just with bigger chunks.

Sure, but it's still a useful idiom when the expected file is under
ca. 500 bytes in any sane case.  The upper limit keeps the potential
insane case (a runaway or unexpectedly huge file) from wiping out the
program by exhausting memory.  (The numbers are arbitrary.)  It's not
an approach we would always want to take, but there could be a place
for it.
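
A minimal sketch of the two usages discussed above, assuming a
present-day Python where readlines() takes an approximate size hint.
The helper names read_capped and iter_chunked are made up for
illustration, and the hint is only approximate, so the cap is a soft
limit rather than an exact byte count.

```python
def read_capped(f, limit=8192):
    """Return only the lines found in roughly the first `limit` bytes.

    Hypothetical helper: a single readlines() call, so anything past
    the hint is deliberately left unread (the "insane case" guard).
    """
    return f.readlines(limit)


def iter_chunked(f, hint=8192):
    """Yield every line of f, reading roughly `hint` bytes per call.

    Hypothetical helper: repeats readlines() until it returns an
    empty list, which signals end of file.
    """
    while True:
        lines = f.readlines(hint)
        if not lines:
            break
        for line in lines:
            yield line
```

read_capped suits the small-file case, where truncating an oversized
input is acceptable; iter_chunked is the repeated-call loop for when
the whole file has to be processed.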

	Donn Cave, donn at u.washington.edu


