large file support

Bernhard Herzog herzog at online.de
Sat Oct 21 07:03:34 EDT 2000


fred at balzac.scd.ucar.edu (Fred Clare) writes:

> How does one read large files into Python?  I have a 4 gigabyte
> file on an SGI (that has over 30 GB main memory), but when I use
> 
>   f = open("file","rb")
>   s = f.read()
> 
> on this file, I get "MemoryError".  Trying the I/O from the os module 
> produces the same result.  In fact I get the MemoryError even when 
> trying to read 2GB files.  When I run configure before building python, it
> indicates that large file support *is* enabled.

The reason is probably that Python strings store their length as an int,
and that's probably a signed 32-bit value on your SGI as well. The
implementation of read() in 1.5.2 doesn't properly check the size of the
file against INT_MAX, so you likely end up with a negative size that is
passed to PyString_FromStringAndSize and then through to malloc. In 2.0,
you should get an OverflowError when you try to read a file larger than
INT_MAX, if I read the sources correctly.
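
To make the "negative size" concrete, here is a rough sketch of what happens when a byte count above INT_MAX is squeezed into a signed 32-bit int. The helper name to_int32 is made up for illustration, and the sketch assumes INT_MAX is 2**31 - 1 and the usual two's-complement truncation. It only models the arithmetic, not the actual code path in 1.5.2.

```python
INT_MAX = 2**31 - 1  # assumption: int is a signed 32-bit type


def to_int32(n):
    """Wrap n the way two's-complement truncation to 32 bits would (hypothetical helper)."""
    return (n + 2**31) % 2**32 - 2**31


two_gb = 2 * 1024**3
three_gb = 3 * 1024**3

# Sizes up to INT_MAX survive unchanged; anything larger wraps around.
wrapped_2gb = to_int32(two_gb)      # negative
wrapped_3gb = to_int32(three_gb)    # negative
```

A negative length like these is the kind of value that would then make the allocation fail.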

Of course, this only applies to reading the file in one go. If you read
it in chunks < 2GB you should be fine, as long as large file support
works (which I simply don't know).
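
Reading in chunks could look something like the sketch below. It is written for a current Python rather than 1.5.2 (it uses a generator), and the names iter_chunks, count_bytes and CHUNK_SIZE are only illustrative. The 64 MiB chunk size is an arbitrary choice well below 2GB. An in-memory buffer stands in for the real file so the example runs anywhere.

```python
import io

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MiB; arbitrary, just keep it well under 2GB


def iter_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive blocks of at most chunk_size bytes from a binary file object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # empty bytes object means end of file
            break
        yield chunk


def count_bytes(fileobj, chunk_size=CHUNK_SIZE):
    """Walk the whole file chunk by chunk and return the total number of bytes seen."""
    total = 0
    for chunk in iter_chunks(fileobj, chunk_size):
        total += len(chunk)  # replace with whatever per-chunk processing you need
    return total


# Small in-memory stand-in for a file opened with open("file", "rb")
demo = io.BytesIO(b"x" * 1000)
total = count_bytes(demo, chunk_size=256)
```

With a real file you would pass the object returned by open("file", "rb") instead of the BytesIO buffer. No single read() then asks for more than chunk_size bytes.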

-- 
Bernhard Herzog   | Sketch, a drawing program for Unix
herzog at online.de  | http://sketch.sourceforge.net/


