[Python-Dev] open('/dev/null').read() -> MemoryError

Bob Ippolito bob at redivi.com
Mon Sep 27 22:21:03 CEST 2004


On Sep 27, 2004, at 4:05 PM, Armin Rigo wrote:

> On my system, which is admittedly an old Linux box (2.2 kernel), one test
> fails:
>
> >>> file('/dev/null').read()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> MemoryError
>
> This is because:
>
> >>> os.stat('/dev/null').st_size
> 4540321280L
>
> This looks very broken indeed.  I have no idea where this number comes from.
> I'd also complain if I was asked to allocate a buffer large enough to hold
> that many bytes.  If we cared, we could "enhance" the file.read() method to
> account for the possibility that maybe stat() lied; maybe it is desirable,
> instead of allocating huge amounts of memory, to revert to something like
> the following above some large threshold:
>
> result = []
> while 1:
>   buf = f.read(16384)
>   if not buf:
>     return ''.join(result)
>   result.append(buf)
>
> Of course for genuinely large reads it's a disaster to have to allocate
> twice as much memory.  Anyway I'm not sure we care about working around
> broken behaviour.  I'm just wondering if os.stat() could lie in other
> situations too.
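
Concretely, the fallback Armin describes might look something like the
standalone helper below (the name, the 1 MB threshold and the chunk size are
only illustrative, not a proposal for the file object API):

import os

def cautious_read(path, threshold=1 << 20, chunksize=16384):
    # Trust st_size for small files, but above the threshold fall back to
    # incremental reads so a bogus size can't force one huge allocation.
    f = open(path, 'rb')
    try:
        if os.fstat(f.fileno()).st_size <= threshold:
            return f.read()
        result = []
        while 1:
            buf = f.read(chunksize)
            if not buf:
                return ''.join(result)
            result.append(buf)
    finally:
        f.close()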

file(path).read() is never really a good idea in the general case,
especially for a device node.  It might never terminate, and it will get a
MemoryError for genuinely large files anyway, especially on 32-bit
architectures.  People should be reading files in chunks or using mmap.
Is there really anything the runtime can or should do about this?
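
For example, processing a file in fixed-size chunks keeps memory use bounded
regardless of what stat() claims (the names and the 16 KB chunk size here are
just placeholders):

import mmap

def copy_in_chunks(src, dst, chunksize=16384):
    # Never hold more than one chunk in memory at a time.
    while 1:
        buf = src.read(chunksize)
        if not buf:
            break
        dst.write(buf)

# For regular files, mmap lets the OS page data in lazily instead of
# copying the whole file into one string:
f = open('large.dat', 'rb')
data = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
header = data[:16]    # only the pages actually touched get read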

In other words, it sounds like the test should be fixed, not the 
implementation.

-bob

