Loading contents behind the scenes

Thu May 22 09:51:35 EDT 2008

On 2008-05-22, s0suk3 at gmail.com <s0suk3 at gmail.com> wrote:
> Hi, I wanted to know how cautious it is to do something like:
>
> f = file("filename", "rb")
> f.read()
>
> for a possibly huge file. When calling f.read(), and not doing
> anything with the return value, what is Python doing internally? Is it
> loading the content of the file into memory (regardless of whether it
> is discarding it immediately)?

I am not a Python interpreter developer, but as a user, yes, I'd expect that to
happen. The method doesn't know that you are not doing anything with its return
value.
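
If you want to see this for yourself, here is a rough sketch using the standard
'tracemalloc' module. It is written for Python 3 (where 'open()' replaces
'file()'), and the 5 MB size and the temporary file are assumptions made up for
the demonstration, not anything from your setup.

```python
import os
import tempfile
import tracemalloc

SIZE = 5 * 1024 * 1024  # hypothetical 5 MB test file

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * SIZE)
    path = tmp.name

try:
    with open(path, "rb") as f:
        tracemalloc.start()
        f.read()  # the return value is discarded immediately
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
finally:
    os.remove(path)

# Compare the peak traced allocation with the file size.
print("peak bytes:", peak, "file bytes:", SIZE)
```

On the interpreters I know of, the reported peak should be at least the file
size, even though nothing keeps the result.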

> In my case, what I'm doing is sending the return value through a
> socket:
>
> sock.send(f.read())
>
> Is that gonna make a difference (memory-wise)? I guess I'm just
> concerned with whether I can do a file.read() for any file in the
> system in an efficient and memory-kind way, and with low overhead in
> general. (For one thing, I'm not loading the contents into a
> variable.)

Doesn't matter. You allocate a string into which the contents are loaded (the
return value of 'f.read()'), and you hand over (a reference to) that string to
the 'send()' method.

Note that in Python, memory is allocated for data *values*, not for *variables*
(variables are merely references to values).
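
A small illustration of that point, as a sketch only (Python 3 syntax; the
names 'data', 'alias' and 'consume' are made up for the example):

```python
import sys

data = b"x" * 1_000_000   # one bytes object of roughly 1 MB is allocated here
alias = data              # a second name for the same object, no copy is made


def consume(payload):
    # The function receives a reference to the caller's object, not a copy.
    return payload is data


same_object = alias is data
passed_by_reference = consume(data)
size_of_value = sys.getsizeof(data)  # size of the value itself, in bytes

print(same_object, passed_by_reference, size_of_value)
```

Binding the result to a variable or not makes no difference to how much memory
the value itself needs.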

> Not that I'm saying that loading a huge file into memory will horribly
> crash the system, but it's good to try to program in the safest way
> possibly. For example, if you try something like this in the

Depends on your system, and your biggest file.

On a 32-bit platform, anything bigger than about 4GB (usually already at around
3GB) will crash the program (in Python, typically with a MemoryError) for the
simple reason that you are running out of address space to store bytes in.

To fix this, read and write the file in blocks by passing a block size to the
'read()' call.
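
A minimal sketch of such a loop, assuming Python 3 and an already connected
socket. The function name 'send_file_in_blocks' and the 64 KB block size are
my own choices for illustration. I also use 'sendall()' instead of 'send()',
since 'send()' may transmit only part of the data it is given.

```python
CHUNK_SIZE = 64 * 1024  # hypothetical block size; tune for your workload


def send_file_in_blocks(f, sock, chunk_size=CHUNK_SIZE):
    """Read f block by block and send each block through sock.

    f must be a file object opened in binary mode; sock must provide a
    sendall() method. Returns the total number of bytes sent.
    """
    total = 0
    while True:
        block = f.read(chunk_size)  # reads at most chunk_size bytes
        if not block:               # an empty bytes object means end of file
            break
        sock.sendall(block)         # keeps sending until the whole block is out
        total += len(block)
    return total


# Usage, with a connected socket 'sock' (not created here):
#
#     with open("filename", "rb") as f:
#         send_file_in_blocks(f, sock)
```

This way only one block at a time is held in memory, no matter how big the file
is.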

Sincerely,
Albert