[Tutor] If you don't close file when writing, do bytes stay in memory?

Kent Johnson kent3737 at gmail.com
Sat Oct 10 17:26:48 CEST 2009


2009/10/10 Xbox Muncher <xboxmuncher at gmail.com>:
> What does flush do technically?
> "Flush the internal buffer, like stdio's fflush(). This may be a no-op on some file-like objects."
>
> The reason I thought that closing the file after I've written about 500MB file data to it, was smart -> was because I thought that python stores that data in memory or keeps info about it somehow and only deletes this memory of it when I close the file.
> When I write to a file in 'wb' mode at 500 bytes at a time.. I see that the file size changes as I continue to add more data, maybe not in exact 500 byte sequences as my code logic but it becomes bigger as I make more iterations still.
>
> Seeing this, I know that the data is definitely being written pretty immediately to the file and not being held in memory for very long. Or is it...? Does it still keep it in this "internal buffer" if I don't close the file. If it does, then flush() is exactly what I need to free the internal buffer, which is what I was trying to do when I closed the file anyways...
>
> However, from your replies I take it that python doesn't store this data in an internal buffer and DOES immediately dispose of the data into the file itself (of course it still exists in variables I put it in). So, closing the file doesn't free up any more memory.

Python file I/O is buffered. That means that there is a memory buffer
that is used to hold a small amount of the file as it is read or
written.

Your original example writes 5 bytes at a time. With unbuffered I/O,
this would write to the disk on every call to write(). (The OS also
has some buffering, I'm ignoring that.)

With buffered writes, there is a memory buffer allocated to hold the
data. The write() call just puts data into the buffer; when it is
full, the buffer is written to the disk. This is a flush. Calling
flush() forces the buffer to be written.
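
A rough way to see this for yourself is to check the file size on disk
before and after flush(). This is only a sketch, assuming Python 3; the
function name, the temporary file and the 4096-byte buffer size are made
up for the illustration and are not from your code.

```python
import os
import tempfile

def sizes_around_flush(payload=b"12345", buffer_size=4096):
    """Return the on-disk file size before and after flush()."""
    fd, path = tempfile.mkstemp()   # throwaway file, illustration only
    os.close(fd)
    try:
        with open(path, "wb", buffering=buffer_size) as f:
            f.write(payload)                # goes into the memory buffer
            before = os.path.getsize(path)  # payload is smaller than the buffer
            f.flush()                       # force the buffer out to the file
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)

before, after = sizes_around_flush()
print(before, after)
```

The first number should be 0 and the second should be the length of the
payload, because the 5 bytes sit in the buffer until flush() is called.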

So, a few points about your questions:
- calling flush() after each write() will cause a disk write. This is
probably not what you want; it will slow down the output considerably.
- calling flush() does not de-allocate the buffer, it just writes its
contents. So calling flush() should not change the amount of memory
used.
- the buffer is pretty small, maybe 8K or 32K. You can specify the
buffer size as an argument to open() but really you probably want the
system default.
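
To put a number on the first and last points, here is a small sketch
(assuming Python 3) that counts how many low-level writes actually happen
with and without a flush() after every write(). CountingRaw and
count_raw_writes are hypothetical helpers standing in for a real disk
file, and 8192 is just an example buffer size.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for a disk file that counts low-level writes."""

    def __init__(self):
        super().__init__()
        self.calls = 0   # number of writes that reached the "disk"
        self.total = 0   # total bytes that reached the "disk"

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.total += len(b)
        return len(b)

def count_raw_writes(n_writes, chunk, buffer_size, flush_each=False):
    """Write chunk n_writes times through a buffer; return (calls, bytes)."""
    raw = CountingRaw()
    buffered = io.BufferedWriter(raw, buffer_size=buffer_size)
    for _ in range(n_writes):
        buffered.write(chunk)
        if flush_each:
            buffered.flush()   # forces a low-level write every time
    buffered.flush()           # push out whatever is still buffered
    return raw.calls, raw.total

print(count_raw_writes(100, b"12345", 8192))                   # buffered
print(count_raw_writes(100, b"12345", 8192, flush_each=True))  # flush each time
```

With 100 writes of 5 bytes, the buffered run needs a single low-level
write, while the flush-every-time run needs 100 of them for the same 500
bytes.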

Kent

