mmap caching

Nick Craig-Wood nick at craig-wood.com
Sun Jan 21 15:30:07 EST 2007


George Sakkis <george.sakkis at gmail.com> wrote:
>  I've been trying to track down a memory leak (which I initially
>  attributed erroneously to numpy) and it turns out to be caused by a
>  memory mapped file. It seems that mmap caches without limit the chunks
>  it reads, as the memory usage grows to several hundred MB according
>  to the Windows task manager before it dies with a MemoryError. I'm
>  positive that these chunks are not referenced anywhere else; in fact if
>  I change the mmap object to a normal file, memory usage remains
>  constant. The documentation of mmap doesn't mention anything about
>  this. Can the caching strategy be modified at the user level ?

I'm not familiar with mmap() on Windows, but assuming it works the
same way as on Unix...

The point of mmap() is to map files into memory.  It is completely up
to the OS to bring pages into memory for you to read / write to, and
completely up to the OS to get rid of them again.

What you would expect is that the file is demand paged into memory as
you access bits of it.  These pages will remain in memory until the OS
comes under memory pressure, at which point any dirty pages will be
written back to the file and then dropped.

The OS will try to keep hold of pages as long as possible just in case
you need them again.  The pages dropped should be the least recently
used pages.
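
To make the demand paging concrete, here is a minimal sketch of walking
a file through a read-only mapping.  The helper names (read_chunks,
make_demo_file) and the 4096-byte chunk size are my own assumptions for
illustration, not anything from the original program.  Note that
slicing an mmap object copies the bytes into an ordinary Python bytes
object, so any chunks you keep references to count against your own
process rather than the OS page cache.

```python
import mmap
import os
import tempfile


def make_demo_file(size=64 * 1024):
    """Create a throwaway file of `size` bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"x" * size)
    return path


def read_chunks(path, chunk_size=4096):
    """Yield successive chunks of a file through a read-only mapping.

    Touching a slice is what makes the OS page that part of the file in;
    nothing here controls when those pages are dropped again.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file (the file must not be empty).
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, len(m), chunk_size):
                # The slice is a copy held by Python, not a view.
                yield m[offset:offset + chunk_size]


if __name__ == "__main__":
    path = make_demo_file()
    try:
        total = sum(len(chunk) for chunk in read_chunks(path))
        print(total)
    finally:
        os.remove(path)
```

If memory still grows with a loop like this, where each chunk is
discarded straight away, then the growth is most likely mapped pages
that the OS is counting against the process, not Python objects.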

I wouldn't have expected a MemoryError though...

Did you do mmap.flush() after writing?
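
In case it helps, this is roughly what I mean: a small sketch that
writes through a mapping and then calls flush() on the mmap object.
The patch_file and demo names, the offsets and the file size are made
up for the example, and I am assuming the mapping is opened with the
default read/write access.

```python
import mmap
import os
import tempfile


def patch_file(path, offset, data):
    """Overwrite len(data) bytes at `offset` through a writable mapping."""
    with open(path, "r+b") as f:
        # Default access: changes are written through to the file.
        with mmap.mmap(f.fileno(), 0) as m:
            # Slice assignment must not change the size of the mapping.
            m[offset:offset + len(data)] = data
            # Ask the OS to write the dirty pages back to the file now.
            m.flush()


def demo():
    """Patch a scratch file and read the bytes back with normal file I/O."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * 8192)
    try:
        patch_file(path, 4096, b"hello")
        with open(path, "rb") as f:
            f.seek(4096)
            return f.read(5)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Flushing does not by itself force the OS to drop the pages, but it does
mean they are clean, so they can be discarded without further I/O.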

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


