[issue2643] mmap_object_dealloc does not call FlushViewOfFile on windows

Charles-Francois Natali report at bugs.python.org
Tue Apr 6 23:16:50 CEST 2010


Charles-Francois Natali <neologix at free.fr> added the comment:

I don't think that calling msync() or FlushViewOfFile() when closing the mmap object or deallocating it is a good idea.
Syncing dirty pages to disk is very expensive and can block the process for a long time, and the OS does a much better job of it (it can write back asynchronously, and it can group and reorder the writes, etc).
For example, it took around 7 seconds to msync() a 300 MB mmap'ed file in a quick test I've done.
Furthermore, we don't do this for regular files: when a file object is closed or deallocated, we don't call fsync(), and the documentation makes it clear:

os.fsync(fd)
Force write of file with filedescriptor fd to disk. On Unix, this calls the native fsync() function; on Windows, the MS _commit() function.

If you’re starting with a Python file object f, first do f.flush(), and then do os.fsync(f.fileno()), to ensure that all internal buffers associated with f are written to disk. Availability: Unix, and Windows starting in 2.2.3.
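
As an illustration of the sequence the documentation describes, here is a minimal sketch. The helper name write_durably and the temporary file are hypothetical, used only for the example; the calls themselves (flush(), os.fsync(), fileno()) are the ones named above.

```python
import os
import tempfile

def write_durably(path, data):
    """Write data to path, then explicitly ask the OS to commit it to disk.

    Hypothetical helper, for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's internal buffer down to the OS
        os.fsync(f.fileno())   # ask the OS to write its cached data to disk
    return len(data)

# Usage: write a few bytes to a temporary file and read them back.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
written = write_durably(demo_path, b"hello")
with open(demo_path, "rb") as f:
    content = f.read()
os.remove(demo_path)
```

Closing the file without the fsync() call would still leave the data readable afterwards; the explicit call only matters to applications that need the data on disk at that point.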

The reason is the same: fsync(), like msync(), is not usually what you want, because of the latency and performance penalties it incurs.
Any application requiring the data to be actually written to disk _must_ call fsync() for file objects, and call the flush() method of mmap objects (which exists for exactly that purpose).
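
For mmap objects, the explicit approach could look like the following sketch. The helper name update_mapped_file and the 4096-byte temporary file are assumptions made up for the example; mmap.mmap() and its flush() and close() methods are the standard mmap module API.

```python
import mmap
import os
import tempfile

def update_mapped_file(path, offset, payload):
    """Modify a file through a mapping and flush the change explicitly.

    Hypothetical helper, for illustration only.
    """
    with open(path, "r+b") as f:
        m = mmap.mmap(f.fileno(), 0)   # length 0 maps the whole file
        try:
            m[offset:offset + len(payload)] = payload   # dirties the mapped pages
            m.flush()   # explicit request to write the dirty pages back to the file
        finally:
            m.close()

# Usage: create a small file, modify it through the mapping, read it back.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"\0" * 4096)
os.close(fd)
update_mapped_file(demo_path, 0, b"dirty")
with open(demo_path, "rb") as f:
    head = f.read(5)
os.remove(demo_path)
```

Note that reading the file back only shows that the change is visible through the file; it says nothing about whether the data has reached the disk, since the read is most likely served from cache.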

So, for performance and consistency with files, I'd suggest removing the calls to msync() and FlushViewOfFile() from close() and dealloc().
If agreed, I can submit the patch.

> A test could explicitly close a dirtied mmaped file and then open()
> it to check that everything was written, no?

The problem is that when you open() your file, you'll read the data from cache. You have no way to read the data directly from disk (well, there may be ways, but they are highly non-portable, like O_DIRECT files or raw I/O).
The only check that can be done is tracing the process and checking that msync() is indeed called.

----------
nosy: +neologix

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2643>
_______________________________________

