using mmap on large (> 2 Gig) files

Paul Rubin http
Thu Oct 26 03:35:38 EDT 2006


Chetan <pandyacus.xspam at xspam.sbcglobal.net> writes:
> > Why on earth would you think that?!  It is counterintuitive.  fseek beyond
> > whatever is buffered in stdio (usually no more than 1kbyte or so)
> > requires a system call, while mmap is just a memory access.
> And the buffer copy required with every I/O from/to the application. 

Even that can probably be avoided since the mmap region has to start
on a page boundary, but anyway regular I/O definitely has to copy the
data.  For mmap, I'm thinking mostly of the case where the entire file
is paged in through most of the program's execution though.  That
obviously wouldn't apply to every application.
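
As a rough sketch of the page-boundary point: later versions of Python's
mmap module accept an offset argument, and that offset has to be a multiple
of mmap.ALLOCATIONGRANULARITY. The helper name map_window and the demo file
below are made up for illustration; the idea is just to round the requested
offset down to a legal boundary and remember how far into the mapping the
wanted bytes start.

```python
import mmap
import tempfile


def map_window(fileno, offset, length):
    """Map `length` bytes of a file starting at `offset` (hypothetical helper).

    The mapping itself must start on an allocation-granularity boundary,
    so the start is rounded down and the mapping is made slightly longer.
    Returns (mapping, delta), where `delta` is the position of `offset`
    inside the mapping.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (offset // gran) * gran   # round down to a legal start
    delta = offset - aligned            # bytes to skip inside the mapping
    m = mmap.mmap(fileno, length + delta,
                  access=mmap.ACCESS_READ, offset=aligned)
    return m, delta


# Demo on a small temporary file; a real use would be a multi-gigabyte file.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * 100000 + b"needle" + b"\0" * 1000)
    f.flush()
    m, delta = map_window(f.fileno(), 100000, 6)
    try:
        data = m[delta:delta + 6]       # plain memory access, no seek/read
    finally:
        m.close()
```

Only the requested window is mapped, so this kind of approach should also
work for files larger than the address space, at the cost of remapping when
the window moves.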

> > IMO it should have some kind of IPC locking mechanism added, in
> > addition to the offset stuff suggested.
> The type of IPC required differs depending on who is using the
> shared region - either another python process or another external
> program. Apart from the spinlock primitives, other types of
> synchronization mechanisms are provided by the OS. However, I do see
> value in providing a shared memory based spinlock mechanism. 

I mean just have an interface to OS locks (Linux futex and whatever
the Windows counterpart is) and maybe also a utility function to do a
compare-and-swap in user space.
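
To make the locking idea a bit more concrete, here is a minimal sketch that
guards a counter in a shared mapping with an OS-provided lock. Note the
substitution: as far as I know the standard library exposes neither futexes
nor a user-space compare-and-swap, so this uses POSIX record locks through
fcntl.lockf instead, which means a system call per lock operation and
Unix only. The function name locked_increment and the counter layout are
assumptions made up for the example.

```python
import fcntl
import mmap
import struct
import tempfile


def locked_increment(f, m, pos=0):
    """Increment a 64-bit counter at `pos` in mapping `m` (hypothetical helper).

    The 8 bytes of the counter are protected by an exclusive record lock
    on the same byte range of the underlying file `f`.
    """
    fcntl.lockf(f, fcntl.LOCK_EX, 8, pos)   # blocks until the range is free
    try:
        (value,) = struct.unpack_from("<Q", m, pos)
        struct.pack_into("<Q", m, pos, value + 1)
        return value + 1
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN, 8, pos)


# Demo with a single process; other processes mapping the same file
# would call locked_increment() in the same way.
with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)
    f.flush()
    m = mmap.mmap(f.fileno(), mmap.PAGESIZE)
    try:
        first = locked_increment(f, m)
        second = locked_increment(f, m)
    finally:
        m.close()
```

Record locks are held per process, so this serializes separate processes
but not threads inside one process; a futex or compare-and-swap based lock
would avoid the system call in the uncontended case.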


