ctypes, memory mapped files and context manager

Hans-Peter Jansen hpj at urpla.net
Tue Dec 27 08:37:45 EST 2016


Hi,

I'm using $subject's combination successfully in a project for 
creating/iterating over huge binary files (> 5GB) with impressive performance, 
while resource usage stays pretty low, all with plain Python3 code. Nice!

Environment: Python 3.4.5, Linux 4.8.14, openSUSE/x86_64, NFS4 and XFS 
filesystems.

The idea is: map a ctypes structure onto the file at a certain offset, act on 
the structure, and release the mapping. The latter is necessary for keeping 
the mmap file properly resizable and closable: Python's mmap refuses to 
resize or close while any buffer exported from it (such as a ctypes instance 
created via from_buffer) is still alive. Hence, a context manager serves us 
well (in theory). 
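
The bare mechanics, without any context manager, can be sketched roughly as
follows. This is only an illustration, not code from the project: DemoHeader,
map_act_release and the offset of 16 are made up for the example, and it
assumes Linux (where mmap.resize() is available) and CPython's reference
counting.

```python
import ctypes
import mmap
import tempfile


class DemoHeader(ctypes.Structure):
    # Hypothetical layout, for illustration only.
    _fields_ = [("identifier", ctypes.c_uint32),
                ("length", ctypes.c_uint32)]


def map_act_release(offset=16):
    with tempfile.TemporaryFile() as f:
        f.truncate(mmap.PAGESIZE)              # backing file of one page
        mm = mmap.mmap(f.fileno(), mmap.PAGESIZE)

        # Map: the structure shares memory with the mmap at the given offset.
        hdr = DemoHeader.from_buffer(mm, offset)

        # Act: writes go straight into the mapped file.
        hdr.identifier = 0xCAFE
        hdr.length = ctypes.sizeof(DemoHeader)

        # While hdr is alive, the mmap has an exported buffer and
        # refuses to resize.
        try:
            mm.resize(2 * mmap.PAGESIZE)
            blocked = False
        except BufferError:
            blocked = True

        # Release: dropping the last reference frees the exported buffer.
        del hdr
        mm.resize(2 * mmap.PAGESIZE)

        raw = mm[offset:offset + ctypes.sizeof(DemoHeader)]
        new_size = mm.size()
        mm.close()

    # Decode an independent copy of the bytes that were written.
    copy = DemoHeader.from_buffer_copy(raw)
    return blocked, new_size, copy.identifier, copy.length
```

Calling map_act_release() returns whether the first resize was blocked, the
size after the second resize, and the two field values read back from the
mapping.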

Here's some code excerpt:

import ctypes
import mmap

# align() is a small helper from the full script (linked below); it rounds
# its first argument up to a multiple of the second.

class cstructmap:
    def __init__(self, cstruct, mm, offset = 0):
        self._cstruct = cstruct
        self._mm = mm
        self._offset = offset
        self._csinst = None

    def __enter__(self):
        # resize the mmap (and backing file), if structure exceeds mmap size
        # mmap size must be aligned to mmap.PAGESIZE
        cssize = ctypes.sizeof(self._cstruct)
        if self._offset + cssize > self._mm.size():
            newsize = align(self._offset + cssize, mmap.PAGESIZE)
            self._mm.resize(newsize)
        # map the structure onto the mmap; this exports a buffer from it
        self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
        return self._csinst

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # drop our own reference into the mmap; the caller's "as" variable
        # is a separate reference that we cannot reach from here
        self._csinst = None
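
For readers without the full script at hand: the excerpt relies on an align()
helper that is not shown. A minimal sketch of such a helper, assuming it
simply rounds up to the next multiple of the boundary (the actual
implementation in the gist may differ), could be:

```python
def align(value, boundary):
    """Round value up to the next multiple of boundary (boundary > 0)."""
    # Integer arithmetic only, so this stays exact for multi-GB offsets.
    return (value + boundary - 1) // boundary * boundary
```

With boundary = mmap.PAGESIZE, a value that is already a multiple of the page
size is returned unchanged, and anything else is bumped to the next page
boundary.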


# excerpt: a method of the class wrapping the mapped file
def work(self, data, datasize):
    with cstructmap(ItemHeader, self._mm, self._offset) as ih:
        ih.identifier = ItemHeader.Identifier
        ih.length = ItemHeaderSize + datasize

    blktype = ctypes.c_char * datasize
    with cstructmap(blktype, self._mm, self._offset) as blk:
        blk.raw = data


In practice, this results in:

Traceback (most recent call last):
  File "ctypes_mmap_ctx.py", line 146, in <module>
    mf.add_data(data)
  File "ctypes_mmap_ctx.py", line 113, in add_data
    with cstructmap(blktype, self._mm, self._offset) as blk:
  File "ctypes_mmap_ctx.py", line 42, in __enter__
    self._mm.resize(newsize)
BufferError: mmap can't resize with extant buffers exported.

The issue: when creating a mapping via the context manager, we bind a local 
variable (with .. as ..) that keeps existing in the local scope, even after 
the with block was left. This keeps a reference to the ctypes mapped area 
alive, no matter what we try in __exit__ to destroy it. We have to del the 
with variable manually.
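
The effect is not specific to ctypes or mmap. A minimal sketch with a plain
stand-in object shows it (Resource, Manager and the function name are
invented for this illustration; the immediate cleanup after del relies on
CPython's reference counting):

```python
import weakref


class Resource:
    """Stand-in for the ctypes instance mapped onto the mmap."""


class Manager:
    def __init__(self):
        self._res = None

    def __enter__(self):
        self._res = Resource()
        return self._res

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Drops only the manager's own reference.
        self._res = None
        return False


def with_target_outlives_block():
    with Manager() as res:
        probe = weakref.ref(res)     # observe the object without owning it
    # The with block is over, but the name "res" is still bound.
    alive_after_with = probe() is not None
    del res                          # the manual step complained about here
    alive_after_del = probe() is not None
    return alive_after_with, alive_after_del
```

The first flag tells whether the object survived the with block, the second
whether it survived the explicit del.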

Now, I want to get rid of the ugly and error-prone del statements.

What is needed is a ctypes operation that actively removes the mapping and 
that could be added to the __exit__ part of the context manager.

Full working code example: 
https://gist.github.com/frispete/97c27e24a0aae1bcaf1375e2e463d239

The script creates a memory mapped file in the current directory named 
"mapfile". When started without arguments, it copies itself into this file 
until 10 * mmap.PAGESIZE growth is reached (or it errors out before that).

If you change NOPROB to True, it will actively delete the context manager 
variables, and should work as advertised.

Any ideas are much appreciated.

Thanks in advance,
Pete



