ctypes, memory mapped files and context manager

Peter Otten __peter__ at web.de
Tue Dec 27 15:39:51 EST 2016


Hans-Peter Jansen wrote:

> Hi,
> 
> I'm using $subjects combination successfully in a project for
> creating/iterating over huge binary files (> 5GB) with impressive
> performance, while resource usage stays pretty low, all with plain Python3
> code. Nice!
> 
> Environment: (Python 3.4.5, Linux 4.8.14, openSUSE/x86_64, NFS4 and XFS
> filesystems)
> 
> The idea is: map a ctypes structure onto the file at a certain offset, act
> on the structure, and release the mapping. The latter is necessary for
> keeping the mmap file properly resizable and closable (due to the nature
> of mmaps and Python's posix implementation thereof). Hence, a context
> manager serves us well (in theory).
> 
> Here's some code excerpt:
> 
> class cstructmap:
>     def __init__(self, cstruct, mm, offset = 0):
>         self._cstruct = cstruct
>         self._mm = mm
>         self._offset = offset
>         self._csinst = None
> 
>     def __enter__(self):
>         # resize the mmap (and backing file) if the structure exceeds the
>         # mmap size; the mmap size must be aligned to mmap.PAGESIZE
>         cssize = ctypes.sizeof(self._cstruct)
>         if self._offset + cssize > self._mm.size():
>             newsize = align(self._offset + cssize, mmap.PAGESIZE)
>             self._mm.resize(newsize)
>         self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
>         return self._csinst

Here you give away a reference to the ctypes.BigEndianStructure. That means
you no longer control the lifetime of self._csinst, which in turn holds a
reference to the buffer exported by the underlying mmap.

There might be a way to release the mmap reference while the wrapper 
structure is still alive, but the cleaner way is probably to not give it 
away in the first place, and create a proxy instead with

          return weakref.proxy(self._csinst)
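
A minimal sketch of that idea, assuming CPython (where dropping the last
strong reference frees the object immediately) and a platform where
mmap.resize() is available, such as Linux. Header is a hypothetical stand-in
for ItemHeader, and the resize/alignment logic of the original __enter__ is
left out for brevity:

```python
import ctypes
import mmap
import tempfile
import weakref


class Header(ctypes.Structure):  # hypothetical stand-in for ItemHeader
    _fields_ = [("identifier", ctypes.c_uint32), ("length", ctypes.c_uint32)]


class cstructmap:
    def __init__(self, cstruct, mm, offset=0):
        self._cstruct = cstruct
        self._mm = mm
        self._offset = offset
        self._csinst = None

    def __enter__(self):
        self._csinst = self._cstruct.from_buffer(self._mm, self._offset)
        # Hand out a proxy; the only strong reference stays in self._csinst.
        return weakref.proxy(self._csinst)

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Dropping the last strong reference releases the buffer export
        # (immediately on CPython, which uses reference counting).
        self._csinst = None


with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)
    f.flush()
    mm = mmap.mmap(f.fileno(), mmap.PAGESIZE)

    with cstructmap(Header, mm) as ih:
        ih.identifier = 0x1234
        ih.length = 8

    # "ih" is still bound here, but it is only a dead proxy now.
    try:
        ih.length
        proxy_dead = False
    except ReferenceError:
        proxy_dead = True

    # No exports are left, so resizing is allowed again.
    mm.resize(2 * mmap.PAGESIZE)

    # Read the header back from the mapping to check what was written.
    stored = Header.from_buffer_copy(mm[:ctypes.sizeof(Header)])
    mm.close()
```

After the with block the name ih is still bound, but it refers to a dead
proxy, so it no longer pins the mapping.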


> 
>     def __exit__(self, exc_type, exc_value, exc_traceback):
>         # free all references into mmap
>         del self._csinst

The line above is redundant. It removes the attribute from the instance
__dict__ and thereby decrements the refcount of the object it referred to. It
does not physically delete that object. If you remove the del statement, the
assignment of None that follows still decrements the refcount.

Make sure you understand this to avoid littering your code with cargo cult 
del-s ;)
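
A small sketch of that point, with hypothetical Payload and Holder classes.
The immediate cleanup at the end assumes CPython's reference counting:

```python
import weakref


class Payload:
    pass


class Holder:
    def __init__(self):
        self.item = Payload()


h = Holder()
alive = weakref.ref(h.item)   # lets us observe whether the object still exists
outside = h.item              # a second reference, like the "with ... as" name

del h.item                    # removes the attribute only
survived_del = alive() is outside

h.item = None                 # rebinding works whether or not del came first
del outside                   # the last strong reference goes away here
gone = alive() is None
```

The object disappears only when the last reference to it is dropped, no
matter how many del statements ran before that.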

>         self._csinst = None
> 
> 
> def work():
>     with cstructmap(ItemHeader, self._mm, self._offset) as ih:
>         ih.identifier = ItemHeader.Identifier
>         ih.length = ItemHeaderSize + datasize
>         
>     blktype = ctypes.c_char * datasize
>     with cstructmap(blktype, self._mm, self._offset) as blk:
>         blk.raw = data
> 
> 
> In practice, this results in:
> 
> Traceback (most recent call last):
>   File "ctypes_mmap_ctx.py", line 146, in <module>
>     mf.add_data(data)
>   File "ctypes_mmap_ctx.py", line 113, in add_data
>     with cstructmap(blktype, self._mm, self._offset) as blk:
>   File "ctypes_mmap_ctx.py", line 42, in __enter__
>     self._mm.resize(newsize)
> BufferError: mmap can't resize with extant buffers exported.
> 
> The issue: when creating a mapping via context manager, we assign a local
> variable (with ..), that keeps existing in the local context, even when the
> manager context was left. This keeps a reference on the ctypes mapped area
> alive, even if we try everything to destroy it in __exit__. We have to del
> the with var manually.
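
A minimal sketch of that effect, using an anonymous map and mm.close() as
the probe, since close() refuses to run while buffers are exported just like
resize() does. The mapped() helper is hypothetical, and the immediate release
after del assumes CPython:

```python
import ctypes
import mmap
from contextlib import contextmanager


@contextmanager
def mapped(ctype, mm):
    # Hypothetical helper: yields a ctypes object sharing memory with mm.
    yield ctype.from_buffer(mm)


mm = mmap.mmap(-1, mmap.PAGESIZE)   # anonymous map, no file needed

with mapped(ctypes.c_uint32, mm) as value:
    value.value = 42

# The with block is over, but "value" is still an ordinary local name
# and still keeps the buffer export alive.
try:
    mm.close()
    blocked = False
except BufferError:
    blocked = True

del value        # drop the last reference to the ctypes object
mm.close()       # now succeeds
```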
> 
> Now, I want to get rid of the ugly and error-prone del statements.
> 
> What is needed is a ctypes operation that actively removes the mapping
> and that could be added to the __exit__ part of the context manager.
> 
> Full working code example:
> https://gist.github.com/frispete/97c27e24a0aae1bcaf1375e2e463d239
> 
> The script creates a memory mapped file in the current directory named
> "mapfile". When started without arguments, it copies itself into this
> file, until 10 * mmap.PAGESIZE growth is reached (or it errored out
> before..).
> 
> If you change NOPROB to True, it will actively destruct the context
> manager vars, and should work as advertised.
> 
> Any ideas are much appreciated.

You might put some more effort into composing example scripts. Something 
like the script below would have saved me some time...

import ctypes
import mmap

from contextlib import contextmanager

class T(ctypes.Structure):
    _fields_ = [("foo", ctypes.c_uint32)]  # ctypes only honors "_fields_"


@contextmanager
def map_struct(m, n):
    m.resize(n * mmap.PAGESIZE)
    yield T.from_buffer(m)

SIZE = mmap.PAGESIZE * 2
f = open("tmp.dat", "w+b")
f.write(b"\0" * SIZE)
f.seek(0)
m = mmap.mmap(f.fileno(), mmap.PAGESIZE)

with map_struct(m, 1) as a:
    a.foo = 1
# "a" is still bound here, so the resize in the next call raises BufferError
with map_struct(m, 2) as b:
    b.foo = 2


> 
> Thanks in advance,
> Pete




