[Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

Victor Stinner victor.stinner at gmail.com
Fri Apr 22 10:46:12 EDT 2016


Hi,

My pull request has been merged into numpy. numpy now uses
PyMem_RawMalloc() rather than PyMem_Malloc() since it uses the memory
allocator without holding the GIL:
https://github.com/numpy/numpy/pull/7404

It was proposed to modify numpy to hold the GIL. Maybe it will be done later.

This means that no C extension is known to misuse the Python memory
allocators anymore. So I pushed my change in CPython to make
PyMem_Malloc() use the pymalloc memory allocator:
https://hg.python.org/cpython/rev/68b2a43d8653

I documented that porting C extensions to Python 3.6 requires running
their tests with PYTHONMALLOC=debug. This environment variable enables
runtime checks that validate the usage of Python memory allocators,
including checks on the GIL. PYTHONMALLOC=debug and the check on the
GIL are new in Python 3.6.
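
As a rough illustration of running tests under the debug hooks, here is a
minimal sketch that launches a command with PYTHONMALLOC=debug set in its
environment. The helper name run_with_debug_malloc() is hypothetical (it
is not part of CPython), and the child command is only a stand-in for a
real test suite.

```python
import os
import subprocess
import sys


def run_with_debug_malloc(args):
    """Run a command with PYTHONMALLOC=debug and return the completed process.

    Hypothetical helper for illustration; not part of CPython.
    """
    env = dict(os.environ)
    env["PYTHONMALLOC"] = "debug"  # enable the debug hooks (Python 3.6+)
    return subprocess.run(
        args,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# Stand-in for a real test run, e.g. [sys.executable, "-m", "unittest"].
child_code = "import os; print(os.environ['PYTHONMALLOC'])"
result = run_with_debug_malloc([sys.executable, "-c", child_code])
print(result.returncode, result.stdout.strip())
```

In practice the same effect is obtained by setting the variable in the
shell before starting the test runner.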

By the way, I also modified the code that logs the fatal error: if a
buffer overflow/underflow is detected in a free function like
PyObject_Free() and tracemalloc is enabled, the traceback where the
memory block was allocated is now displayed:
https://docs.python.org/dev/whatsnew/3.6.html#pythonmalloc-environment-variable
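
The behaviour can be tried out with a small sketch like the one below. It
assumes ctypes is available and uses it only to simulate a buggy C
extension: a child interpreter started with PYTHONMALLOC=debug and
-X tracemalloc=5 allocates 16 bytes with PyMem_Malloc(), deliberately
writes one byte past the end, and then frees the block. The child is
expected to abort with a fatal error, so it runs in a subprocess.

```python
import os
import subprocess
import sys

# Child script: simulate a buffer overflow in a C extension via ctypes.
# ctypes.pythonapi calls are made with the GIL held.
CHILD = """
import ctypes
malloc = ctypes.pythonapi.PyMem_Malloc
malloc.restype = ctypes.c_void_p
malloc.argtypes = [ctypes.c_size_t]
free = ctypes.pythonapi.PyMem_Free
free.restype = None
free.argtypes = [ctypes.c_void_p]
ptr = malloc(16)
ctypes.memset(ptr + 16, 0x41, 1)  # deliberate one-byte overflow
free(ptr)                         # the debug hook checks the padding here
"""

env = dict(os.environ, PYTHONMALLOC="debug")
proc = subprocess.run(
    [sys.executable, "-X", "tracemalloc=5", "-c", CHILD],
    env=env,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
print("exit status:", proc.returncode)
print(proc.stderr)
```

The stderr output should contain the dump of the corrupted memory block
followed by the fatal error message.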

Moreover, when a ResourceWarning is emitted, the warnings logger now
also logs where the file, socket, etc. was allocated:
https://docs.python.org/dev/whatsnew/3.6.html#warnings
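
A minimal sketch of this, assuming Python 3.6 or newer: the child
interpreter below is started with tracemalloc enabled (the allocation
traceback is only available when tracemalloc is tracing) and with
ResourceWarning made visible, then it drops a file object without
closing it.

```python
import subprocess
import sys

# Child script: open a file and drop the last reference without close().
CHILD = """
import os
f = open(os.devnull)
f = None  # the file is finalized here and a ResourceWarning is emitted
"""

proc = subprocess.run(
    [
        sys.executable,
        "-X", "tracemalloc=5",            # keep up to 5 frames per allocation
        "-W", "always::ResourceWarning",  # ResourceWarning is ignored by default
        "-c", CHILD,
    ],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
print(proc.stderr)
```

The warning printed on stderr should be followed by the traceback of the
place where the file object was allocated.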

It looks like Python 3.6 will help developers ;-)

Victor

2016-04-20 1:33 GMT+02:00 Victor Stinner <victor.stinner at gmail.com>:
> Ping? Is anyone still opposed to my change #26249 "Change
> PyMem_Malloc to use pymalloc allocator"? If not, I think that I will
> push my change.
>
> My change only changes two lines, so it can be easily reverted before
> CPython 3.6 if we detect major issues in third-party extensions. And
> maybe it's better to push such change today to get more time to play
> with it, than pushing it late in the development of CPython 3.6.
>
> The new PYTHONMALLOC=debug feature makes it quick and easy to check
> the usage of the PyMem_Malloc() API, even if Python is compiled in
> release mode.
>
> I checked multiple Python extensions written in C. I only found one
> bug in numpy and I sent a patch (not merged yet).
>
> victor
>
> 2016-03-15 0:19 GMT+01:00 Victor Stinner <victor.stinner at gmail.com>:
>> 2016-02-12 14:31 GMT+01:00 M.-A. Lemburg <mal at egenix.com>:
>>>>> If your program has bugs, you can use a debug build of Python 3.5 to
>>>>> detect misuse of the API.
>>>
>>> Yes, but people don't necessarily do this, e.g. I have
>>> for a very long time ignored debug builds completely
>>> and when I started to try them, I found that some of the
>>> things I had been doing with e.g. free list implementations
>>> did not work in debug builds.
>>
>> I just added support for debug hooks on Python memory allocators on
>> Python compiled in *release* mode. Set the environment variable
>> PYTHONMALLOC to debug to try with Python 3.6.
>>
>> I added a check on PyObject_Malloc() debug hook to ensure that the
>> function is called with the GIL held. I opened an issue to add a
>> similar check on PyMem_Malloc():
>> https://bugs.python.org/issue26563
>>
>>
>>> Yes, but those are part of the stdlib. You'd need to check
>>> a few C extensions which are not tested as part of the stdlib,
>>> e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
>>> types in C since these will often need the memory management
>>> APIs).
>>>
>>> It may also be a good idea to check wrapper generators such
>>> as cython, swig, cffi, etc.
>>
>> I ran the test suites of numpy, lxml, Pillow and cryptography (which uses cffi).
>>
>> I found a bug in numpy. numpy calls PyMem_Malloc() without holding the GIL:
>> https://github.com/numpy/numpy/pull/7404
>>
>> Except for this bug, all other tests pass with PyMem_Malloc() using
>> pymalloc and all debug checks.
>>
>> Victor