[Python-Dev] Modify PyMem_Malloc to use pymalloc for performance

M.-A. Lemburg mal at egenix.com
Fri Feb 12 08:31:15 EST 2016


On 12.02.2016 12:18, Victor Stinner wrote:
> ping?

Sorry, your email must have gotten lost in my inbox.

> 2016-02-08 15:18 GMT+01:00 Victor Stinner <victor.stinner at gmail.com>:
>> 2016-02-04 15:05 GMT+01:00 M.-A. Lemburg <mal at egenix.com>:
>>> Sometimes, yes, but we also do allocations for e.g.
>>> parsing values in Python argument tuples (e.g. using
>>> "es" or "et"):
>>>
>>> https://docs.python.org/3.6/c-api/arg.html
>>>
>>> We do document that PyMem_Free() must be used on those; not sure
>>> whether everyone does this though.
>>
>> It's well documented. If programs start to crash, they must be fixed.
>>
>> I don't propose to "break the API" for free, but to get a speedup on
>> the overall Python.
>>
>> And I don't think that we can say that it's an API change, since we
>> already stated that PyMem_Free() must be used.
>>
>> If your program has bugs, you can use a debug build of Python 3.5 to
>> detect misuse of the API.

Yes, but people don't necessarily do this. For example, I ignored
debug builds completely for a very long time, and when I started
trying them, I found that some of the things I had been doing,
e.g. in free list implementations, did not work in debug builds.
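
To illustrate the kind of mismatch under discussion (memory obtained
from one allocator family but released through another), here is a toy
Python model. It is only a sketch: the class and method names are
hypothetical, the "mem"/"obj" labels are assumed stand-ins for the
PyMem_*() and PyObject_*() families, and it does not reproduce how
CPython's debug hooks are actually implemented.

```python
class DomainMismatch(Exception):
    """Raised when a block is released through the wrong allocator family."""


class ToyDebugHeap:
    """Toy stand-in for debug-build checks (hypothetical, not CPython code)."""

    def __init__(self):
        self._owner = {}   # block id -> family that allocated the block
        self._next_id = 0

    def alloc(self, domain):
        # Remember which family (e.g. "mem" or "obj") handed out the block.
        block = self._next_id
        self._next_id += 1
        self._owner[block] = domain
        return block

    def free(self, domain, block):
        owner = self._owner.get(block)
        if owner is None:
            raise DomainMismatch("block %r is not live" % (block,))
        if owner != domain:
            raise DomainMismatch(
                "block %r allocated by %r but freed by %r"
                % (block, owner, domain))
        del self._owner[block]

    def live_blocks(self):
        return len(self._owner)
```

In this model, freeing a "mem" block through the "obj" family raises
immediately, whereas without such bookkeeping the mistake would pass
silently for as long as both families happen to share one allocator.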

>>> The Python test suite doesn't test Python C extensions,
>>> so it's not surprising that it passes :-)
>>
>> What do you mean by "C extensions"? Which modules?
>>
>> Many modules in the stdlib have "C accelerators" and PEP 399 now
>> *requires* testing both the C and Python implementations.

Yes, but those are part of the stdlib. You'd need to check
a few C extensions which are not tested as part of the stdlib,
e.g. numpy, scipy, lxml, pillow, etc. (esp. ones which implement custom
types in C since these will often need the memory management
APIs).

It may also be a good idea to check wrapper generators such
as cython, swig, cffi, etc.

>>>> Instead of teaching developers that well, in fact, PyObject_Malloc()
>>>> is unrelated to object programming, I think that it's simpler to
>>>> modify PyMem_Malloc() to reuse pymalloc ;-)
>>>
>>> Perhaps if you add some guards somewhere :-)
>>
>> We have runtime checks but only implemented in debug mode for efficiency.
>>
>> By the way, I once proposed adding an environment variable that
>> enables these checks without having to recompile Python. Since PEP
>> 445, this has become easy to implement. What do you think?
>> https://www.python.org/dev/peps/pep-0445/#add-a-new-pydebugmalloc-environment-variable
>>
>> "This alternative was rejected because a new environment variable
>> would make Python initialization even more complex. PEP 432 tries to
>> simplify the CPython startup sequence."
>>
>> PEP 432 looks stuck, so I don't think that we should block
>> enhancements because of this PEP. Anyway, my idea should be easy to
>> implement.

I suppose such a flag would create a noticeable runtime
performance hit, since the compiler would no longer be
able to inline the PyMem_*() APIs if you redirect those
APIs to a different set of allocator functions at runtime.

I also don't see much point in carrying around such
baggage in production builds of Python, since you'd most
likely only want to use the tools to debug C extensions during
their development.

>>> Seriously, this may work if C extensions use the APIs
>>> consistently, but in order to tell, we'd need to check
>>> a few.
>>
>> Can you suggest names of projects that must be tested?

See above for a list of starters :-)

It would be good to add a few more that work on text or
larger chunks of memory, since those will most likely utilize
the memory allocators more than other extensions which mostly
wrap (sets of) C variables.

Some of them may also have benchmarks, so in addition to
checking whether they work with the change, you could also
test performance.

>>> I guess the main question then is whether pymalloc is good enough
>>> for general memory allocation needs; and the answer may well be
>>> "yes".
>>
>> What do you mean by "good enough"? For runtime performance,
>> pymalloc looks to be faster than malloc(). What are your other
>> criteria? Memory fragmentation?

Runtime performance, difference in memory consumption (arenas
cannot be freed if there are still small chunks allocated),
memory locality. I'm no expert in this, so can't really
comment much.
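
The arena point can be made concrete with a toy Python model. This is
a sketch under simplifying assumptions, not pymalloc's real data
structures: ToyArena, ToyArenaPool and demo() are hypothetical names,
the arena size of 4 blocks is arbitrary, and freed slots are never
reused.

```python
class ToyArena:
    """One large region obtained from the OS (toy model, counts blocks only)."""

    def __init__(self):
        self.used = 0   # slots handed out since the arena was obtained
        self.live = 0   # slots still in use


class ToyArenaPool:
    """Hypothetical simplification; freed slots are never reused here."""

    def __init__(self, blocks_per_arena):
        self.blocks_per_arena = blocks_per_arena
        self.arenas = []

    def alloc(self):
        # Bump-allocate from the newest arena; obtain a new one when it is full.
        if not self.arenas or self.arenas[-1].used == self.blocks_per_arena:
            self.arenas.append(ToyArena())
        arena = self.arenas[-1]
        arena.used += 1
        arena.live += 1
        return arena  # the "pointer" is just a reference to the owning arena

    def free(self, arena):
        arena.live -= 1
        if arena.live == 0:
            # Only a completely empty arena can be returned to the OS.
            self.arenas.remove(arena)


def demo():
    pool = ToyArenaPool(blocks_per_arena=4)
    blocks = [pool.alloc() for _ in range(12)]   # fills 3 arenas
    survivors = blocks[::4]                      # one block per arena
    for i, block in enumerate(blocks):
        if i % 4 != 0:
            pool.free(block)                     # free the other 9 blocks
    held_while_fragmented = len(pool.arenas)
    for block in survivors:
        pool.free(block)
    return held_while_fragmented, len(pool.arenas)
```

In demo(), nine of the twelve blocks are freed, yet all three arenas
stay allocated because each still holds one live block. They are only
released once the last survivor in each arena is freed.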

I suspect that libc and OS-provided allocators will have
advantages as well, but since pymalloc redirects to them for
all larger memory chunks, it's probably an overall win for
Python C extensions (and Python itself).
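
The redirection for larger chunks can be sketched as a simple size
check. The following Python model is illustrative only: route() is a
hypothetical helper, and the 512-byte threshold and 8-byte alignment
are assumptions about the CPython versions of this era rather than
values stated in this thread.

```python
SMALL_REQUEST_THRESHOLD = 512   # bytes; assumed cut-off for "small" requests
ALIGNMENT = 8                   # bytes; assumed size-class granularity


def route(nbytes):
    """Return (allocator, rounded_size) for a request of nbytes bytes."""
    if nbytes < 1:
        raise ValueError("this toy model only handles positive sizes")
    if nbytes <= SMALL_REQUEST_THRESHOLD:
        # Small requests are rounded up to a size class and served
        # from the small-block allocator.
        size_class = (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT
        return ("pymalloc", size_class)
    # Anything larger is passed through to the system allocator unchanged.
    return ("system", nbytes)
```

Under these assumptions, a 20-byte request lands in the 24-byte size
class, while a 513-byte request goes straight to the system allocator.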

-- 
Marc-Andre Lemburg
eGenix.com