[Numpy-discussion] Proposed change from POSIX to PyMem_RawXXX (plain text resend)

Sebastian Berg sebastian at sipsolutions.net
Thu Jul 22 13:04:58 EDT 2021


On Thu, 2021-07-22 at 16:48 +0000, Daniel Waddington wrote:
> Hi,
> I'm working with Numpy in the context of supporting different memory
> types such as persistent memory and CXL attached.  I would like to
> propose a minor change, but figured I would get some initial feedback
> from the developer community before submitting a PR.

Hi Daniel,

you may want to have a look at Matti's NEP to allow custom allocation
strategies:

    https://numpy.org/neps/nep-0049.html

When implemented, this will allow you to explicitly modify the behaviour
here (which means you could make NumPy use the Python allocator).

In principle, once that work is done, we could also use the Python
allocator as you are proposing.  It may be a follow-up discussion.
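
To make the idea of a swappable allocation strategy concrete, here is a
rough sketch in plain C.  All names in it (`alloc_handler`, `set_handler`,
`counting_malloc`, `data_new`, `data_free`) are made up for illustration;
this is not the NEP's interface and not NumPy code, so please check the
NEP for the real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical handler table: NOT the NEP's or NumPy's real API. */
typedef struct {
    void *(*malloc_fn)(size_t size);
    void (*free_fn)(void *ptr);
} alloc_handler;

/* Default strategy: the C library allocator. */
static alloc_handler default_handler = { malloc, free };
static alloc_handler *current_handler = &default_handler;

/* Install a new handler and return the previous one so it can be restored. */
static alloc_handler *set_handler(alloc_handler *handler)
{
    alloc_handler *previous = current_handler;
    assert(handler != NULL);
    current_handler = handler;
    return previous;
}

/* Example custom strategy: count allocations, then defer to malloc. */
static size_t n_allocs = 0;

static void *counting_malloc(size_t size)
{
    n_allocs++;
    return malloc(size);
}

static alloc_handler counting_handler = { counting_malloc, free };

/* All data allocations go through whichever handler is installed. */
static void *data_new(size_t size)
{
    return current_handler->malloc_fn(size);
}

static void data_free(void *ptr)
{
    current_handler->free_fn(ptr);
}
```

Note that this naive version frees with whatever handler is current at the
time, which is only safe if the handler never changes while buffers are
still alive.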


The difficulty is that the NumPy ABI is fully open:

1. A user can create an array with data they allocated
2. In theory, a user could `realloc` or even replace an array's `data`

In practice, hopefully nobody does the second one, but we can't be
sure.

The first means we have to wait for the NEP, because it will allow us
to work around the problem: We can use different `free`/`realloc` if a
user provided the data.
The second means that we have to be careful when considering changing the
default even after implementing the NEP.  But it may be possible, at
least if we do it slowly/gently.
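
As a minimal sketch of the "different `free` if a user provided the data"
idea, assuming we simply store the matching deallocator next to the
pointer: the names below (`tracked_buffer`, `buffer_new`, `buffer_wrap`,
`buffer_release`, `user_free`) are hypothetical and only illustrate the
bookkeeping, not how NumPy or the NEP actually implement it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record: remembers which function must release `data`. */
typedef struct {
    void *data;
    void (*free_fn)(void *ptr);  /* NULL means "not ours to free" */
} tracked_buffer;

/* Buffer allocated by the library itself with its default allocator. */
static tracked_buffer buffer_new(size_t size)
{
    tracked_buffer buf;
    buf.data = malloc(size);
    buf.free_fn = free;
    return buf;
}

/* Wrap memory supplied by the user together with its own deallocator. */
static tracked_buffer buffer_wrap(void *data, void (*free_fn)(void *ptr))
{
    tracked_buffer buf;
    buf.data = data;
    buf.free_fn = free_fn;
    return buf;
}

/* Release with the function recorded at creation, not a global default. */
static void buffer_release(tracked_buffer *buf)
{
    assert(buf != NULL);
    if (buf->data != NULL && buf->free_fn != NULL) {
        buf->free_fn(buf->data);
    }
    buf->data = NULL;
}

/* Stand-in for a user-side deallocator; counts calls for demonstration. */
static int user_free_calls = 0;

static void user_free(void *ptr)
{
    user_free_calls++;
    free(ptr);
}
```

With this, changing the library's default allocator later does not affect
buffers that were wrapped with their own deallocator.  It does not help
with the second case, where someone swaps `data` behind our back.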

Cheers,

Sebastian


>  
> In multiarray/alloc.c the allocator (beneath the cache) uses the
> POSIX malloc/calloc/realloc/free.  I propose that these should be
> changed to PyMem_RawXXX equivalents.  The reason for this is that by
> doing so, one can use the python custom allocator functions (e.g.
> PyMem_GetAllocator/PyMem_SetAllocator) to intercept the memory
> allocator for NumPy arrays.  This will be useful as heterogeneous
> memories need supporting. I don't think this will drastically change
> performance, but it is an extra function redirection (and it will only
> have an impact when the cache can't deliver).
>  
> There are likely other places in NumPy that could do with a rinse and
> repeat - maybe someone could advise?
>  
> Thanks,
> Daniel Waddington
> IBM Research
>  
> ---
> Example patch for 1.19.x (I'm building with Python3.6)
>  
> diff --git a/numpy/core/src/multiarray/alloc.c b/numpy/core/src/multiarray/alloc.c
> index 795fc7315..e9e888478 100644
> --- a/numpy/core/src/multiarray/alloc.c
> +++ b/numpy/core/src/multiarray/alloc.c
> @@ -248,7 +248,7 @@ PyDataMem_NEW(size_t size)
>      void *result;
>  
>      assert(size != 0);
> -    result = malloc(size);
> +    result = PyMem_RawMalloc(size);
>      if (_PyDataMem_eventhook != NULL) {
>          NPY_ALLOW_C_API_DEF
>          NPY_ALLOW_C_API
> @@ -270,7 +270,7 @@ PyDataMem_NEW_ZEROED(size_t size, size_t elsize)
>  {
>      void *result;
>  
> -    result = calloc(size, elsize);
> +    result = PyMem_RawCalloc(size, elsize);
>      if (_PyDataMem_eventhook != NULL) {
>          NPY_ALLOW_C_API_DEF
>          NPY_ALLOW_C_API
> @@ -291,7 +291,7 @@ NPY_NO_EXPORT void
>  PyDataMem_FREE(void *ptr)
>  {
>      PyTraceMalloc_Untrack(NPY_TRACE_DOMAIN, (npy_uintp)ptr);
> -    free(ptr);
> +    PyMem_RawFree(ptr);
>  
> 
