[Numpy-discussion] Proposal to accept NEP 49: Data allocation strategies

Eric Wieser wieser.eric+numpy at gmail.com
Thu May 6 07:07:26 EDT 2021


The NEP looks good, but I worry the API isn't flexible enough. My two main
concerns are:

### Stateful allocators

Consider an allocator that aligns to `N` bytes, where `N` is configurable
from a Python call in someone else's extension module. Where do they store
`N`?
They can hide it in `PyDataMem_Handler::name`, but that's obviously an abuse
of the API.
They can store it as a global variable, but then the idea of
tracking the allocator used to construct an array doesn't work, as the
state ends up changing with the global allocator.

The easy way out here would be to add a `void* context` field to the
structure, and pass it into all the methods.
This doesn't really solve the problem though, as now there's no way to
clean up any allocations used to populate `context`, or worse, to decrement
references to Python objects stored within `context`.
I think we want to bundle `PyDataMem_Handler` in a `PyObject` somehow,
either via a new C type, or by using the PyCapsule API which has the
cleanup and state hooks we need.
`PyDataMem_GetHandlerName` would then return this PyObject rather than an
opaque name.

For a more exotic case, consider a file-backed allocator that is
constructed from a Python `mmap` object and manages blocks within that
mmap.
The allocator needs to keep a reference to the `mmap` object alive until
all the arrays allocated within it are gone, but probably shouldn't leak a
reference to it either.
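
A minimal Python sketch of that lifetime requirement, assuming a trivial
bump allocator (`MmapArena` is a hypothetical name, and a real handler
would of course live in C): every block is a `memoryview` slice, and each
slice holds a reference to the underlying `mmap`, so the mapping stays
alive exactly as long as some block does.

```python
import mmap


class MmapArena:
    """Hypothetical bump allocator handing out blocks inside one mmap."""

    def __init__(self, mm):
        self._view = memoryview(mm)  # holds a reference to the mmap
        self._offset = 0

    def alloc(self, nbytes):
        end = self._offset + nbytes
        if end > len(self._view):
            raise MemoryError("arena exhausted")
        # The slice shares the mmap's buffer and keeps the mmap alive.
        block = self._view[self._offset:end]
        self._offset = end
        return block


mm = mmap.mmap(-1, 4096)  # anonymous mapping, used here instead of a file
arena = MmapArena(mm)
first = arena.alloc(16)
second = arena.alloc(16)

first[:] = b"x" * 16      # writes land in the mapping itself
assert mm[:16] == b"x" * 16
assert first.obj is mm and second.obj is mm

# Dropping the arena does not invalidate blocks that are still in use.
del arena
assert first.obj is mm
```

Nothing here leaks a reference: once the arena and every block are gone,
nothing refers to the `mmap` any more. The C-level handler would need some
equivalent way for arrays to hold the allocator's state alive.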

### Thread and async-local allocators

For tracing purposes, I expect it to be valuable to be able to configure
the allocator within a single thread / coroutine.
If we want to support this, we'd most likely want to work with the PEP 567
ContextVar API rather than a half-baked thread_local solution that doesn't
work for async code.
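
As a sketch of the behaviour I have in mind, using only the standard
library: `current_allocator` and `traced_work` are hypothetical names, and
the variable holds a plain string here where a real implementation would
hold the handler object. Each task sees its own value even though both run
on the same thread.

```python
import asyncio
import contextvars

# Hypothetical: identifies the allocator active in the current context.
current_allocator = contextvars.ContextVar(
    "current_allocator", default="default"
)


async def traced_work(name):
    token = current_allocator.set(name)
    try:
        # Yield to the event loop so the other task runs in between.
        await asyncio.sleep(0)
        return current_allocator.get()
    finally:
        current_allocator.reset(token)


async def main():
    # gather() wraps each coroutine in a task with its own context copy.
    results = await asyncio.gather(
        traced_work("tracing-a"), traced_work("tracing-b")
    )
    return results, current_allocator.get()


results, outer = asyncio.run(main())
assert results == ["tracing-a", "tracing-b"]
assert outer == "default"
```

A plain thread-local would have let the second task overwrite the first
task's setting, since both share one thread.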

This problem isn't as pressing as the statefulness problem.
Fixing it would amount to extending the `PyDataMem_SetHandler` API, and
would be unlikely to break any code written against the current version of
the NEP, meaning it would be fine to leave as a follow-up.
It might still be worth remarking upon as future work of some kind in the
NEP.


Eric

On Thu, 6 May 2021 at 11:41, Matti Picus <matti.picus at gmail.com> wrote:

> Here is the current rendering of the NEP:
> https://numpy.org/neps/nep-0049.html
>
>
>
> The mailing list discussion, started on April 20, did not bring up any
> objections to the proposal, nor were there objections in the discussion
> around the text of the NEP. There were questions around details of the
> implementation; thank you, reviewers, for carefully looking at them and
> suggesting improvements.
>
>
> If there are no substantive objections within 7 days from this email,
> then the NEP will be accepted; see NEP 0 for more details.
>
>
> Matti