[Numpy-discussion] draft NEP for breaking ufunc ABI in a controlled way

Antoine Pitrou solipsis at pitrou.net
Thu Sep 24 05:34:19 EDT 2015


On Thu, 24 Sep 2015 00:20:23 -0700
Nathaniel Smith <njs at pobox.com> wrote:
> > int PyUFunc_Identity(PyUFuncObject *)
> >
> >   Replaces ufunc->identity.
> 
> Hmm, I can imagine cases where we might want to change how this works.
> (E.g. if np.dot were a ufunc then the existing identity settings
> wouldn't work very well... and I have some vague memory that there
> might already be some delicate code in a few places because of
> difficulties in defining "zero" and "one" for arbitrary dtypes.)

Yes... As long as there is a way for us to set the identity value
(whatever the exact API) when constructing a ufunc, it should be ok.
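
To illustrate the shape I have in mind (identity supplied once at
construction, then read back through an accessor), here is a minimal
self-contained sketch. All names in it (toy_ufunc, toy_ufunc_new,
toy_ufunc_identity, the TOY_IDENTITY_* values) are made up for the
example and are not part of NumPy or of the draft:

```c
#include <stdlib.h>

/* Hypothetical identity kinds, loosely modelled on ufunc->identity. */
enum toy_identity { TOY_IDENTITY_ZERO, TOY_IDENTITY_ONE, TOY_IDENTITY_NONE };

/* With an opaque API this definition would live in a private header,
   so callers could not read the field directly. */
typedef struct toy_ufunc {
    const char *name;            /* borrowed, not copied */
    enum toy_identity identity;
} toy_ufunc;

/* The identity is given once, when the ufunc is constructed.
   Returns NULL if allocation fails. */
toy_ufunc *toy_ufunc_new(const char *name, enum toy_identity identity)
{
    toy_ufunc *uf = malloc(sizeof *uf);
    if (uf == NULL)
        return NULL;
    uf->name = name;
    uf->identity = identity;
    return uf;
}

/* Accessor playing the role of the proposed PyUFunc_Identity(). */
int toy_ufunc_identity(const toy_ufunc *uf)
{
    return (int)uf->identity;
}

void toy_ufunc_free(toy_ufunc *uf)
{
    free(uf);
}
```

The point is only that how the identity is stored can change later
without affecting callers, as long as the constructor and the accessor
keep their meaning.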

> I assume the 'i' part isn't actually interesting here (since there's
> no longer any parallel vector of function pointers accessible), and
> the high-level semantics that you're looking for are "please give me
> the set of signatures that have a loop defined"?

Indeed.

> [Edit: Also, see the discussion below about integer type pointers. The
> consequences here are that we can certainly provide an operation like
> this, but if we do then we might be abandoning it in a few releases
> (e.g. it might start telling you about only a subset of defined
> signatures). So can you expand a bit on what you mean by "would be
> nice" above?]

"Would be nice" really means "we could make use of it" for letting the
user access ufunc metadata. We don't *need* it currently. But
generally being able to query the high-level properties of a ufunc,
from C, sounds like a good thing, and perhaps other people would be
interested.
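
As a rough sketch of the kind of query I mean ("give me the set of
signatures that have a loop defined"), assuming a toy loop table rather
than the real ufunc internals. The names loop_sig, loop_table,
loop_table_add and loop_table_signatures are hypothetical, and the
single-character type codes are only illustrative:

```c
#include <stddef.h>
#include <string.h>

#define SIG_MAX_ARGS 4
#define SIG_MAX_LOOPS 16

/* One registered loop: type codes of the inputs followed by the outputs. */
typedef struct {
    char types[SIG_MAX_ARGS];
} loop_sig;

typedef struct {
    int nargs;                      /* nin + nout */
    size_t nloops;
    loop_sig loops[SIG_MAX_LOOPS];
} loop_table;

/* Register a loop. Returns 0 on success, -1 if the table is full or
   nargs does not fit in a loop_sig. */
int loop_table_add(loop_table *t, const char *types)
{
    if (t->nargs < 0 || t->nargs > SIG_MAX_ARGS)
        return -1;
    if (t->nloops >= SIG_MAX_LOOPS)
        return -1;
    memcpy(t->loops[t->nloops].types, types, (size_t)t->nargs);
    t->nloops++;
    return 0;
}

/* Copy at most `cap` signatures into `out`. Returns the total number of
   signatures defined, so the caller can detect truncation. */
size_t loop_table_signatures(const loop_table *t, loop_sig *out, size_t cap)
{
    size_t n = t->nloops < cap ? t->nloops : cap;
    if (n > 0)
        memcpy(out, t->loops, n * sizeof *out);
    return t->nloops;
}
```

Note that the caller never sees an index into a vector of function
pointers, only the signatures themselves.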

> > PyObject *PyUFunc_SetObject(PyUFuncObject *, PyObject *)
> >
> >   Sets the ufunc's "object" to the given object.  The object has no
> >   special semantics except that it is DECREF'ed when the ufunc is
> >   deallocated (this is today's ufunc->obj).  The DECREF should happen
> >   only after the ufunc has accessed any internal resources (since the
> >   DECREF could deallocate some of those resources).
> 
> I understand why you need a "base" object like this for individual
> loops, but if ufuncs start managing the ufunc-level memory buffers
> internally, then is this still useful? I guess I'm curious to see an
> example.

Well, for example, we dynamically allocate the ufunc's name (and
possibly its docstring), so we need to deallocate it when the ufunc is
destroyed.  Actually, we should probably deallocate more stuff that we
currently don't (such as the execution environment)...
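
A simplified model of the ordering constraint from the quoted
description: the ufunc only borrows the name, the "object" owns it, and
the reference to the object is dropped last. This is a sketch with a
hand-rolled reference count, not CPython or NumPy code. The names owner,
owned_ufunc, owner_new, owner_decref and owned_ufunc_dealloc are all
hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the object stored in ufunc->obj: it owns the
   heap-allocated name and frees it when its count drops to zero. */
typedef struct {
    int refcount;
    char *name;
} owner;

typedef struct {
    const char *name;   /* borrowed: points into obj->name */
    owner *obj;         /* strong reference, released on dealloc */
} owned_ufunc;

/* Create an owner holding a private copy of `name` (refcount 1).
   Returns NULL if allocation fails. */
owner *owner_new(const char *name)
{
    owner *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->name = malloc(strlen(name) + 1);
    if (o->name == NULL) {
        free(o);
        return NULL;
    }
    strcpy(o->name, name);
    o->refcount = 1;
    return o;
}

/* Drop one reference. Returns 1 if this call destroyed the owner. */
int owner_decref(owner *o)
{
    if (--o->refcount > 0)
        return 0;
    free(o->name);
    free(o);
    return 1;
}

/* Tear down the ufunc. `last_len` receives the name length read during
   teardown, standing in for any final use of the borrowed resources. */
int owned_ufunc_dealloc(owned_ufunc *uf, size_t *last_len)
{
    *last_len = strlen(uf->name);   /* 1. last access to borrowed data */
    uf->name = NULL;
    return owner_decref(uf->obj);   /* 2. only now release the owner */
}
```

Swapping steps 1 and 2 would read the name after it has been freed,
which is exactly why the DECREF has to come last.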

> > PyObject *PyUFunc_GetObject(PyUFuncObject *)
> >
> >   Return the ufunc's current "object".
> 
> Oh, are you planning to actually use this to attach some arbitrary
> metadata, not just attach deallocation callbacks?

No, just deallocation callbacks. I was including the GetObject function
for completeness; I'm not sure we would need it (but it sounds trivial
to provide and maintain).

> Hmm, that's an interesting and tricky point, actually -- I think the
> way it will work eventually is that signatures will be specified in
> terms of "dtypetypes" (i.e., subclasses of dtype, rather than ints
> *or* instances of dtype = PyArray_Descrs).

Subclasses? I'm not sure what you mean by that. How would one specify
e.g. an int64 vs. an int32?

Are you referring to Travis' dtypes-as-classes project, or something
similar? In that case though, a dtype would still be an instance of a
"dtypetype" (metatype), not a subclass :-)

> But I guess that's just a
> challenge we'll have to think about when implementing this stuff --
> either it means that the new ufunc API will have to wait a bit for
> more of the new dtype machinery to be ready, or we'll have to
> temporarily bridge the gap with a loop registration API that takes
> new-style loop callbacks but uses int signatures (and then later turn
> it into a thin wrapper around the final API).

Well, as long as you keep the int typecodes in NumPy (and I guess
you will for quite some time, for compatibility), bridging should be
easy indeed.
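
For what it's worth, here is how I picture such a bridge, as a toy
sketch: a registration function taking descriptor objects, plus a thin
wrapper that accepts int typecodes and looks the descriptors up. The
types and functions here (toy_descr, toy_signature, register_loop,
descr_from_typecode, register_loop_from_typecodes) and the typecode
numbering are invented for the example and do not match NumPy's:

```c
#include <stddef.h>

#define BRIDGE_MAX_ARGS 4

/* Hypothetical "new-style" type descriptor. */
typedef struct {
    int typecode;
    const char *name;
} toy_descr;

/* Made-up table mapping int typecodes to descriptors. */
static const toy_descr toy_descrs[] = {
    {0, "int32"},
    {1, "int64"},
    {2, "float64"},
};

typedef struct {
    const toy_descr *args[BRIDGE_MAX_ARGS];
    int nargs;
} toy_signature;

/* "Final" API: the signature is given as descriptor objects. */
int register_loop(toy_signature *slot, const toy_descr *const *descrs,
                  int nargs)
{
    if (nargs < 0 || nargs > BRIDGE_MAX_ARGS)
        return -1;
    for (int i = 0; i < nargs; i++) {
        if (descrs[i] == NULL)
            return -1;
        slot->args[i] = descrs[i];
    }
    slot->nargs = nargs;
    return 0;
}

/* Look up the descriptor for an int typecode, or NULL if unknown. */
const toy_descr *descr_from_typecode(int typecode)
{
    size_t n = sizeof toy_descrs / sizeof toy_descrs[0];
    for (size_t i = 0; i < n; i++) {
        if (toy_descrs[i].typecode == typecode)
            return &toy_descrs[i];
    }
    return NULL;
}

/* Bridge: accepts int typecodes and forwards to register_loop(). */
int register_loop_from_typecodes(toy_signature *slot, const int *typecodes,
                                 int nargs)
{
    const toy_descr *descrs[BRIDGE_MAX_ARGS];
    if (nargs < 0 || nargs > BRIDGE_MAX_ARGS)
        return -1;
    for (int i = 0; i < nargs; i++) {
        descrs[i] = descr_from_typecode(typecodes[i]);
        if (descrs[i] == NULL)
            return -1;
    }
    return register_loop(slot, descrs, nargs);
}
```

The wrapper contains no logic of its own beyond the lookup, so it can
stay around for compatibility without constraining the final API.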

Regards

Antoine.




