[Numpy-discussion] Warnings in numpy.ma.test()

Wed Mar 17 19:26:58 EDT 2010

On Wed, Mar 17, 2010 at 5:43 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> On Wed, Mar 17, 2010 at 3:13 PM, Darren Dale <dsdale24 at gmail.com> wrote:
>> On Wed, Mar 17, 2010 at 4:48 PM, Pierre GM <pgmdevlist at gmail.com> wrote:
>> > On Mar 17, 2010, at 8:19 AM, Darren Dale wrote:
>> >>
>> >> I started thinking about a third method called __input_prepare__ that
>> >> would be called on the way into the ufunc, which would allow you to
>> >> intercept the input and pass a somehow modified copy back to the
>> >> ufunc. The total flow would be:
>> >>
>> >> 1) Call myufunc(x, y[, z])
>> >> 2) myufunc calls ?.__input_prepare__(myufunc, x, y), which returns x',
>> >> y' (or simply passes through x,y by default)
>> >> 3) myufunc creates the output array z (if not specified) and calls
>> >> ?.__array_prepare__(z, (myufunc, x, y, ...))
>> >> 4) myufunc finally gets around to performing the calculation
>> >> 5) myufunc calls ?.__array_wrap__(z, (myufunc, x, y, ...)) and returns
>> >> the result to the caller
>> >>
>> >> Is this general enough for your use case? I haven't tried to think
>> >> about how to change some global state at one point and change it back
>> >> at another; that seems like a bad idea and difficult to support.
>> >
>> >
>> > Sounds like a good plan. If we could find a way to merge the first two
>> > (__input_prepare__ and __array_prepare__), that'd be ideal.
>>
>> I think it is better to keep them separate, so we don't have one
>> method that is trying to do too much. It would be easier to explain in
>> the documentation.
>>
>> I may not have much time to look into this until after Monday. Is
>> there a deadline we need to consider?
>>
>
> I don't think this should go into 2.0; I think it needs more thought.

Now that you mention it, I agree that it would be too rushed to try to
get it in for 2.0. Concerning a later release, is there anything in
particular that you think needs to be clarified or reconsidered?

> And
> 2.0 already has significant code churn. Is there any reason beyond a big
> hassle not to set/restore the error state around all the ufunc calls in ma?
> Beyond that, the PEP that you pointed to looks interesting. Maybe some sort
> of decorator around ufunc calls could also be made to work.
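
For reference, setting and restoring the error state around a ufunc
call, as suggested in the quoted text, might look roughly like the
sketch below. It relies on np.errstate, which saves the error settings
on entry and restores them on exit. The decorator name with_errstate
and the function quiet_divide are hypothetical, made up for this
illustration; this is not how numpy.ma is implemented.

```python
import functools

import numpy as np


def with_errstate(**err_kwargs):
    """Hypothetical decorator: run the wrapped function under np.errstate."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # np.errstate saves the current error settings on entry and
            # restores them on exit, even if func raises.
            with np.errstate(**err_kwargs):
                return func(*args, **kwargs)
        return wrapper
    return decorator


@with_errstate(divide="ignore", invalid="ignore")
def quiet_divide(a, b):
    # Hypothetical stand-in for a ufunc call made inside ma.
    return np.divide(a, b)


before = np.geterr()
result = quiet_divide(np.array([1.0, 0.0]), np.array([0.0, 0.0]))
after = np.geterr()  # same settings as before the call
```

The hassle mentioned above is that every ufunc call site in ma would
need this treatment, either inline or through such a decorator.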

I think the PEP is interesting, but it is languishing. There were some
questions and criticisms on the mailing list that I do not think were
satisfactorily addressed, and as far as I know the author of the PEP
has not pursued the matter further. There was some interest on the
python-dev mailing list in the numpy community's use case, but I think
we need to consider what can be done now to meet the needs of ndarray
subclasses. I don't see PEP 3124 happening in the near future.

What I am proposing is a simple extension to our existing framework to
let subclasses hook into ufuncs and customize their behavior based on
the context of the operation (using the __array_priority__ of the
inputs and/or outputs, and the identity of the ufunc). The flow I
listed allows customization at each critical stage: prepare the input,
prepare the output, populate the output (currently no proposal for
customization here), and finalize the output. The only additional step
proposed is to prepare the input.
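
The five steps listed earlier can be sketched in pure Python as follows.
This is only an illustration under stated assumptions: __input_prepare__
is the proposed hook and does not exist in NumPy, call_ufunc is a
hypothetical driver standing in for the ufunc machinery, and Quantity is
a toy class rather than an ndarray subclass. The real machinery picks
the object whose hooks are called via __array_priority__; the sketch
simply uses the first Quantity it finds and assumes at least one input
is a Quantity.

```python
import operator


class Quantity:
    """Toy array-like (hypothetical) carrying a list of values and a unit."""

    def __init__(self, values, unit=None):
        self.values = list(values)
        self.unit = unit

    def __input_prepare__(self, ufunc, *inputs):
        # Step 2 (proposed hook): hand the ufunc plain values to work on.
        return tuple(getattr(x, "values", x) for x in inputs)

    def __array_prepare__(self, out, context):
        # Step 3: set up the output before the calculation runs.
        out.unit = self.unit
        return out

    def __array_wrap__(self, out, context):
        # Step 5: finalize the output on the way back to the caller.
        return out


def call_ufunc(op, x, y):
    """Hypothetical driver mimicking the five steps for a binary operation."""
    # Simplification: use the first Quantity instead of __array_priority__.
    owner = x if isinstance(x, Quantity) else y
    xp, yp = owner.__input_prepare__(op, x, y)        # step 2
    z = Quantity([0] * len(xp))                       # step 3: allocate output
    z = owner.__array_prepare__(z, (op, x, y))
    z.values = [op(a, b) for a, b in zip(xp, yp)]     # step 4: calculate
    return owner.__array_wrap__(z, (op, x, y))        # step 5


a = Quantity([1, 2, 3], unit="m")
b = Quantity([10, 20, 30], unit="m")
c = call_ufunc(operator.add, a, b)                    # step 1
```

Keeping the input hook separate from __array_prepare__ means each method
has one job: the first sees only inputs, the second only the allocated
output.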

In the long run, we could consider whether ufuncs should be instances
of a class, perhaps implemented in Cython. That way the ufunc would be
able to pass itself to the special array methods as part of the context
tuple, as is currently done. An alternative approach would be for
ufuncs to provide methods through which subclasses could register
routines for the various steps I listed, keyed on the types of the
inputs, similar to the PEP. The ufunc would then determine the context
from the input. (Currently the ufunc determines only part of the
context, by inspecting the __array_priority__ of the inputs; the input
with the highest priority then determines the rest, based on the
identity of the ufunc and the remaining inputs.) This new (half-baked)
approach could be backward-compatible with the old one: if the
combination of inputs isn't found in the registry, the ufunc would fall
back on the existing __input_prepare__/__array_prepare__/__array_wrap__
mechanisms (which in principle could then be deprecated, at which point
__array_priority__ might no longer be necessary). I don't see anything
to indicate that we would regret implementing a special
__input_prepare__ method down the road.
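
A rough sketch of the registry idea, in pure Python and under loud
assumptions: RegistryUfunc, its register method, and the Meters type are
all hypothetical names invented for this illustration, and the fallback
callable merely stands in for the existing __array_prepare__ and
__array_wrap__ path. Nothing like this exists in NumPy today.

```python
class RegistryUfunc:
    """Hypothetical ufunc-like object with a per-type handler registry."""

    def __init__(self, name, fallback):
        self.name = name
        self._fallback = fallback   # stands in for the existing hook mechanism
        self._registry = {}

    def register(self, *types):
        """Register a handler for one exact combination of input types."""
        def decorator(handler):
            self._registry[types] = handler
            return handler
        return decorator

    def __call__(self, *inputs):
        # The ufunc itself determines the context from the input types.
        key = tuple(type(x) for x in inputs)
        # Unregistered combinations fall back on the old mechanism.
        handler = self._registry.get(key, self._fallback)
        return handler(*inputs)


class Meters(float):
    """Toy subclass used only to demonstrate dispatch."""


add = RegistryUfunc("add", fallback=lambda x, y: float(x) + float(y))


@add.register(Meters, Meters)
def _add_meters(x, y):
    # Registered routine: keep the subclass on the way out.
    return Meters(float(x) + float(y))


total = add(Meters(1.5), Meters(2.0))   # handled by the registered routine
plain = add(Meters(1.5), 2.0)           # not registered, uses the fallback
```

Matching on exact type tuples is the simplest possible lookup; a real
design would presumably also need rules for subclasses and for output
arguments.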

Darren