[SciPy-dev] scipy.stats._chk_asarray

Wed Jun 3 00:07:16 EDT 2009

On Tue, Jun 2, 2009 at 11:40 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Tue, Jun 2, 2009 at 9:09 PM, <josef.pktd at gmail.com> wrote:
>>
>> On Tue, Jun 2, 2009 at 5:58 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> > On Tue, Jun 2, 2009 at 16:20, ctw <lists.20.chth at xoxy.net> wrote:
>> >>> Please revert that.
>> >>
>> >> Done! Sorry about that. I am having some issues with the current
>> >> behavior of changing all inputs to ndarrays. Would it be possible to
>> >> add a nanmean function to numpy that behaves just as np.nansum in the
>> >> sense that it preserves the type of the input?
>> >
>> > I would prefer a comprehensive approach rather than hacking in the one
>> > function you want. I wouldn't be opposed to more NaN-aware functions
>> > in numpy if they were corralled into their own module. However, that
>> > leaves all of the rest of scipy.stats untouched.
>> >
>> > Alternately, you could help write a decorator that would wrap a
>> > function to cast its arguments to ndarrays (bonus points: any
>> > specified subclass) and then cast the result(s) back to the
>> > appropriate subclass determined by the inputs' classes according to
>> > the ufunc rules. You just have to be careful to deal with functions
>> > that take multiple arraylike and non-arraylike inputs and return
>> > multiple outputs (some of which aren't arraylike, either). This would
>> > take some care, but would be a great asset to numpy.
>> >
>>
>> I tried to see whether I could introduce a second version, _check_asanyarray,
>> that doesn't convert to a basic np.array, but I didn't get very far.
>> nanmedian and nanstd are not easy to convert to work with matrices:
>> nanstd uses multiplication (which is matrix multiplication for
>> np.matrix) and nanmedian uses np.compress.
>>
>> I usually avoid matrices because it is too confusing in numpy to keep
>> track of the type for the basic operations.
>>
>> As an alternative, I looked at np.core.fromnumeric._wrapit, which is
>> the wrapper for np.mean
>>
>> Doing a variation on it seems to work for matrices, see below. I
>> haven't tried it on other array types. This is just a trial balloon to
>> see whether this would make sense for some of the stats functions. It
>> would be relevant mostly for the descriptive statistics: the
>> statistical tests just return test statistics and p-values, and the plan
>> for models is that they get explicit array subclass handling.
>>
>> Is it a good idea to try to work this way?
>> And what is the best way to check whether an array is a plain ndarray
>> and not a subclass instance?
>> Something like this?
>> >>> isinstance(np.matrix(range(4)),np.ndarray)
>> True
>> >>> np.matrix(range(4)).__class__ is np.ndarray
>> False
>> >>> np.arange(5).__class__ is np.ndarray
>> True
>
> The linear algebra routines do a lot of that _wrapit stuff so that they can
> handle both ndarrays and matrices. They might be useful examples.
>

Thanks for the pointer; this looks simple enough for me. Do you also
have some easily readable examples of how __array_priority__ is used in
the multi-input case?
(Even though in scipy.stats most multi-input cases will involve inputs
of the same type.)
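
To make the question concrete, here is a rough sketch of the kind of
wrapper discussed above: strip the inputs down to plain ndarrays, run
the function, then view any ndarray result as the class of the input
with the highest __array_priority__. The names preserve_subclass,
nanmean_sketch and add_sketch are hypothetical and exist only for this
illustration; they are not numpy or scipy functions. I am assuming that
picking the highest-priority input is close enough to the ufunc rule,
and that a plain view() is an acceptable way to re-wrap (it would drop
extra state such as a mask, so it is not a general solution).

```python
import functools

import numpy as np


def preserve_subclass(func):
    """Hypothetical decorator: compute on plain ndarrays, re-wrap the result."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Only positional ndarray (sub)class instances take part in the choice.
        arrays = [a for a in args if isinstance(a, np.ndarray)]
        # Assumption: the input with the highest __array_priority__ decides
        # the output class (plain ndarray reports 0.0, np.matrix 10.0).
        winner = max(
            arrays,
            key=lambda a: getattr(a, "__array_priority__", 0.0),
            default=None,
        )
        # np.asarray drops the subclass, so func only ever sees base ndarrays.
        plain = [np.asarray(a) if isinstance(a, np.ndarray) else a for a in args]
        result = func(*plain, **kwargs)

        if winner is None or type(winner) is np.ndarray:
            return result  # nothing to restore

        def rewrap(out):
            # Non-array outputs (scalars, strings, ...) pass through untouched.
            return out.view(type(winner)) if isinstance(out, np.ndarray) else out

        if isinstance(result, tuple):
            return tuple(rewrap(out) for out in result)
        return rewrap(result)

    return wrapper


@preserve_subclass
def nanmean_sketch(x, axis=0):
    """Illustrative NaN-ignoring mean along an axis; not scipy.stats.nanmean."""
    x = np.asarray(x, dtype=float)
    mask = np.isnan(x)
    total = np.where(mask, 0.0, x).sum(axis=axis)  # NaNs contribute zero
    count = (~mask).sum(axis=axis)                 # number of non-NaN entries
    return total / count


@preserve_subclass
def add_sketch(x, y):
    """Two-input example: elementwise sum computed on plain ndarrays."""
    return x + y


m = np.matrix([[1.0, np.nan], [3.0, 4.0]])
print(type(nanmean_sketch(m, axis=0)).__name__)
print(type(add_sketch(np.ones((2, 2)), m)).__name__)
```

With a matrix input the result comes back as a matrix, and with a plain
ndarray input it stays a plain ndarray. In the mixed case the matrix
wins because its priority is higher. Note that the check for "plain
ndarray" here uses type(x) is np.ndarray rather than isinstance, since
isinstance is also True for every subclass.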

Josef

> Chuck