[SciPy-dev] RFR: Proposed fixes in scipy.stats functions for calculation of variance/error/etc.

josef.pktd at gmail.com josef.pktd at gmail.com
Mon Oct 26 01:51:44 EDT 2009


On Mon, Oct 26, 2009 at 1:31 AM, Ariel Rokem <arokem at berkeley.edu> wrote:
> Hi Josef -
>
>>
>> From looking at the three functions, I would assume that the combined
>> function would have a signature like
>>
>> def zscore(a, compare=None, axis=0, ddof=0)
>>
>> or two functions, one with compare, one without ?
>
> Yes - I think that would be best. After all, someone wrote zmap with
> some usecase in mind (I assume), so we would still want that
> functionality to live on explicitly. So, I suggest (see attached diff)
> to have two functions: one will be zscore and the other would be
> zscore_compare. In the attached diff, I have decorated all these
> functions with a deprecation warning and added these two new
> functions, zscore (with the new, by-axis behavior. This makes more
> sense to me, somehow) and zscore_compare.
>
>>
>>
>> About default axis=0:
>>
> ...
>
> Thanks for the explanation and for digging into the history of this. I
> still think that in the long run it would be preferable to have these
> things be internally consistent (that is consistent between numpy and
> scipy), rather than consistent with other tools.
>
> Finally - I have tried to combine sem and stderr into one function,
> under sem. Notice in particular the correction for ddof. My
> understanding is that this should produce per default the result
> std/sqrt(n-1), which is what we usually want for the sem. Is that
> correct?


Yes. I had to check this for the t-tests, which is when I spent more time
checking the degrees of freedom. It looks like the denominator needs one "n"
and one "n-1":

 v = np.var(a, axis, ddof=1)
 t = d / np.sqrt(v/float(n))

The signature sem(a, ddof=1, axis=0) should instead be sem(a, axis=0, ddof=1),
with ddof as the last argument, to match np.var.
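
A minimal sketch of sem with ddof as the last argument (sem_sketch is a
hypothetical name, not the function in the attached diff). It uses one "n-1"
inside the standard deviation and one "n" under the square root, and the
comparison at the end illustrates that this is the same as std/sqrt(n-1)
when std is computed with ddof=0:

```python
import numpy as np


def sem_sketch(a, axis=0, ddof=1):
    """Standard error of the mean along `axis` (illustrative only)."""
    a = np.asarray(a, dtype=float)
    n = a.shape[axis]
    # ddof=1 puts "n-1" in the variance; the mean contributes the extra "n".
    return np.std(a, axis=axis, ddof=ddof) / np.sqrt(n)


a = np.array([[1.0, 2.0], [3.0, 5.0], [8.0, 13.0]])

# Unbiased std divided by sqrt(n) ...
via_unbiased = sem_sketch(a, axis=0)
# ... equals biased std (ddof=0) divided by sqrt(n-1).
via_biased = np.std(a, axis=0, ddof=0) / np.sqrt(a.shape[0] - 1)
```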

Your axis handling in zscore is still incorrect for 2d arrays.

If axis=1, then we need to add an axis back so that the mean broadcasts
against a:
a.mean(1)[:,None]

There is a function in numpy to do this, expand_dims, that
works for a general axis. There was also a recent discussion
on the numpy list about getting the axis back after a reduce.
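
A rough sketch of the axis handling with np.expand_dims (zscore_sketch is a
hypothetical name and only illustrates the broadcasting, it is not the
function from the diff). The reduced mean and standard deviation get the
dropped axis back, so they broadcast against a for any axis:

```python
import numpy as np


def zscore_sketch(a, axis=0, ddof=0):
    """Z-score along `axis`, restoring the reduced axis for broadcasting."""
    a = np.asarray(a, dtype=float)
    # mean/std drop `axis`; expand_dims re-inserts it with length 1.
    mean = np.expand_dims(a.mean(axis=axis), axis)
    std = np.expand_dims(a.std(axis=axis, ddof=ddof), axis)
    return (a - mean) / std


x = np.arange(12.0).reshape(3, 4)
z_rows = zscore_sketch(x, axis=1)  # standardize each row
z_cols = zscore_sketch(x, axis=0)  # standardize each column
```

For axis=1 on a 2d array, np.expand_dims(x.mean(1), 1) gives the same result
as x.mean(1)[:, None].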

Josef



>
>  Cheers,
>
> Ariel
>


