[SciPy-dev] State of stats modules?

Magnus Lyckå magnus at thinkware.se
Sat Nov 3 09:47:55 EST 2001


Hi, I'm a new scipy user, and I'm still a little lost...
I've looked through the mailing list archives and
the web site, but I guess I might have missed some info
I should have spotted... Anyway, I'll ask some questions
here... I start with this one:

What's the state of the stats module?

Picking one at random, I tried the aanova function and got:

Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "G:\Python21\scipy\stats\stats.py", line 3933, in aanova
     M = pstat.collapse(data,Bscols,-1,None,None,mean)
TypeError: collapse() takes at most 5 arguments (6 given)

collapse wants:
def collapse (listoflists,keepcols,collapsecols,sterr=None,ns=None):

In other words... No mean...
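
For illustration, the mismatch can be reproduced with a stand-in function.
This is only a sketch: the signature is copied from above, the body is a
placeholder, and the small mean helper is hypothetical. The exact wording
of the TypeError differs between Python versions.

```python
# Stand-in with the same signature as quoted above; the body is a placeholder,
# not the real pstat.collapse.
def collapse(listoflists, keepcols, collapsecols, sterr=None, ns=None):
    return None


def mean(values):
    # Hypothetical helper, present only so there is a sixth argument to pass.
    return sum(values) / len(values)


try:
    # Six positional arguments against a five-parameter signature.
    collapse([[1, 2]], [], [0], None, None, mean)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```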

Removing the last parameter in lines 3933, 3988 and 4372 I get
a result, but I won't stick my neck out and say that it's correct
(or not).

Also, what happens is that the aanova function uses "print" to
display the result. Fine for interactive use if I just want to
see how things stand, but less amusing if I want to use the
results from the anova programmatically in further computations,
or for plotting or whatever. The return value from the function
is None. :-(
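
As a sketch of what "return instead of print" could look like: the function
below computes an ordinary one-way between-groups F ratio and hands the
numbers back to the caller. It is NOT the aanova algorithm, and the name
oneway_f and its return tuple are assumptions made for this example only.

```python
def oneway_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups design.

    Hypothetical helper for illustration; not part of scipy.stats.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Between-groups part: group mean against grand mean, weighted by n.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Within-groups part: each value against its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# The caller decides whether to print, plot or compute further.
f_ratio, df_b, df_w = oneway_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print("F(%d, %d) = %.3f" % (df_b, df_w, f_ratio))
```

Printing then becomes a thin layer on top of the returned values, so the
interactive use stays as convenient as before.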

There is obviously code in stats.py that assumes that collapse can
work with functions other than the mean, which is all it seems to handle now:

collapsed = pstat.collapse(M,btwcols,-1,None,len,mean)
# Obviously needed for-loop to get source cell-means embedded in collapse fcns
contrastmns = pstat.collapse(collapsed,btwsourcecols,-2,sterr,len,mean)
# Collapse again, this time SUMMING instead of averaging (to get cell Ns)
contrastns = pstat.collapse(collapsed,btwsourcecols,-1,None,None,
                             sum)
# Collapse again, this time calculating harmonicmeans (for hns)
contrasthns = pstat.collapse(collapsed,btwsourcecols,-1,None,None,
                              harmonicmean)

I assume that it's the following snippet in pstat.collapse that should
use the sixth parameter:

     if keepcols == []:
         means = [0]*len(collapsecols)
         for i in range(len(collapsecols)):
             avgcol = colex(listoflists,collapsecols[i])
             means[i] = stats.mean(avgcol)  # <=== supplied function here?
         return means
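
A minimal sketch of that idea, covering only the keepcols == [] branch
quoted above. The name collapse_all and the fcn parameter are assumptions,
colex here is a simplified stand-in for pstat.colex, and the functions come
from Python's standard statistics module rather than from stats.py:

```python
from statistics import harmonic_mean, mean


def colex(listoflists, cnum):
    """Extract column cnum from a list of rows (stand-in for pstat.colex)."""
    return [row[cnum] for row in listoflists]


def collapse_all(listoflists, collapsecols, fcn=mean):
    """Apply fcn to each listed column; covers only the keepcols == [] case."""
    return [fcn(colex(listoflists, col)) for col in collapsecols]


rows = [[1, 10], [2, 20], [4, 40]]
print(collapse_all(rows, [0, 1]))                     # default: column means
print(collapse_all(rows, [0, 1], fcn=sum))            # column sums
print(collapse_all(rows, [0, 1], fcn=harmonic_mean))  # harmonic means
```

Defaulting fcn to the mean keeps existing five-argument calls working while
letting the calls that pass sum or harmonicmean do what they appear to
expect. Whether that matches the intended design of collapse is a guess.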

But I don't feel that I understand enough of statistics or the plan here to
patch things on my own...

Anyone caring for this?


--
Magnus Lyckå, Thinkware AB
Älvans väg 99, SE-907 50 UMEÅ
tel 070-582 80 65, fax: 070-612 80 65
http://www.thinkware.se/  mailto:magnus at thinkware.se



