[SciPy-dev] State of stats modules?
Gary Strangman
strang at nmr.mgh.harvard.edu
Mon Nov 5 13:15:09 EST 2001
> Most of the stats module was written by Gary Strangman
> (http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/). As far as I
> know, it's the most full featured stats module around, and has very good
> doc-strings. Gary developed it for his own research, and, as such, it is
> somewhat specialized to his field. Still, it is very usable, and, at least
> for the functions I have used, reliable.
Specialized it is, and I have variable confidence in the various functions
(some are much more used--read: better tested--than others).
> Gary's work was/is an excellent starting point for SciPy's statistics
> capabilities. Most of the work needed is actually trimming out extra
> functionality not needed or duplicated, adding unit test functions, and
> assuring that functions behave similarly in calling convention to other
> Numeric/SciPy functions. The new_stats.py module is the beginnings of this
> effort, but it hasn't had any attention in a while. There are also the
> beginnings of some unit testing in the stats/tests directory. Hopefully a
> full complement of unit tests will develop so there are fewer questions
> about result validity.
This would be outstanding ... particularly the unit testing. I've done
some, but way too little.
> aanova and collapse:
>
> I haven't used these, and don't know much about them. I'll forward this to
> Gary and see if he has any comments.
aanova() was a simple analysis of variance function, commonly
used in behavioral-type research but broadly applicable. It was written
when I was learning about anovas in grad school, and hence is poorly
written, poorly tested, and non-optimized. (It worked for the stuff I
needed, when I needed it, but I have pulled the function from more
recent versions of my module out of my own concerns about its adequacy and
hence utility.)
collapse() is a generic function to collapse over rows of a data file. It
finds unique combinations of values in the columns specified by keepcols
and for each such unique combination it calculates a collapse-function
(mean, sterr, user-defined) for each column specified in collapsecols.
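
To make that description concrete, here is a minimal sketch of the idea in
plain Python. It is not the code from pstat.py: the name collapse_rows, its
argument order, and the sorted output are assumptions made for illustration
only, and the rows are assumed to be lists indexed by column number.

```python
from collections import defaultdict
from statistics import mean


def collapse_rows(rows, keepcols, collapsecols, fcn=mean):
    """Hypothetical helper (not the pstat.py API): group rows by the
    values found in keepcols, then apply fcn to each column listed in
    collapsecols within every group."""
    groups = defaultdict(list)
    for row in rows:
        # The unique combination of values in the kept columns is the key.
        key = tuple(row[c] for c in keepcols)
        groups[key].append(row)

    result = []
    for key in sorted(groups):
        members = groups[key]
        # One collapsed value per requested column, per unique key.
        collapsed = [fcn([r[c] for r in members]) for c in collapsecols]
        result.append(list(key) + collapsed)
    return result


rows = [
    [1, "a", 10.0, 1.0],
    [1, "a", 20.0, 3.0],
    [1, "b", 30.0, 5.0],
    [2, "a", 40.0, 7.0],
]
table = collapse_rows(rows, keepcols=[0, 1], collapsecols=[2, 3])
```

Any callable that takes a list of values can be passed as fcn (a standard
error function, for instance), which corresponds to the "user-defined" case
above.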
> The stats module deserves some attention, but isn't receiving any right now.
> Any takers?
More recent versions of pstat.py and stats.py (at least more recent than
the def's that were quoted) can be found on my web site
http://www.nmr.mgh.harvard.edu/nsg/strang/python.html
but sadly those are modified only very slowly and irregularly at best.
"Takers" are welcome. :-)
Gary
--------------------------------------------------------------
Gary Strangman, PhD | Neural Systems Group
Office: 617-724-0662 | Massachusetts General Hospital
Fax: 617-726-4078 | 13th Street, Bldg 149, Room 9103
strang at nmr.mgh.harvard.edu | Charlestown, MA 02129
http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/