[SciPy-Dev] SciPy Goal

Wed Jan 4 22:29:59 EST 2012

On Jan 4, 2012, at 8:22 PM, Fernando Perez wrote:

> Hi all,
> 
> On Wed, Jan 4, 2012 at 5:43 PM, Travis Oliphant <travis at continuum.io> wrote:
>> What do others think is missing?  Off the top of my head: basic wavelets
>> (DWT primarily) and more complete interpolation strategies (I'd like to
>> finish the basic interpolation approaches I started a while ago).
>> Originally, I used GAMS as an "overview" of the kinds of things needed in
>> SciPy.  Are there other relevant taxonomies these days?
> 
> Well, probably not something that fits these ideas for scipy
> one-to-one, but the Berkeley 'thirteen dwarfs' list from the 'View
> from Berkeley' paper on parallel computing is not a bad starting
> point. In summary, they are:
> 
>    Dense Linear Algebra
>    Sparse Linear Algebra [1]
>    Spectral Methods
>    N-Body Methods
>    Structured Grids
>    Unstructured Grids
>    MapReduce
>    Combinational Logic
>    Graph Traversal
>    Dynamic Programming
>    Backtrack and Branch-and-Bound
>    Graphical Models
>    Finite State Machines

This is a nice list, thanks!

> 
> Descriptions of each can be found here:
> http://view.eecs.berkeley.edu/wiki/Dwarf_Mine and the full study is
> here:
> 
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html
> 
> That list is biased towards the classes of codes used in
> supercomputing environments, and some of the topics are probably
> beyond the scope of scipy (say structured/unstructured grids, at least
> for now).
> 
> But it can be a decent guiding outline to reason about what are the
> 'big areas' of scientific computing, so that scipy at least provides
> building blocks that would be useful in these directions.
> 

Thanks for the links.

> One area that hasn't been directly mentioned too much is the situation
> with statistical tools.  On the one hand, we have the phenomenal work
> of pandas, statsmodels and sklearn, which together are helping turn
> python into a great tool for statistical data analysis (understood in
> a broad sense).  But it would probably be valuable to have enough of a
> statistical base directly in numpy/scipy so that the 'out of the box'
> experience for statistical work is improved.  I know we have
> scipy.stats, but it seems like it needs some love.

It seems like scipy.stats has received quite a bit of attention. There is always more to do, of course, but I'm not sure what specifically you think is missing or needs work. A big question to me is the impact of data frames as the underlying data representation for the algorithms, and the relationship between the data frame and a NumPy array.
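
To make the data-frame question concrete, here is a minimal sketch of one possible relationship, assuming pandas as the data-frame library. The column names and values are made up for illustration, and this is only one way the pieces could fit together, not a proposal for how scipy.stats should work: labeled numeric columns are pulled out as plain NumPy arrays and handed to an array-based routine.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical data: a frame with labeled columns of mixed dtypes.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [3.0, 5.0, 7.0, 9.0, 11.0],
})

# A single numeric column maps onto a homogeneous 1-D NumPy array.
x = df["x"].to_numpy()
y = df["y"].to_numpy()

# Array-based routines such as scipy.stats.pearsonr can then be used as-is.
r, p_value = stats.pearsonr(x, y)

# Selecting only the numeric columns gives a regular 2-D float array;
# the labels ("x", "y") are lost and have to be tracked separately.
numeric = df[["x", "y"]].to_numpy()
print(r, numeric.shape, numeric.dtype)
```

The sketch shows the trade-off: the frame carries labels and mixed column types, while the algorithm sees only homogeneous arrays, so something has to translate between the two.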

-Travis

> 
> Cheers,
> 
> f