[SciPy-dev] Another GSoC idea

David Cournapeau cournape at gmail.com
Sat Mar 21 01:50:19 EDT 2009


Hi David,

On Sat, Mar 21, 2009 at 2:01 PM, David Warde-Farley <dwf at cs.toronto.edu> wrote:
> I've been fiddling with ideas for GSoC related to SciPy and I wanted
> to run this by people on the list.
>
> David C. and others are often complaining that C and Fortran code is
> an order of magnitude harder to maintain than Python/Cython code.
> Thus, would there be interest in a proposal that included rewriting
> Damian Eads' excellent scipy.spatial.distance and scipy.cluster.vq in
> Cython?

For scipy.cluster.vq, I already have something in Cython. I have not
put it into scipy because the code is barely "research quality"
(whatever that means :) ), but I think it would be less work to
improve it than to start from scratch.

>
> I've already been scoping this out as I had wanted to add output
> matrix functionality to scipy.spatial.pdist and scipy.spatial.cdist,
> which would make scenarios where distances are recomputed frequently
> (as in some sort of tracking application) much less memory-intensive.
> kmeans

I think this would be a great addition. You are of course free to
choose what you work on, but I like the idea of a basic set of
recursive implementations of basic statistics and clustering
algorithms. I also have my own implementation of online EM for
online estimation of GMM, based on the following preprint:

http://www.citeulike.org/user/stibor/article/3245946

But again, "research quality" code. Does this idea of focusing your
proposal on the recursive side of things sound appealing?
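
As an illustration of what "recursive" means for basic statistics, here
is a minimal sketch of a running mean and variance updated one sample at
a time (Welford's update). The class name RunningStats and its methods
are hypothetical, chosen for this example only; this is not code from
scipy or from the Cython implementation mentioned above.

```python
class RunningStats:
    """Recursive (online) mean and sample variance, Welford's update.

    Hypothetical helper for illustration; not part of scipy.
    """

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new sample into the estimates in O(1) time and memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the mean both before and after the update.
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Unbiased sample variance; 0.0 until two samples have been seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)
```

The estimates are available after every update without storing past
samples, which is the property that matters for the streaming or
tracking scenarios discussed in this thread.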

cheers,

David


