[SciPy-Dev] GSoC Draft Proposal: Rewrite and improve cluster package in Cython

Richard Tsai richard9404 at gmail.com
Tue Apr 22 02:59:04 EDT 2014


2014-03-21 22:18 GMT+08:00 Richard Tsai <richard9404 at gmail.com>:

> Hi all,
>
> I've posted my proposal to Melange, but there are still some potential
> features for the cluster package that I want to discuss here.
>
> The first one is about the stopping criterion of kmeans/kmeans2. These two
> functions currently use the average distance from observations to their
> corresponding centroids. But a more accurate exit condition would be the
> average *squared* distance, since that is the quantity k-means minimizes.
> Besides, the average distance the centroids move and the change in the
> results of vq between iterations would both be better criteria than the
> current one.
> Second, finding convex hulls of hierarchical clustering seems interesting
> but I'm not sure if there's a demand for it.
> The third one is the gap statistic for automatically determining k in
> kmeans. David suggested that it should be scikit-learn territory, so I plan
> to put it at the end.
>
> I'm not sure whether these features are suitable for integration into
> cluster, and Ralf suspects there is some overlap with scikit-learn, so I'm
> posting them here for discussion at his suggestion. I've also made my
> proposal public:
> http://www.google-melange.com/gsoc/proposal/public/google/gsoc2014/richardtsai/5629499534213120
> Comments/suggestions are welcome.
>
> Regards,
> Richard
>

Hi all,

I've received emails from GSoC saying that my proposal has been accepted.
Thanks to those who have helped me with my application!

I'll submit the required materials soon, then make a more detailed plan and
prepare for coding. If you have any thoughts about my project, please
discuss them with me!
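
As a rough illustration of the stopping criteria mentioned in the quoted
message above, here is a minimal sketch of one k-means iteration that reports
all four candidate measures. It is plain Python so that it runs without
dependencies; it is not the current scipy.cluster.vq code, and the helper
names (assign, update, criteria) are made up for this example.

```python
import math


def assign(obs, centroids):
    """Return (labels, dists): index of and distance to the nearest centroid."""
    labels, dists = [], []
    for x in obs:
        d = [math.dist(x, c) for c in centroids]
        k = min(range(len(centroids)), key=d.__getitem__)
        labels.append(k)
        dists.append(d[k])
    return labels, dists


def update(obs, labels, centroids):
    """Move each centroid to the mean of its members; leave empty clusters alone."""
    new = []
    for k, c in enumerate(centroids):
        members = [x for x, lab in zip(obs, labels) if lab == k]
        if members:
            new.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            new.append(tuple(c))
    return new


def criteria(obs, old_centroids):
    """Run one iteration and report four candidate convergence measures."""
    labels, dists = assign(obs, old_centroids)
    new_centroids = update(obs, labels, old_centroids)
    new_labels, _ = assign(obs, new_centroids)
    n = len(obs)
    shifts = [math.dist(a, b) for a, b in zip(old_centroids, new_centroids)]
    measures = {
        # average distance to the assigned centroid
        "mean_dist": sum(dists) / n,
        # average squared distance to the assigned centroid
        "mean_sq_dist": sum(d * d for d in dists) / n,
        # average distance the centroids moved in this iteration
        "mean_centroid_shift": sum(shifts) / len(shifts),
        # number of observations whose assignment changed after the update
        "labels_changed": sum(a != b for a, b in zip(labels, new_labels)),
    }
    return measures, new_centroids
```

A loop would call criteria() repeatedly and stop once the chosen measure (or
its change between iterations) falls below a threshold. Which measure and
which threshold to use is exactly the open question.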

Richard