[SciPy-Dev] Faster implementation of cluster.hierarchy
Conrad Lee
conradlee at gmail.com
Wed Oct 12 07:12:18 EDT 2011
A mathematician at Stanford named Daniel Müllner recently came up with a
package that implements the hierarchical clustering methods found in
scipy.cluster.hierarchy. His implementation is in C++, but includes a
Python API that uses the same interface as scipy.cluster.hierarchy.
Müllner has posted benchmarks as well as algorithmic explanations of why his
implementation is faster in a paper on arXiv<http://arxiv.org/abs/1109.2378>.
He also has a webpage that describes the package
here<http://math.stanford.edu/~muellner/fastcluster.html>.
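To illustrate what "same interface" would mean in practice, here is a
minimal sketch. It assumes the package is importable as `fastcluster` and
that its `linkage` function accepts the same arguments as
`scipy.cluster.hierarchy.linkage`; the random data and the choice of four
clusters are made up for the example. If fastcluster is not installed, the
sketch falls back to scipy.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

try:
    # Assumption: the package is installed under the module name
    # "fastcluster" and mirrors scipy's linkage() signature.
    import fastcluster
    linkage = fastcluster.linkage
except ImportError:
    # Fallback so the sketch still runs with scipy alone.
    linkage = sch.linkage

# Illustrative data only: 20 random points in 3 dimensions.
rng = np.random.RandomState(0)
X = rng.rand(20, 3)

# Build the linkage matrix; the call is identical for either backend.
Z = linkage(X, method="average", metric="euclidean")

# The result is consumed by the usual scipy helpers, e.g. cutting the
# tree into at most 4 flat clusters.
labels = sch.fcluster(Z, t=4, criterion="maxclust")
```

Only the import changes between the two backends, so existing code that
calls `linkage` and then passes the result to `fcluster` or `dendrogram`
would not need to be rewritten.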
Because the results of the benchmarks look good, I am interested in getting
the scikit-learn package to use this implementation for the hierarchical
clustering provided by that package. Rather than integrate the code in
scikit-learn, it seems more appropriate to integrate it upstream in
scipy.cluster.hierarchy. Is there anyone who is interested in this
integration? I am inexperienced with integrating C++ code and Python code,
and also with how things work in the scipy project, so I'm not sure how to
proceed.
Note: Although Müllner's code is currently under a GPL license, he has
stated to me by e-mail that he would be willing to put it under the BSD-2
license if somebody put in the time to integrate it into scipy.
Best regards,
Conrad Lee