[SciPy-User] kmeans

David Cournapeau cournape at gmail.com
Fri Jul 23 13:27:56 EDT 2010


On Sat, Jul 24, 2010 at 2:19 AM, Benjamin Root <ben.root at ou.edu> wrote:

>
> Examining further, I see that SciPy's implementation is fairly simplistic
> and has some issues.  In the given example, the reason why 3 is never
> returned is not because of the use of the distortion metric, but rather
> because the kmeans function never sees the distance for using 3.  As a
> matter of fact, the actual code that does the convergence is in vq and py_vq
> (vector quantization) and it tries to minimize the sum of squared errors.
> kmeans just keeps on retrying the convergence with random guesses to see if
> different convergences occur.

As one of the maintainers of kmeans, I would be the first to admit the
code is basic, for better and for worse. Something more elaborate for
clustering may indeed be useful, as long as the interface stays
simple.

For more complex needs, turn to scikits.learn or more specialized packages.
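
For reference, the behaviour described in the quoted text can be seen with a
small sketch like the one below. The synthetic data, the choice of k=3, and
the variable names are assumptions made for illustration only; they are not
the example from earlier in the thread. kmeans is run with several random
starting codebooks and returns the codebook with the lowest distortion, and
vq then assigns each observation to its nearest centroid.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Synthetic data (an assumption for illustration): three blobs in 2-D.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
obs = np.vstack([c + 0.3 * rng.randn(50, 2) for c in centers])

# kmeans retries from `iter` random initial codebooks and keeps the
# codebook with the lowest distortion. It may return fewer than k
# centroids if a cluster ends up empty.
codebook, distortion = kmeans(obs, 3, iter=20)

# vq assigns every observation to its nearest centroid and also returns
# the distance from each observation to that centroid.
labels, dists = vq(obs, codebook)

print("centroids:\n", codebook)
print("distortion:", distortion)
```

Note that the distortion returned by kmeans is a single number for the
codebook it kept; the distortions of the discarded restarts are not exposed
to the caller.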

cheers,

David


