[SciPy-User] kmeans2 question/issue

James Abel j at abel.co
Sun Aug 5 19:09:06 EDT 2012


Hi,
I'm trying to use scipy.cluster.vq.kmeans2() but I'm getting inconsistent
output.  With a simple test input that should have 3 clusters, I get
good results most of the time, but every so often it returns the
wrong clustering.  If anyone could point out what I'm doing wrong, I'd
appreciate it!
Code and sample output below.
Thanks!
James

Code:

import sys
import scipy
from scipy.cluster.vq import kmeans2, whiten

print sys.version
vals = scipy.array((0.0,0.1,0.5,0.6,1.0,1.1))
print vals
white_vals = whiten(vals)
print white_vals.shape, white_vals

# try it several times to see if we get similar answers
for count in range(5):
    res, idx = kmeans2(white_vals, 3) # changing iter doesn't seem to matter
    print res, idx

Output:

2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
[ 0.   0.1  0.5  0.6  1.   1.1]
(6,) [ 0.          0.24313227  1.21566135  1.45879362  2.4313227   2.67445496]
[ 0.12156613  2.55288883  1.33722748] [0 0 2 2 1 1]
[ 0.12156613  2.55288883  1.33722748] [0 0 2 2 1 1]
[ 1.33722748  2.55288883  0.12156613] [2 2 0 0 1 1]
[ 2.18819043  0.48626454 -0.97292963] [1 1 1 0 0 0] <-- unexpected result
[ 0.12156613  2.55288883  1.33722748] [0 0 2 2 1 1]
C:\PYTHON27\lib\site-packages\scipy\cluster\vq.py:588: UserWarning: One of the clusters is empty. Re-run kmean with a different initialization.
  warnings.warn("One of the clusters is empty. "
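
The warning suggests re-running with a different initialization. Below is a minimal sketch of that idea, assuming Python 3 and a reasonably recent SciPy rather than the Python 2.7 setup used above. The helper name best_of_n_kmeans2 and its restarts parameter are hypothetical, not part of SciPy. It runs kmeans2 several times, skips any run that leaves a cluster empty (missing='raise' turns the warning into a ClusterError), and keeps the run with the smallest sum of squared distances. The data is reshaped to a 6x1 column so each value is treated as one 1-D observation.

```python
import numpy as np
from scipy.cluster.vq import ClusterError, kmeans2, whiten


def best_of_n_kmeans2(obs, k, restarts=20):
    """Run kmeans2 `restarts` times and keep the lowest-distortion result.

    Hypothetical helper for illustration; not part of SciPy.
    """
    best = None
    for _ in range(restarts):
        try:
            # missing='raise' raises ClusterError instead of warning
            # when a cluster ends up with no members.
            centroids, labels = kmeans2(obs, k, minit='random', missing='raise')
        except ClusterError:
            continue  # bad initialization: try again
        # Sum of squared distances from each observation to its centroid.
        distortion = np.sum((obs - centroids[labels]) ** 2)
        if best is None or distortion < best[0]:
            best = (distortion, centroids, labels)
    if best is None:
        raise RuntimeError("every restart produced an empty cluster")
    return best[1], best[2]


# Same six values as in the post, as a column of 1-D observations.
vals = np.array([0.0, 0.1, 0.5, 0.6, 1.0, 1.1]).reshape(-1, 1)
white_vals = whiten(vals)
centroids, labels = best_of_n_kmeans2(white_vals, 3)
print(centroids.ravel(), labels)
```

The label numbering can still differ from run to run (as in the first and third result lines above), since kmeans2 does not order the centroids; only the grouping of the points should be compared. Restarting does not guarantee the best clustering, it only makes a poor initialization much less likely to win.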



