[SciPy-User] Hierarchical Clustering

Kelson Zawack kelson924 at aol.com
Sat Jan 19 04:20:56 EST 2013


I have a matrix of n observations, each of length m, that I would like 
to cluster.  The documentation for scipy.cluster.hierarchy.linkage says 
it takes 'A condensed or redundant distance matrix... Alternatively, a 
collection of m observation vectors in n dimensions may be passed as an 
m by n array.'  I tried passing in three different inputs: the condensed 
matrix returned from scipy.spatial.distance.pdist, the square matrix 
returned from calling scipy.spatial.distance.squareform on that condensed 
matrix, and the raw data matrix.  In each case I used single linkage as 
the method and euclidean as the distance measure, and I got 3 different 
answers.  I then tried it with toy data, using observations of
[[0,0,0],
[1,1,1],
[5,5,5],
[6,6,6]]

and they all give the same answer.  This makes me very nervous.  What is 
the correct way to call the function, and how can I be sure of this?
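
For reference, the three calls described above can be sketched as follows. 
This is a minimal sketch, not a definitive answer: the random data, the 
seed, and the variable names are illustrative assumptions, and the comments 
assume the behaviour of recent SciPy releases, where any 2-D input to 
linkage is treated as observation vectors rather than as a distance matrix.

```python
import warnings

import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Illustrative random data: 10 observations (rows) with 3 features (columns).
rng = np.random.default_rng(0)
X = rng.random((10, 3))

# 1. Condensed distance matrix from pdist (1-D, length n*(n-1)/2).
condensed = pdist(X, metric="euclidean")
Z_condensed = linkage(condensed, method="single")

# 2. Raw observations; linkage computes the pairwise distances itself.
Z_raw = linkage(X, method="single", metric="euclidean")

# 3. Square (redundant) matrix from squareform. Because it is 2-D, it is
#    treated as 10 observations in 10 dimensions, not as precomputed distances.
square = squareform(condensed)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # recent SciPy may warn about this input
    Z_square = linkage(square, method="single", metric="euclidean")

print("condensed vs raw identical:   ", np.allclose(Z_condensed, Z_raw))
print("condensed vs square identical:", np.allclose(Z_condensed, Z_square))
```

Comparing the linkage matrices with np.allclose, rather than comparing 
dendrograms by eye, is one way to check whether two call styles really agree.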

Thanks

