[SciPy-user] Looking for a way to cluster data

Damian Eads eads at soe.ucsc.edu
Tue May 12 21:38:08 EDT 2009


Hi Gary,

On Sat, Apr 25, 2009 at 8:18 PM, Gary Ruben <gruben at bigpond.net.au> wrote:
> Hi all,
>
> I'm looking for some advice on how to order data points so that I can
> visualise them. I've been looking at scipy.cluster for this purpose but
> I'm not sure whether it is suitable so I thought I'd see whether anyone
> had a simpler suggestion for how to order the coordinates.

With the dendrogram function, the left-to-right order in which nodes
appear can be changed with the distance_sort or count_sort keyword
arguments.
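
As a rough sketch of what I mean (the random test points and the
variable names here are made up for illustration, and I pass
no_plot=True so only the leaf ordering is computed, not a figure):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Hypothetical stand-in data: 10 random points in 3D.
rng = np.random.RandomState(0)
points = rng.rand(10, 3)

# Single-linkage hierarchical clustering of the points.
Z = linkage(points, method='single')

# 'leaves' lists the original point indices in left-to-right order.
default_order = dendrogram(Z, no_plot=True)['leaves']

# At each node, place the child with fewer original points first.
by_count = dendrogram(Z, no_plot=True, count_sort='ascending')['leaves']

# At each node, place the child with the larger merge distance first.
by_distance = dendrogram(Z, no_plot=True, distance_sort='descending')['leaves']

print(default_order)
print(by_count)
print(by_distance)
```

Each list is a permutation of the point indices 0..N-1; only the
order changes between the three calls.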

> I have a binary 3D array containing 1's that form a shape in a 3D volume
> against a background of 0's - they form a skeleton of a connected,
> branched structure. Furthermore, the points are all 26-connected to each
> other, i.e. there are no gaps in the skeleton. The longest chains may be
> 1000's of points long.
> It would be nice to visualise these using the mayavi mlab plot3d
> function, which draws tubes and which requires ordered coordinates as
> input, so I need to get ordered coordinate lists that traverse the
> points along the branches of the skeleton. It would also be nice to
> preferentially cluster long chains since then I can cull very short
> chains from the visualisation.
>
> scipy.cluster seems to be able to cluster the points but I'm not sure
> how to get the x,y,z coordinates of the original points out of its
> linkage data. This may not be possible.

Each row of the linkage matrix is a cluster, and the first two
columns of the row are the indices of its left and right child
nodes, respectively. If an index is less than the number of points
clustered (i < N), it's a leaf node (original point/singleton
cluster), so it indexes directly into the array of points you passed
in; otherwise (i >= N) it's a non-singleton cluster, formed by row
i - N of the linkage matrix. Note that there are always N-1
non-singleton clusters, so the linkage matrix will always have N-1
rows.
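
Here is a minimal sketch of walking the linkage matrix back to the
x,y,z coordinates. The tiny test volume and the helper name
cluster_members are my own invention for illustration, and I'm
assuming the coordinates come from np.argwhere on the binary array:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

def cluster_members(Z, cluster_id):
    """Return sorted original-point indices under cluster `cluster_id`.

    Hypothetical helper: indices < N are leaves (original points);
    an index i >= N refers to the cluster formed by row i - N of Z.
    """
    n = Z.shape[0] + 1          # N points give N-1 rows
    stack = [int(cluster_id)]
    members = []
    while stack:
        i = stack.pop()
        if i < n:
            members.append(i)   # leaf: an original point
        else:
            left, right = Z[i - n, 0], Z[i - n, 1]
            stack.append(int(left))
            stack.append(int(right))
    return sorted(members)

# Made-up test volume: a straight 5-voxel line in a 5x5x5 array.
volume = np.zeros((5, 5, 5), dtype=int)
volume[2, 2, :] = 1

# One (x, y, z) row per nonzero voxel; row k is "point k" to linkage.
coords = np.argwhere(volume == 1)
n = len(coords)

Z = linkage(coords, method='single')

# The last row of Z is the root cluster, with index 2N - 2.
root = 2 * n - 2
root_members = cluster_members(Z, root)

# Leaf indices map straight back to rows of the coordinate array.
xyz = coords[root_members]
print(xyz)
```

Note this only recovers which points belong to a cluster; it does not
by itself order them along a branch of the skeleton.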


> Maybe the scipy.spatial module
> is a better match to my problem.

I haven't had the chance to read this part of the discussion but I
hope my answer to your question helps.

Cheers,

Damian

-----------------------------------------------------
Damian Eads                             Ph.D. Candidate
Jack Baskin School of Engineering, UCSC        E2-489
1156 High Street                 Machine Learning Lab
Santa Cruz, CA 95064    http://www.soe.ucsc.edu/~eads


