[SciPy-user] Looking for a way to cluster data

Zachary Pincus zachary.pincus at yale.edu
Sun Apr 26 19:51:55 EDT 2009


Hi Gary,

> I'm looking for some advice on how to order data points so that I can
> visualise them. I've been looking at scipy.cluster for this purpose but
> I'm not sure whether it is suitable so I thought I'd see whether anyone
> had suggestions for a simpler suggestion of how to order the coordinates.
>
> I have a binary 3D array containing 1's that form a shape in a 3D volume
> against a background of 0's - they form a skeleton of a connected,
> branched structure. Furthermore, the points are all 26-connected to each
> other, i.e. there are no gaps in the skeleton. The longest chains may be
> 1000's of points long.
> It would be nice to visualise these using the mayavi mlab plot3d
> function, which draws tubes and which requires ordered coordinates as
> input, so I need to get ordered coordinate lists that traverse the
> points along the branches of the skeleton. It would also be nice to
> preferentially cluster long chains since then I can cull very short
> chains from the visualisation.

It sounds like what you want from the clustering is to get groups of  
pixels that form straight-ish lines (and could thus be visualized with  
3D rods). Is this correct? Most clustering algorithms are likely to  
just give you groups of pixels that are nearby spatially -- which is  
probably not exactly what you want, since if you (say) have a long rod
structure, you'd want all the pixels in that rod grouped together, and
not grouped with separate rods that cross nearby in space but aren't
physically connected. So if you do want to cluster the pixels, you'll
need to use the geodesic distance between two pixels (the path length
measured along the skeleton), not the Euclidean distance. But that still
wouldn't give you sets of rods... more like rods and blobs at the
junctions.
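
To make the geodesic-distance idea concrete, here is a minimal sketch, not a
tested recipe for your data. It assumes the skeleton is a binary 3D NumPy
array, treats every pair of 26-connected voxels as a graph edge weighted by
its Euclidean step length, and runs Dijkstra from one source voxel. The
function name geodesic_distances and the 1.8 neighbour radius are my own
choices for illustration.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.csgraph import dijkstra
from scipy.spatial import cKDTree


def geodesic_distances(volume, source):
    """Path length along the skeleton from `source` to every skeleton voxel.

    volume : binary 3D array, nonzero where the skeleton is.
    source : (i, j, k) index of a skeleton voxel.
    Returns (coords, dist); dist[n] is the geodesic distance to coords[n],
    or inf if that voxel is not connected to the source.
    """
    coords = np.argwhere(volume)
    tree = cKDTree(coords)
    # 26-connected neighbours are at most sqrt(3) ~ 1.73 apart; the next
    # possible voxel spacing is 2, so a radius of 1.8 selects exactly them.
    pairs = tree.query_pairs(1.8, output_type="ndarray")
    steps = np.linalg.norm(coords[pairs[:, 0]] - coords[pairs[:, 1]], axis=1)
    n = len(coords)
    graph = sparse.coo_matrix(
        (steps, (pairs[:, 0], pairs[:, 1])), shape=(n, n)
    ).tocsr()
    # Locate the row of `coords` that matches the requested source voxel.
    source_index = np.flatnonzero((coords == np.asarray(source)).all(axis=1))[0]
    # directed=False lets each stored edge be traversed in both directions.
    return coords, dijkstra(graph, directed=False, indices=source_index)
```

On a U-shaped skeleton the two tips can be close in Euclidean terms yet far
apart along the skeleton, which is the distinction that matters for
clustering here.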

Another option would be to try to detect junction points on the  
skeleton (I think this is a common operation -- sometimes in 2D people  
brute-force it by looking for all possible patterns of pixels  
indicative of branching, but there's probably a nicer way, especially  
for 3D). Once the junctions are known, a simple flood-fill along the
skeleton gives you sets of pixels that lie between each junction.
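
One possible way to sketch the junction-finding and flood-fill steps with
scipy.ndimage is below. This is a neighbour-counting heuristic that I am
substituting for the pattern-matching approach, under the assumption that
the skeleton is a binary 3D array: a voxel with more than two 26-connected
neighbours is called a junction, and one with exactly one neighbour is an
endpoint. The heuristic tends to mark a small blob of voxels around each
true junction rather than a single voxel. The function name
split_at_junctions is hypothetical.

```python
import numpy as np
from scipy import ndimage


def split_at_junctions(skeleton):
    """Label the branches of a 26-connected binary 3D skeleton.

    Returns (junctions, endpoints, branches, n_branches) where junctions and
    endpoints are boolean masks and branches is an integer label array
    (0 = background or junction, 1..n_branches = individual branches).
    """
    skel = np.asarray(skeleton, dtype=bool)

    # Count the 26-connected skeleton neighbours of every voxel.
    kernel = np.ones((3, 3, 3), dtype=int)
    kernel[1, 1, 1] = 0
    n_neighbours = ndimage.convolve(
        skel.astype(int), kernel, mode="constant", cval=0
    )

    junctions = skel & (n_neighbours > 2)
    endpoints = skel & (n_neighbours == 1)

    # With the junction voxels removed, connected-component labelling plays
    # the role of the flood fill: each component is one branch.
    branches, n_branches = ndimage.label(
        skel & ~junctions, structure=np.ones((3, 3, 3))
    )
    return junctions, endpoints, branches, n_branches
```

The coordinates of branch number b are then np.argwhere(branches == b);
they come out in array order, not in order along the branch, so they would
still need to be walked from one end to the other before being handed to
plot3d.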

Having such a connectivity graph (junctions as nodes, the branches
between them as edges) is a useful thing -- lots of neat
stuff one can calculate from that. In fact, I think you'd need to do  
this to calculate geodesic distances anyway... Anyhow, you could then  
try fitting the points from each branch to a poly-line (using some  
information criterion for determining how many lines to use), to  
simplify the representation down to rods for plotting.
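
As a rough illustration of the poly-line step, here is a sketch that uses
the Ramer-Douglas-Peucker algorithm with a fixed distance tolerance. Note
that this is a simpler stand-in for the information-criterion approach
mentioned above, not an implementation of it. It assumes the points of one
branch have already been ordered along the branch; the function name
simplify_polyline and the tolerance parameter are my own.

```python
import numpy as np


def simplify_polyline(points, tolerance):
    """Ramer-Douglas-Peucker simplification of an ordered (N, 3) point list.

    Keeps the first and last points, plus any point lying farther than
    `tolerance` from the straight line joining the kept points around it.
    Written iteratively so long branches do not hit the recursion limit.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    if n < 3:
        return points.copy()

    keep = np.zeros(n, dtype=bool)
    keep[0] = keep[-1] = True
    stack = [(0, n - 1)]
    while stack:
        first, last = stack.pop()
        if last - first < 2:
            continue  # no interior points left to test
        start = points[first]
        chord = points[last] - start
        rel = points[first + 1:last] - start
        chord_len = np.linalg.norm(chord)
        if chord_len == 0:
            # Degenerate chord: fall back to distance from the start point.
            dists = np.linalg.norm(rel, axis=1)
        else:
            # Perpendicular distance to the line through the two ends.
            dists = np.linalg.norm(np.cross(rel, chord), axis=1) / chord_len
        worst = int(np.argmax(dists))
        if dists[worst] > tolerance:
            split = first + 1 + worst
            keep[split] = True
            stack.append((first, split))
            stack.append((split, last))
    return points[keep]
```

Each consecutive pair of returned points is then one rod. A tolerance of
around one voxel is probably a sensible starting point, but that is a guess.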

But perhaps it's worth figuring out if there's another visualization  
method you could use, before going to all this effort... Would volume  
rendering work? Perhaps you could go as far as junction-point-finding  
and branch-labeling. Just doing that would let you exclude short  
branches (or short terminal branches) and then you could simply
volume-render the rest and not fiddle with line-fitting.
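
Excluding short branches before rendering could look something like the
sketch below. It assumes you already have a boolean mask of branch voxels
with the junction voxels removed, and it uses the voxel count of each
26-connected component as a stand-in for branch length. The function name
drop_short_branches and the min_voxels parameter are hypothetical.

```python
import numpy as np
from scipy import ndimage


def drop_short_branches(branch_mask, min_voxels):
    """Return a boolean mask keeping only branches with >= min_voxels voxels.

    branch_mask : binary 3D array of skeleton voxels with junctions removed,
                  so that each 26-connected component is one branch.
    """
    labels, n_branches = ndimage.label(
        branch_mask, structure=np.ones((3, 3, 3))
    )
    # sizes[k] is the number of voxels carrying label k (label 0 = background).
    sizes = np.bincount(labels.ravel(), minlength=n_branches + 1)
    keep = sizes >= min_voxels
    keep[0] = False  # never keep the background
    # Look up the keep/drop decision for every voxel through its label.
    return keep[labels]
```

The junction voxels would need to be OR-ed back into the result before
volume rendering, otherwise the surviving branches will appear disconnected.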

Zach


On Apr 25, 2009, at 11:18 PM, Gary Ruben wrote:

> Hi all,
>
> I'm looking for some advice on how to order data points so that I can
> visualise them. I've been looking at scipy.cluster for this purpose but
> I'm not sure whether it is suitable so I thought I'd see whether anyone
> had suggestions for a simpler suggestion of how to order the coordinates.
>
> I have a binary 3D array containing 1's that form a shape in a 3D volume
> against a background of 0's - they form a skeleton of a connected,
> branched structure. Furthermore, the points are all 26-connected to each
> other, i.e. there are no gaps in the skeleton. The longest chains may be
> 1000's of points long.
> It would be nice to visualise these using the mayavi mlab plot3d
> function, which draws tubes and which requires ordered coordinates as
> input, so I need to get ordered coordinate lists that traverse the
> points along the branches of the skeleton. It would also be nice to
> preferentially cluster long chains since then I can cull very short
> chains from the visualisation.
>
> scipy.cluster seems to be able to cluster the points but I'm not sure
> how to get the x,y,z coordinates of the original points out of its
> linkage data. This may not be possible. Maybe the scipy.spatial module
> is a better match to my problem.
>
> Any suggestions?
>
> Gary



