[SciPy-user] Scipy for Cluster Detection

Thu Dec 27 01:30:01 EST 2007

On Dec 26, 2007 6:28 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> Two particles are considered to be bounded if their distance falls below
> a certain threshold d.

Assuming d is the same for all pairs of contacts you should probably
use spatial hashing.  I had a similar problem in the simulation of
granular materials and found spatial hashing to be the fastest
approach.
By spatial hashing I mean that to each point (x,y,z) you associate the
integer tuple ( floor( x/d ), (floor( y/d ), (floor( z/d ) ).  When
looking for all the neighbors of a point you simply look at all 27
neighboring tuples.  You could do this with Python dictionaries, but
you'll probably prefer something like the STL hash_map or Google's
sparsehash library.

If d varies then a k-D tree is probably your best best.

> A cluster is nothing else than a group of particles all directly or
> indirectly bound together.
> Having said so, I now need an efficient algorithm to look for clusters,
> since at the very least I need to be able to count them to study how the
> number of clusters evolve in time.
> Is there anything suitable for this already implemented in SciPy? I
> wonder if people in Astronomy have similar problems if they need e.g. to
> detect/study clusters of stars.
> Simply I would like not to re-invent the wheel and it goes without
> saying that any suggestions here are really appreciated.
> Many thanks and a nice Xmas break to everybody on this list.

I don't think SciPy has the functionality you need.  The cluster
module has k-means stuff, but I don't think that will help you.

-- 
Nathan Bell wnbell at gmail.com