[SciPy-user] (OT) Grouping of data points.

Stephen Walton stephen.walton at csun.edu
Wed Nov 30 16:10:03 EST 2005


Bill Dandreta wrote:

>This is probably not the best place to ask this question but maybe 
>someone can point me in the right direction.
>
>I have a set of 2-D data points. Is there an algorithm for selecting 
>subsets of data points that have nearly equal spacing in one dimension?
>  
>
You need graph theory.  I just did a similar project (in MATLAB, but 
anyway) having to do with sunspots.  I needed to put sunspots into 
groups by finding sunspots which were "close" in latitude and 
longitude.  The eventual algorithm looked like this:

1.  For each pair of points i and j, compute the distance between them.  
(This part of the algorithm is O(n**2), unfortunately.)  Set A[i,j] and 
A[j,i] to 1 if the distance is less than some threshold, and 0 
otherwise.  A is then an adjacency matrix, in MATLAB-ese: A[i,j] is 
nonzero if vertex i of a graph is connected to vertex j by an edge.  
"Distance" here can have any definition, of course;  it doesn't have to 
be Cartesian.

2. Find the connected components of the resulting graph, that is, the 
sets of vertices that are linked to one another by some path of edges.  
The algorithm for doing this is at 
http://www.ececs.uc.edu/~gpurdy/lec20.html, in section 20.2.
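
The two steps above can be sketched in plain Python roughly as follows.  
This is only an illustration, not the MATLAB code from the sunspot 
project: the function names, the sample points and the threshold of 1.0 
are all made up, and the component search is an ordinary breadth-first 
search, which may or may not match the algorithm on the linked page.  
The dist argument is an assumed hook for plugging in any distance 
definition.

```python
import math
from collections import deque

def adjacency_matrix(points, threshold, dist=math.dist):
    """Step 1: A[i][j] = A[j][i] = 1 when points i and j are closer than threshold."""
    n = len(points)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):          # each pair once: O(n**2) overall
            if dist(points[i], points[j]) < threshold:
                A[i][j] = A[j][i] = 1
    return A

def connected_components(A):
    """Step 2: collect vertices reachable from one another (breadth-first search)."""
    n = len(A)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        group = []
        while queue:
            i = queue.popleft()
            group.append(i)
            for j in range(n):
                if A[i][j] and not seen[j]:
                    seen[j] = True
                    queue.append(j)
        groups.append(sorted(group))
    return groups

# Hypothetical sample data: a chain of three points, a pair, and a loner.
points = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (5.0, 5.0), (5.3, 5.1), (9.0, 0.0)]
groups = connected_components(adjacency_matrix(points, threshold=1.0))
print(groups)   # [[0, 1, 2], [3, 4], [5]]
```

Points 0 and 2 are slightly more than 1.0 apart, yet they land in the 
same group because each is close to point 1; that chaining is what the 
component search adds over a plain distance test.  To group on spacing 
in one dimension only, something like 
dist=lambda p, q: abs(p[0] - q[0]) could be passed instead.  For large 
data sets, scipy.sparse.csgraph.connected_components does the second 
step on a sparse matrix.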

I also recommend the graph theory tutorial at 
http://www.utm.edu/departments/math/graph.

Stephen Walton



