Bounding box on clusters in a 2D list
bearophileHUGS at lycos.com
bearophileHUGS at lycos.com
Sat Apr 23 21:15:34 EDT 2005
I hope this is what you need, sometimes understanding the question is
one of the hardest parts :-)
If you can use a graph data structure, you can create a graph, and then
you can find the lenght of all its connected components (indentations
reduced to 2 spaces):
. def mat2graph(g, m, good=None, w=1):
. "Function docs..."
. g.clear()
. if m and isinstance(m, (list, tuple)):
. if good != None:
. m = [[good(e) for e in row] for row in m]
. maxy = len(m)-1
. maxx = len(m[0])-1
. add = g.addBiArc
. for y, row in enumerate(m):
. for x, e in enumerate(row):
. if e:
. g.addNode((x,y)) # Necessary for isolated nodes.
. if x>0 and m[y][x-1]: add( (x,y), (x-1,y), w=w)
. if x<maxx and m[y][x+1]: add( (x,y), (x+1,y), w=w)
. if y>0 and m[y-1][x]: add( (x,y), (x,y-1), w=w)
. if y<maxy and m[y+1][x]: add( (x,y), (x,y+1), w=w)
.
. from graph import Graph
. g = Graph()
. m = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
. [1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
. [1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
. [0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
. mat2graph(g, m)
. print map(len, g.connectedComponents())
Something similar to mat2graph will probably become a method of Graph.
A quite similar program can be used for other python graph libraries.
If you cannot use a graph data structure, of if you need to go faster,
you can use a flood fill algorithm tuned for this problem:
. def connected(m, foreground=1, background=0):
. def floodFill4(m, r,c, newCol, oldCol):
. """Simplest 4-connected flood fill algorithm, there are much
. faster non-recursive ones. I've modified this from:
. http://www.student.kuleuven.ac.be/~m0216922/CG/floodfill.html"""
. if c>=0 and c<maxc and r>=0 and r<maxr and m[r][c]==oldCol:
. m[r][c] = newCol
. floodFill4(m, r+1, c, newCol, oldCol)
. floodFill4(m, r-1, c, newCol, oldCol)
. floodFill4(m, r, c+1, newCol, oldCol)
. floodFill4(m, r, c-1, newCol, oldCol)
.
. maxr = len(m)
. maxc = len(m[0])
. newCol = 2
. oldCol = foreground
. for r in xrange(maxr):
. for c in xrange(maxc):
. if m[r][c] == oldCol:
. floodFill4(m, r,c, newCol=newCol, oldCol=oldCol)
. newCol += 1
. # Frequences, this can be become a standard python command
. freq = {}
. for row in m:
. for e in row:
. if e in freq:
. freq[e] += 1
. else:
. freq[e] = 1
. del freq[background]
. return freq.values()
.
. m = [[1,1,1,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0],
. [1,1,1,0,1,1,0,0,0,0,0,1,1,1,1,0,0,0],
. [1,1,1,0,1,1,1,0,1,1,0,1,1,1,0,0,0,0],
. [0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0]]
. print connected(m)
There are much faster ways to do flood fills:
http://www.codeproject.com/gdi/QuickFill.asp
I think Python PIL doesn't support flood fills, and I think doing it
with Numarray is slower, and it's useful only to reduce the used
memory.
This connected program is slow and raw, and it uses recursivity a lot,
so it can crash the interpreter. You can use a data structure to
implement the stack yourself, to speed this program up. But for small
matrices I think this connected is enough ^_^
Bear hugs,
Bearophile
More information about the Python-list
mailing list