Best match searching

Paul McGuire ptmcg at austin.rr.com
Mon Dec 29 10:47:18 EST 2003


"Daniel Pryde" <ababo_2002 at hotmail.com> wrote in message
news:mailman.3.1072700087.8134.python-list at python.org...
> Hi everyone.
> I was wondering if anyone might be able to help me out here. I'm currently
> looking to find the quickest way to find a best fit match in a large
array.
> My problem is that I have an array of, say, 600*400, which contains a
value
> at each point, and I need to find the value in that array which is closest
> to the input value.
<snip>
> Any hints or tips would be really helpful. :-)
>
> Daniel
I think inverting your matrix to a dictionary may be more "Pythonic".  The
included snippet constructs an array of values, then creates a dictionary of
which values are located at which positions in the array (also handles
possibility of duplicates in the array).  If you change ROWS and COLS to
your problem values of 400 and 600, the matrix-to-dictionary inversion takes
a bit of time - but the searches are blindingly fast.

-- Paul


# mat2Dict.py
from random import random
import pprint

ROWS = 4
COLS = 6

# create matrix
print "creating data matrix"
matrix = [ [ random() for i in range(COLS) ] for j in range(ROWS) ]

print "create dictionary of values"
valdict = {}
for i,row in enumerate(matrix):
    for j,elem in enumerate(row):
        if valdict.has_key(elem):
            valdict[elem].append( (i,j) )
        else:
            valdict[elem] = [ (i,j) ]

# uncomment this line to view the dictionary of values
# pprint.pprint( valdict )

matvals = valdict.keys()
matvals.sort()
print len(matvals),"unique values in matrix"

def findClosestVal( searchval ):
    if searchval <= matvals[0]:
        return matvals[0]
    if searchval >= matvals[-1]:
        return matvals[-1]

    # do binary search - see
http://www.tbray.org/ongoing/When/200x/2003/03/22/Binary
    hi = len(matvals)
    lo = -1
    while (hi-lo) > 1:
        loc = ( hi + lo ) / 2
        if matvals[loc] > searchval:
            hi = loc
        else:
            lo = loc

    if abs(searchval-matvals[lo]) < abs(searchval-matvals[hi]):
        return matvals[lo]
    else:
        return matvals[hi]

# look up some values
for v in range(10):
    vv = v/10.
    closestVal = findClosestVal( vv )
    print vv, "->", closestVal, "occurring at", valdict[closestVal]









More information about the Python-list mailing list