[Numpy-discussion] "Match" two arrays
eat
e.antero.tammi at gmail.com
Thu Apr 1 06:54:47 EDT 2010
Shailendra <shailendra.vikas <at> gmail.com> writes:
>
> Hi All,
> I want to make a function which should be like this
> <code>
> cordinates1=(x1,y1) # x1 and y1 are x-cord and y-cord of a large
> number of points
> cordinates2=(x2,y2) # similar to condinates1
> indices1,indices2= match_cordinates(cordinates1,cordinates2)
> <code>
> (x1[indices1],y1[indices1]) "matches" (x2[indices2],y2[indices2])
>
> where definition of "match" is such that :
> If A is closest point to B and distance between A and B is less that
> delta than it is a "match".
> If A is closest point to B and distance between A and B is more that
> delta than there is no match.
> Every point has either 1 "match"(closest point) or none
This logic is problematic in general case. See below. You may need to be able
to handle several pairs of 'closest points'!
>
> Also, the size of the cordinates1 and cordinates2 are quite large and
> "outer" should not be used. I can think of only C style code to
> achieve this. Can any one suggest pythonic way of doing this?
>
> Thanks,
> Shailendra
>
This is straightforward implementation as a starting point.
eat
<code>
import numpy as np
def dist(p1, p2):
return np.sqrt(np.sum((p1- p2)** 2, 0))
def cdist(p1, p2, trh):
"""Expects 2d arrays p1 and p2, with combatible first dimesions
and a threshold.
Returns indicies of points close to each other
-ind[:, 0], array of p1 indicies
-ind[:, 1], array of p2 indicies
-ambi, list of list of ambiquous situations (where more
than 1 pair of points are 'equally close')
The indicies are aranged such that
dist(p1[:, ind[k, 0]], p2[:, ind[k, 1]])< trh
is true for all k.
"""
ind= []
ambi= []
for k in range(p2.shape[1]):
d= dist(p1, p2[:, None, k])
i= np.where(d< trh)[0]
if 0< len(i):
m= np.where(d[i]== d[i].min())[0] # problematic
i= i[m].tolist()
ind.append([i[0], k])
if 1< len(m):
ambi.append([ind[-1], i])
return np.array(ind), ambi
if __name__ == '__main__':
n= 10
trh= 2e-1
p1= np.round(np.random.rand(2, n), 1)
p2= np.round(p1+ 1e-1* np.random.randn(2, n), 1)
ind, ambi= cdist(p1, p2, trh)
print 'points close to each other:'
if 0< len(ind):
print 'p1:'
print p1[:, ind[:, 0]], ind[:, 0]
print 'p2:'
print p2[:, ind[:, 1]], ind[:, 1]
print 'valid:'
print dist(p1[:, ind[:, 0]], p2[:, ind[:, 1]])< trh
print 'with ambiguous situation(s):'
if ambi:
print ambi
else:
print 'None'
else:
print 'None'
import timeit
n= 1e2
trh= 2e-1
rep= 5
p1= np.random.rand(2, 1e3* n)
p2= np.random.randn(2, n)
def perf():
ind, ambi= cdist(p1, p2, trh)
print 'performance:'
t= np.array(timeit.repeat(perf, repeat= rep, number= 1))/ rep
print 'min: ', t.min(), 'mean: ', t.mean(), 'max: ', t.max()
<code>
More information about the NumPy-Discussion
mailing list