[Numpy-discussion] Fastest distance matrix calc

Tue Apr 17 16:24:37 EDT 2007

Oops.  Looks like I forgot to attach the test program that generated
that output so you can tell what dist2g actually does.
Funny thing is -- despite being written in C, hypot doesn't actually
win any of the test cases for which it's applicable.

--bb

On 4/17/07, Bill Baxter <wbaxter at gmail.com> wrote:
> Here's a bunch of dist matrix implementations and their timings.
> The upshot is that for most purposes this seems to be the best or at
> least not too far off (basically the cookbook solution Kier posted)
>
> def dist2hd(x,y):
>     """Generate a 'coordinate' of the solution at a time"""
>     d = npy.zeros((x.shape[0],y.shape[0]),dtype=x.dtype)
>     for i in xrange(x.shape[1]):
>         diff2 = x[:,i,None] - y[:,i]
>         diff2 **= 2
>         d += diff2
>     npy.sqrt(d,d)
>     return d
>
> The only place where it's far from the best is for a small number of
> points (~10) with high dimensionality (~100), which does come up in
> machine learning contexts.  For those cases this does much better
> (factor of :
>
> def dist2b3(x,y):
>     d = npy.dot(x,y.T)
>     d *= -2.0
>     d += (x*x).sum(1)[:,None]
>     d += (y*y).sum(1)
>     # Rounding errors occasionally cause negative entries in d
>     d[d<0] = 0
>     # in place sqrt
>     npy.sqrt(d,d)
>     return d
>
> So given that, the obvious solution (if you don't want to delve into
> non-numpy solutions) is  to use a hybrid that just switches between
> the two.  Not sure what the proper switch is since it seems kind of
> complicated, and probably depends some on cache specifics.  But just
> switching based on the dimension of the points seems to be pretty
> effective:
>
> def dist2hy(x,y):
>     if x.shape[1]<5:
>         d = npy.zeros((x.shape[0],y.shape[0]),dtype=x.dtype)
>         for i in xrange(x.shape[1]):
>             diff2 = x[:,i,None] - y[:,i]
>             diff2 **= 2
>             d += diff2
>         npy.sqrt(d,d)
>         return d
>
>     else:
>         d = npy.dot(x,y.T)
>         d *= -2.0
>         d += (x*x).sum(1)[:,None]
>         d += (y*y).sum(1)
>         # Rounding errors occasionally cause negative entries in d
>         d[d<0] = 0
>         # in place sqrt
>         npy.sqrt(d,d)
>         return d
>
> All of this assumes 'C' contiguous data.  All bets are off if you have
> non-contiguous or 'F' ordered data.  And maybe if x and y have very
> different numbers of points.
>
>
> --bb
>
>
>
> On 4/17/07, Keir Mierle <mierle at gmail.com> wrote:
> > On 4/13/07, Timothy Hochberg <tim.hochberg at ieee.org> wrote:
> > > On 4/13/07, Bill Baxter <wbaxter at gmail.com> wrote:
> > > > I think someone posted some timings about this before but I don't recall.
> >
> > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498246
> >
> > [snip]
> > > I'm going to go out on a limb and contend, without running any timings, that
> > > for large M and N, a solution using a for loop will beat either of those.
> > > For example (untested):
> > >
> > >  results = empty([M, N], float)
> > > # You could be fancy and swap axes depending on which array is larger, but
> > > # I'll leave that for someone else
> > > for i, v in enumerate(x):
> > >     results[i] = sqrt(sum((v-y)**2, axis=-1))
> > >  Or something like that. The reason that I suspect this will be faster is
> > > that it has better locality, completely finishing a computation on a
> > > relatively small working set before moving onto the next one. The one liners
> > > have to pull the potentially large MxN array into the processor repeatedly.
> >
> > In my experience, it is indeed the case that the for loop version is
> > faster. The fastest of the three versions offered in the above url is
> > the last:
> >
> > from numpy import mat, zeros, newaxis
> > def calcDistanceMatrixFastEuclidean2(nDimPoints):
> >     nDimPoints = array(nDimPoints)
> >     n,m = nDimPoints.shape
> >     delta = zeros((n,n),'d')
> >     for d in xrange(m):
> >         data = nDimPoints[:,d]
> >         delta += (data - data[:,newaxis])**2
> >     return sqrt(delta)
> >
> > This is easily extended to two different nDimPoints matricies.
> >
> > Cheers,
> > Keir
> > _______________________________________________
> > Numpy-discussion mailing list
> > Numpy-discussion at scipy.org
> > http://projects.scipy.org/mailman/listinfo/numpy-discussion
> >
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dist2perf.py
Type: text/x-python
Size: 5359 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20070418/f0a7b462/attachment.py>