nearest neighbor in 2D

Ron Adam radam2 at tampabay.rr.com
Tue Nov 4 15:20:12 EST 2003


On Tue, 04 Nov 2003 08:25:36 -0800, David Eppstein
<eppstein at ics.uci.edu> wrote:

>In article <bo8j4v$tqt$03$1 at news.t-online.com>,
> Peter Otten <__peter__ at web.de> wrote:
>
>> This is of course all premature optimization as the most promising approach
>> is to try hard to reduce the number of candidate points, as David Eppstein
>> seems to have done. But then, he could use complex numbers, too.
>
>Well, yes, but then my code wouldn't work very well in dimensions higher 
>than two...


I rearranged my first example to match the output of yours and used a
random number seed to get identical results.

Moving the square root to the return line of the find shortest
distance function increased the speed of my routine about 20%.  Then
using the p*p form instead of p**2 added anouther 4%.

With printing turned there is only a very small difference.  Of course
printing is the bottle neck.  Turning off printing resulted in the
following.  All are best of 3 runs.

1000 points:
	Standard loop:		0:00:00.958192
	Kdtree:			0:00:00.248096

Quite a difference.   I'm not quite sure how kdtree's work. (yet)  But
they can be worth while when working with large data sets.

The standard loop seems to be better for small lists of  < 100 points.

100 points:
	Standard loop:		0:00:00.009966
	kdtree			0:00:00.015247

But for really large lists.

10000 points:
	Standard loop:		0:01:39.246454
	kdtree			0:00:03.424873

Hehe.... no comparison.

The number of calculations the standard loop does:

100 new points, 4950 distance calculations.
1000 new points, 499500 distance calculations.
10000 new points, 49995000 distance calculations.

And I don't know how to figure it for kdtree.  But we can estimate it
by using the ratio of the speeds. 

1000 points:
kdtree	 (3.42/99.25)*49995000 = 1722749.62 est. dist. calculations. 

There's probably a better way to do that.   Python is fun to do this
stuff with.  Playing around like this with other languages is just too
much trouble.

_Ron

















More information about the Python-list mailing list