[Numpy-discussion] finding close together points.

Wed Nov 11 01:00:13 EST 2009

On Tue, Nov 10, 2009 at 10:51 PM, Christopher Barker
<Chris.Barker at noaa.gov>wrote:

> Anne Archibald wrote:
> > 2009/11/10 Christopher Barker <Chris.Barker at noaa.gov>:
>
> >> I have a bunch of points in 2-d space, and I need to find out which
> >> pairs of points are within a certain distance of one-another (regular
> >> old Euclidean norm).
> >
> > This is an eminently reasonable thing to want, and KDTree should
> > support it. Unfortunately it doesn't.
>
> darn.
>
> > It's slow both because you're using the python implementation rather
> > than the C,
>
> I noticed that.
>
> > The one good thing about using the python implementation rather than
> > the Cython one is that you can subclass it, providing a new method.
> > There's still a certain amount of boilerplate code to write, but it
> > shouldn't be too bad.
>
> Can you give me a hint as to where to start?  I have no idea how kd
> trees work.
>
> > If this is still too slow, I have no problem incorporating additional
> > code into cKDTree; the only reason none of the ball queries are in
> > there is because ball queries must return variable-size results, so
> > you lose a certain amount of speed because you're forced to manipulate
> > python objects. But if there are relatively few result points, this
> > need not be much of a slowdown.
>
> And certainly in this case, there would not be very many results, and
> lists are pretty fast.
>
> However, this does bring up the need for a growable numpy array. I've
> been working on one in Python, but it would be really nice to have one
> that could be accessed by C (or cython) code.
>
> It wouldn't be that hard to do (for someone familiar with the numpy
> code, anyway), I don't think. It is mostly boiler plate in Python. I'm
> imagining it would only support contiguous arrays, and could only grow
> is the first dimension.
>
>
>
I think Python lists are basically just expanding arrays and pointers are
cheap. Where you might lose is in creating python objects to put in the list
and not having ufuncs and the rest of the numpy machinery. If you don't need
the machinery, lists are probably not a bad way to go.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20091110/e29dcb75/attachment.html>