[SciPy-dev] [Numpy-discussion] Proposal: scipy.spatial

Anne Archibald peridot.faceted at gmail.com
Sun Oct 5 04:56:37 EDT 2008


2008/10/4 Prabhu Ramachandran <prabhu at aero.iitb.ac.in>:
> Anne Archibald wrote:
>> 2008/10/4 Prabhu Ramachandran <prabhu at aero.iitb.ac.in>:
>>> In any case, I am not sure there is a general need for this kind of data
>>> structure.  I need it a whole lot.  If there is a general need I'll be
>>> glad to try and extract any of my code and contribute it to scipy.spatial.
>>
>> There does seem to be a general need for a two-tree query that
>> extracts (or just counts) all items of one tree within a ball centered
>> at each item of another tree. You're right, too, that "in a ball"
>> queries often don't naturally produce sorted results, or even known
>> distances. But I think this has to be a different query. In fact, even
>> for single-point queries it may make sense to have a separate query
>> function for in-a-ball queries.
>
> True and another reason for a different approach would be performance.
> According to Lee and Wong (1970, Acta Informatica) the worst case
> performance for partial region searches on k-d trees is O(k N^{1-1/k}).
>  So if the dimensionality is small and the region size is know we could
> do a lot better with other specialized algorithms for these cases.

I have added several in-a-ball queries to the kd-tree implementation
in the spatial branch. Please let me know what you think. The API is a
bit ad-hoc, since I can see a number of different ways one might want
to use the code. In particular, some query functions do not currently
return the actual distance, since one of the shortcuts is to include
whole branches of the tree if their bounding box lies inside the ball
of interest. I suspect that for many algorithms of interest it will be
necessary to write a custom tree-walking algorithm. I'm not sure
whether it makes sense to try to provide functions to support this.

Anne



More information about the SciPy-Dev mailing list