[SciPy-Dev] Interest in improvements to the cKDTree codebase?

Jesse Livezey jesse.livezey at gmail.com
Tue May 22 13:56:01 EDT 2018


Hi everyone,

I'm using cKDTrees for a project and noticed two potential ways to improve
the code and have written one additional count method that might be useful
to others.

I have written code and some tests for all three items and can contribute
if there is interest.

1) Allowing an array of radii (r values) in cKDTree.query_ball_point(). I
started a PR here <https://github.com/scipy/scipy/pull/8818>. In principle,
this should speed up multiple queries with different radii, but see item 2.

2) I noticed that when cKDTree.query_ball_point() returns large numbers of
points (>100), it is faster to loop over the queries in Python than to make a
single multi-point call. This is true both with and without my PR. This is
largely because single-point queries do not sort the returned indices, but
multi-point queries DO sort them (see details here
<https://github.com/scipy/scipy/issues/8838>). Removing the sorted() call
leads to considerable speedups, but is not backwards compatible. However, the
sorting behavior is not mentioned in the method description and is not even
internally consistent, so maybe it could just be removed or made optional?

3) I have written a cKDTree.count_ball_point() method that behaves like
query_ball_point() but just returns the number of points rather than their
indices. This is much faster than calling len(query_ball_point()).
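
For reference, here is a minimal sketch of the workarounds the three items
above would replace. It uses only the existing cKDTree(data) constructor and
query_ball_point(x, r). The random data, the variable names, and the choice of
one radius per query point are my assumptions for illustration; this is not
the code from the PR.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.RandomState(0)
data = rng.uniform(size=(1000, 3))     # points stored in the tree
queries = rng.uniform(size=(5, 3))     # query centers
radii = np.linspace(0.1, 0.3, 5)       # assumed: one radius per query point

tree = cKDTree(data)

# Item 1 workaround: different radii currently need one call per query point.
per_query = [tree.query_ball_point(x, r) for x, r in zip(queries, radii)]

# Item 3 workaround: counting means building each index list, then
# throwing it away after taking its length.
counts = np.array([len(idx) for idx in per_query])

# Brute-force check of the counts using plain NumPy distances.
dists = np.linalg.norm(queries[:, None, :] - data[None, :, :], axis=-1)
brute_counts = (dists <= radii[:, None]).sum(axis=1)

# Item 2: a single multi-point call and a Python loop of single-point calls
# return the same points, but the order of the indices may differ, so
# compare them as sets.
r = 0.2
batched = tree.query_ball_point(queries, r)
looped = [tree.query_ball_point(x, r) for x in queries]
same_sets = all(set(b) == set(l) for b, l in zip(batched, looped))

print(counts, brute_counts, same_sets)
```

Any code that depends on the order of the returned indices would be affected
by the change discussed in item 2, which is why the sketch compares sets.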

Let me know if any of this is of interest.
Jesse

--
Jesse Livezey
he/him/his