[SciPy-Dev] Enhancements to scipy.spatial.cKDTree
Patrick Varilly
patvarilly at gmail.com
Mon Jul 16 14:50:39 EDT 2012
> Also observe that Cython has closures now, and you can nest Python
> functions in Cython (and declaring them cpdef makes the call almost as
> efficient as a cdef).
>
By this, do you mean putting back the workhorse "traverse" methods inside
their respective function (e.g., query_ball_tree), like KDTree does? I
would certainly like to do that (it would save passing in tons of arguments
in the recursive calls), but I tried the following test and got the Cython
error "C function definition not allowed here" (here being "print_i()"):
cdef class Test(object):
cpdef test():
cdef int i = 2
cpdef print_i():
print i
print_i()
* There seems to be something funny about raw_mins and raw_maxes: they are
member variables of cKDTree, but the data they point to seems to me like it
should go out of scope at the end of cKDTree.__init__. Am I
misunderstanding something here?
> Yes you do :) See below:
>
>
> cdef class cKDTree:
>
> cdef readonly object maxes
>
> def __init__(cKDTree self):
>
> cdef np.ndarray[np.float64_t, ndim=1] inner_maxes
>
> self.maxes = np.ascontiguousarray(np.amax(self.data,axis=0),
> dtype=np.float64)
>
> inner_maxes = self.maxes
>
> self.raw_maxes = <np.float64_t*>np.PyArray_DATA(inner_maxes)
>
>
> "maxes" stores the contiguous array as an attribute of the cKDTree
> extension object. It will be alive until the object of type cKDTree is
> reclaimed.
>
> "inner_maxes" unboxes the array struct to a collection of local variables
> for fast indexing access inside cKDTree. It also tells Cython that the
> type is np.ndarray (which extends NumPy's PyArrayObject).
>
> Then finally "raw_maxes" gets the C pointer to the contiguous buffer
> stored in "maxes", by using the unboxed "inner_maxes".
>
>
> We could also have done this (at least for Python 2.x):
>
> self.raw_maxes = <np.float64_t*> cpython.PyLong_AsVoidPtr(
> self.maxes.__array_interface__['data'][0])
>
> or this
>
> self.raw_maxes = &inner_maxes[0]
>
> But the current "inner_maxes.data" should not be used.
>
Thanks, this makes more sense. I thought there might be a circumstance
when "inner_maxes = maxes" might create a copy of maxes, but I now see that
it will only check that ndim is 1 and dtype is np.float64 before making the
data available.
>
>
> Another issue is this:
>
> Where ever you use list.append and set.add to save a tuple of indexes
> (i,j), do this:
>
>
> if sizeof(long) < sizeof(np.npy_intp):
> # Win 64
> if self.raw_indices[i] < self.raw_indices[j]:
> results.add((int(self.raw_indices[i]), int(self.raw_indices[j])))
> else:
> results.add((int(self.raw_indices[j]), int(self.raw_indices[i])))
>
> else:
> # Other platforms
> if self.raw_indices[i] < self.raw_indices[j]:
> results.add((self.raw_indices[i], self.raw_indices[j]))
> else:
> results.add((self.raw_indices[j], self.raw_indices[i]))
>
>
> sizeof(long) < sizeof(np.npy_intp) is a compile time constant so it will
> be optimized by the compiler. Then you can take out the stuff I added to
> fix the set for Win64 before the return of query_pairs.
>
> You might want to put this into two inline functions (for list.append and
> set.add) to avoid repeating yourself.
>
>
> cdef inline int set_add_pair(set results, np.npy_intp i, np.npy_intp j)
> except -1:
> if sizeof(long) < sizeof(np.npy_intp):
> # Win 64
> if i < j:
> results.add((int(j), int(j)))
> else:
> results.add((int(j), int(i)))
> else:
> # Other platforms
> if i < j:
> results.add((i, j))
> else:
> results.add((j, i))
> return 0
>
>
> The difference is that on Win 64 we should use Python long istead of
> Python int if a C long (i.e. Pyhton int) overflows, which the function
> int() will ensure. Cython automatically converts np.npy_intp to Python
> long on Win 64, which we want to convert to a Python int if it is possible.On other platforms we don't want this extra overhead.
>
Okay, sounds good, I'll do this.
Best,
Patrick
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20120716/b0c35828/attachment.html>
More information about the SciPy-Dev
mailing list