[SciPy-Dev] Enhancements to scipy.spatial.cKDTree

Sturla Molden sturla at molden.no
Fri Jul 13 11:08:46 EDT 2012


I have changed the code to use PyArray_DATA and squashed countless 
32-bit vs. 64-bit integer bugs. I think it should work correctly on 
64-bit now. All C integers are changed to npy_intp, and the corresponding 
dtype (np.int32 or np.int64) is inferred on module import.

Patrick: When you write Cython, note that a Python int and np.int are 
both a C long on Python 2.x, which is 32-bit on 64-bit Windows. It is 
very important to make sure all integers have the correct size. Run-time 
inference is actually required, because in Cython we are often using 
NumPy through the Python interface rather than the NumPy C API directly. 
dtype=int and dtype=np.int are almost never correct in Cython.
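
A minimal sketch of that run-time inference, in plain Python rather than 
Cython. The name index_dtype is only illustrative and is not taken from 
the attached cKDTree.pyx; np.intp is NumPy's Python-level counterpart of 
npy_intp, and ctypes is used here only to show the size of a C long on 
the running platform.

```python
import ctypes
import numpy as np

# Size in bytes of the integer type NumPy uses for indexing (npy_intp in C).
intp_size = np.dtype(np.intp).itemsize

# Pick the fixed-width dtype of the same size, once, at import time.
# index_dtype is a hypothetical name chosen for this sketch.
if intp_size == 4:
    index_dtype = np.int32
elif intp_size == 8:
    index_dtype = np.int64
else:
    raise ImportError("unsupported npy_intp size: %d bytes" % intp_size)

# Size of a C long on this platform. It can be narrower than npy_intp,
# which is why dtype=int was unsafe on Python 2.x under 64-bit Windows.
long_size = ctypes.sizeof(ctypes.c_long)

# Index arrays handed to Cython code are then created with index_dtype.
indices = np.arange(5, dtype=index_dtype)
print(intp_size, long_size, indices.dtype)
```

Arrays created this way have elements of the same width as npy_intp, so a 
Cython buffer or pointer declared with npy_intp reads them correctly on 
both 32-bit and 64-bit builds.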

All the potential memory leaks due to exception handling are still there; 
getting an overview of them is terrible...

(See attachment; I don't know how to use git. Perhaps Patrick could put 
this on GitHub?)

Sturla





On 13.07.2012 01:34, Patrick Varilly wrote:
> Alright, I've uploaded the last bit of cKDTree that was missing for it
> to be functionally equivalent to KDTree.  As it stands, I think it's a
> useful addition in its own right, so it would be nice if someone else
> could look the code over and see if it can be merged in.
>
> Over the coming weeks, I will look into the issues that Sturla has
> brought up and see if I can make some progress on these.
>
> All the best,
>
> Patrick
>
> On Thu, Jul 12, 2012 at 5:42 PM, Sturla Molden <sturla at molden.no> wrote:
>
>     On 12.07.2012 00:26, Patrick Varilly wrote:
>      > On Tue, Jul 10, 2012 at 12:01 PM, Sturla Molden <sturla at molden.no> wrote:
>      >
>      >     At least cKDTree has to be fixed; it will break as soon as the
>      >     move to PyArray_DATA is mandatory.
>      >
>      >     Preferably we should use Cython memoryviews and multidimensional
>      >     arrays in the code, instead of just C pointer arithmetic (which
>      >     is harder to understand). That will make the Cython code more
>      >     readable to NumPy users.
>      >
>      >     The GIL issue should also be fixed, as searching might take a
>      >     while.
>      >
>      > I'm relatively new to Cython.  Could you tell me where I could
>      > read up on these issues?
>
>     The main issue is the use of the .data attribute. See here:
>
>     http://wiki.cython.org/tutorials/NumpyPointerToC
>
>     Another is that Cython's ndarray interface is (more or less) deprecated
>     in favour of typed memoryviews:
>
>     http://docs.cython.org/src/userguide/memoryviews.html
>
>     So preferably the cKDTree code should use these, but in my experience
>     they can generate compile-time warnings.
>
>     There is also a 64-bit issue with cKDTree if I remember correctly. And
>     the only dtype it supports is float64. We should replace the current
>     pointer arithmetic with multidimensional arrays. It had (or still has)
>     non-portable code like dependency on unions and binary layout (tree and
>     heap nodes). And there is the issue of making it release the GIL whenever
>     it should. So several things need to be fixed.
>
>     Sturla
>     _______________________________________________
>     SciPy-Dev mailing list
>     SciPy-Dev at scipy.org <mailto:SciPy-Dev at scipy.org>
>     http://mail.scipy.org/mailman/listinfo/scipy-dev
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cKDTree.pyx
Type: /
Size: 95945 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20120713/c890a448/attachment.bin>

