[SciPy-Dev] Enhancements to scipy.spatial.cKDTree
Sturla Molden
sturla at molden.no
Fri Jul 13 11:08:46 EDT 2012
I have changed the code to use PyArray_DATA and squashed countless
32-bit vs. 64-bit integer bugs. I think it should work correctly on
64-bit now. All C integers are changed to npy_intp, and the corresponding
dtype (np.int32 or np.int64) is inferred on module import.
Patrick: When you write Cython, note that a Python int and np.int are a
C long on Python 2.x, which is 32-bit on 64-bit Windows. It is very
important to make sure all integers have the correct size. Run-time
inference is actually required. This is because in Cython we are often
using NumPy through the Python interface rather than the NumPy C API
directly. dtype=int and dtype=np.int are almost never correct in Cython.
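A minimal sketch of the run-time inference described above, written in
plain Python rather than Cython and not taken from the attached file.
The names INTP_BYTES, index_dtype and indices are made up for
illustration; np.intp is the NumPy type with the same width as npy_intp.

```python
import ctypes

import numpy as np

# np.intp has the same width as the C type npy_intp on this platform.
INTP_BYTES = np.dtype(np.intp).itemsize

# Choose the fixed-width dtype once, at import time (hypothetical name).
if INTP_BYTES == 8:
    index_dtype = np.int64
elif INTP_BYTES == 4:
    index_dtype = np.int32
else:
    raise ImportError("unsupported npy_intp size: %d bytes" % INTP_BYTES)

# For comparison: a C long is not guaranteed to be pointer-sized.
# On 64-bit Windows it is 4 bytes while a pointer is 8 bytes.
LONG_BYTES = ctypes.sizeof(ctypes.c_long)
POINTER_BYTES = ctypes.sizeof(ctypes.c_void_p)

# Index arrays handed to code that expects npy_intp use index_dtype,
# not dtype=int.
indices = np.arange(10, dtype=index_dtype)
```

The point of doing this at import is that the choice follows the
platform the module actually runs on, instead of whatever a C long
happens to be.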
All the memory leaks caused by exception handling are still there;
getting an overview of them is terrible...
(See Attachment, I don't know how to use git. Perhaps Patrick could put
this on Github?)
Sturla
On 13.07.2012 01:34, Patrick Varilly wrote:
> Alright, I've uploaded the last bit of cKDTree that was missing for it
> to be functionally equivalent to KDTree. As it stands, I think it's a
> useful addition in its own right, so it would be nice if someone else
> could look the code over and see if it can be merged in.
>
> Over the coming weeks, I will look into the issues that Sturla has
> brought up and see if I can make some progress on these.
>
> All the best,
>
> Patrick
>
> On Thu, Jul 12, 2012 at 5:42 PM, Sturla Molden <sturla at molden.no> wrote:
>
> On 12.07.2012 00:26, Patrick Varilly wrote:
> > On Tue, Jul 10, 2012 at 12:01 PM, Sturla Molden <sturla at molden.no> wrote:
> >
> > > At least cKDTree has to be fixed; it will break as soon as the
> > > move to PyArray_DATA is mandatory.
> > >
> > > Preferably we should use Cython memoryviews and multidimensional
> > > arrays in the code, instead of just C pointer arithmetic (which
> > > is harder to understand). That will make the Cython code more
> > > readable to NumPy users.
> > >
> > > The GIL issue should also be fixed, as searching might take a
> > > while.
> >
> > I'm relatively new to Cython. Could you tell me where I could read
> > up on these issues?
>
> The main issue is the use of the .data attribute. See here:
>
> http://wiki.cython.org/tutorials/NumpyPointerToC
>
> Another is that Cython's ndarray interface is (more or less) deprecated
> in favour of typed memoryviews:
>
> http://docs.cython.org/src/userguide/memoryviews.html
>
> So preferably the cKDTree code should use these, but in my experience
> they can generate compile-time warnings.
>
> There is also a 64-bit issue with cKDTree if I remember correctly. And
> the only dtype it supports is float64. We should replace the current
> pointer arithmetic with multidimensional arrays. It had (or still has)
> non-portable code, such as a dependency on unions and binary layout
> (tree and heap nodes). And there is the issue of making it release the
> GIL whenever it should. So several things need to be fixed.
>
> Sturla
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cKDTree.pyx
Type: /
Size: 95945 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20120713/c890a448/attachment.bin>