[SciPy-Dev] Enhancements to scipy.spatial.cKDTree
Sturla Molden
sturla at molden.no
Fri Jul 13 07:02:09 EDT 2012
I've quickly looked at the code, most of it looks good.
Care if I help?
I'd like to focus on these changes first:
** Make sure the code behaves correctly on 64-bit. Currently there are
numerous possibilities for integer overflow, as C int is used
consistently instead of npy_intp or Py_ssize_t.
** Make sure malloc and free does not leak memory in case of a Python
exception. This means either using try/finally religiously or wrap it in
a Python object that use "RAII" with __cinit__, __init__ and
__dealloc__. Currently it is littered with possible memory leaks.
** Raise MemoryError on malloc failure.
** Use PyMem_Malloc/PyMem_Free instead of libc, so we are sure to use
the same heap as Python.
** Correct the "ndarray.data" issue, but I will refrain from using typed
memoryviews to maintain compatibility with Python 2.4. That will
probably mean to use the Python API directly (PyArrayObject and
PyArray_DATA) instead of Cython's numpy.pxd.
** Use Cython's fused types (a bloatware generator) instead of C double
to autogenerate cKDTrees for any NumPy dtype.
** Avoid using struct hacks to accomodate tree node structs of different
sizes (or else we are only compatible with C99).
** Start to release the GIL wherever possible, as searching a kd-tree is
expensive. (Not just for the sake of multi-threading, but also to avoid
locking up the interpreter forever.)
Sturla
On 13.07.2012 09:09, Ralf Gommers wrote:
>
>
> On Fri, Jul 13, 2012 at 1:34 AM, Patrick Varilly <patvarilly at gmail.com
> <mailto:patvarilly at gmail.com>> wrote:
>
> Alright, I've uploaded the last bit of cKDTree that was missing for
> it to be functionally equivalent to KDTree. As it stands, I think
> it's a useful addition in its own right, so it would be nice if
> someone else could look the code over and see if it can be merged in.
>
> Over the coming weeks, I will look into the issues that Sturla has
> brought up and see if I can make some progress on these.
>
>
> Please note that memoryviews can't be used yet in scipy due them not
> working with python 2.4 (see https://github.com/numpy/numpy/pull/307).
> All the other things Sturla mentioned do sound like improvements that
> can be made.
>
> Cheers,
> Ralf
>
>
>
> All the best,
>
> Patrick
>
>
> On Thu, Jul 12, 2012 at 5:42 PM, Sturla Molden <sturla at molden.no
> <mailto:sturla at molden.no>> wrote:
>
> On 12.07.2012 00:26, Patrick Varilly wrote:
> > On Tue, Jul 10, 2012 at 12:01 PM, Sturla Molden
> <sturla at molden.no <mailto:sturla at molden.no>
> > <mailto:sturla at molden.no <mailto:sturla at molden.no>>> wrote:
> >
> > At least cKDTree have to be fixed, it will break as soon
> as the move to
> > PyArray_DATA is mandatory.
> >
> > Preferably we should use Cython memoryviews and
> multidimensional arrays
> > in the code, instead of just C pointer artithmetics
> (which is harder to
> > understand). That will make the Cython code more readable
> to NumPy
> > users.
> >
> > The GIL issue should also be fixed, as searching might
> take a while.
> >
> > I'm relatively new to Cython. Could you tell me where I
> could read up
> > on these issues?
>
> The main issue is the use of the .data attribute. See here:
>
> http://wiki.cython.org/tutorials/NumpyPointerToC
>
> Another is that Cython's ndarray interface is (more or less)
> deprecated
> in favour of typed memoryviews:
>
> http://docs.cython.org/src/userguide/memoryviews.html
>
> So preferably the cKDTree code should use these, but I my experience
> they can generate compile-time warnings.
>
> There is also a 64-bit issue with cKDTree if I remember
> correctly. And
> the only dtype it supports is float64. We should replace the current
> pointer artimetics with multidimensional arrays. It had (or
> still has)
> non-portable code like dependency on unions and binary layout
> (tree and
> heap nodes). And there the issue of making it release the GIL
> whenever
> it should. So several things needs be fixed.
>
> Sturla
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org <mailto:SciPy-Dev at scipy.org>
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org <mailto:SciPy-Dev at scipy.org>
> http://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-dev
More information about the SciPy-Dev
mailing list