[SciPy-dev] Inclusion of cython code in scipy
Stéfan van der Walt
stefan at sun.ac.za
Thu Apr 24 06:39:17 EDT 2008
2008/4/24 Prabhu Ramachandran <prabhu at aero.iitb.ac.in>:
> Lets take a simple case of someone wanting to handle a growing
> collection of say a million particles and do something to them. How do
> you do that in cython/pyrex and get the performance of C and interface
> to numpy? Worse, even if it were possible, you'll still need to know
> something about allocating memory in C and manipulating pointers. I can
> do that with C++ and SWIG today.
That's the point: you, being a well-established programmer, can do it
easily, but most Python programmers would struggle to do that through
some C or C++ API. I think this would be pretty easy to do in Cython:
1. Write a function, say create_workspace(nr_elements), that creates a
new ndarray and returns it:
cdef ndarray results_arr = np.empty((nr_elements,), dtype=np.double)
2. Grab a pointer to the memory (this should become a lot easier after
GSOC 2008):
cdef double* results = <double*>results_arr.data
3. Run your loop in which you produce data points. The moment you
have more results than the output array can hold, call
create_workspace(2 * current_size), and use normal numpy indexing to
copy the old results to the new location:
new_results_arr[:current_size] = old_results_arr
4. Rinse and repeat
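Steps 1 to 4 can be sketched in plain Python with NumPy, leaving out the
Cython-specific cdef declarations and the raw double* pointer. This is only
an illustration of the grow-and-copy idea, not a tuned implementation:
collect(), initial_size and the growth factor of two are assumptions made
for the sketch, and only create_workspace() is named in the steps above.

```python
import numpy as np


def create_workspace(nr_elements):
    """Step 1: allocate an uninitialised 1-D array of doubles."""
    return np.empty((nr_elements,), dtype=np.double)


def collect(points, initial_size=4):
    """Store values from an iterable in a workspace that grows on demand.

    Hypothetical helper: initial_size must be at least 1, and the
    doubling growth factor is an assumption of this sketch.
    """
    results_arr = create_workspace(initial_size)
    count = 0  # number of slots actually filled so far
    for value in points:
        if count == results_arr.shape[0]:
            # Step 3: workspace is full, so allocate a larger one and
            # copy the old results across with ordinary numpy indexing.
            new_results_arr = create_workspace(2 * results_arr.shape[0])
            new_results_arr[:count] = results_arr
            results_arr = new_results_arr
        results_arr[count] = value
        count += 1
    # Return only the filled part; the rest is uninitialised memory.
    return results_arr[:count]


particles = collect(float(i) for i in range(10))
```

In a Cython version the inner loop would write through the double* pointer
instead of results_arr[count], and the pointer would have to be fetched
again after every reallocation.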
The beauty of the Cython approach is that you
a) Never have to worry about INCREF and DECREF
b) Can use Python calls within C functions. You don't want to do that
in your fast inner loop, but take the example above: we only copy
arrays infrequently, and then we'd like to have the full power of
numpy indexing. Suddenly, sorting, averaging, and summing each become a
one-liner, just like in Python, at the expense of one Python call (and
this won't affect execution time in the above example).
c) Debug in a much cleaner way than C++ or C code: fewer memory leaks,
introspection of source etc.
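As a small illustration of point b), here are the one-liners in plain
NumPy, applied to the filled part of a workspace. The sample values and the
names count, filled, ordered, total and average are made up for this
sketch.

```python
import numpy as np

# A workspace with room for 8 doubles, of which 5 have been filled.
results_arr = np.empty((8,), dtype=np.double)
count = 5
results_arr[:count] = [3.0, 1.0, 4.0, 1.0, 5.0]

filled = results_arr[:count]  # a view of the filled part, not a copy

ordered = np.sort(filled)     # sorted copy; the workspace is unchanged
total = filled.sum()          # sum of the stored results
average = filled.mean()       # arithmetic mean of the stored results
```

Each of these is a single Python-level call, so the cost is negligible as
long as it stays outside the fast inner loop.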
Cheers
Stéfan