[SciPy-User] help speeding up a Runge-Kuta algorithm (cython, f2py, ...)
Sturla Molden
sturla at molden.no
Tue Aug 7 14:10:55 EDT 2012
On 07.08.2012 18:37, Ryan Krauss wrote:
> For many Runge-Kutta steps, your Cython code is 200 times faster than
> my pure Python version. Fortran is still 1.6 times faster than the
> Cython version, but the Fortran version is much more work to code up.
Don't expect anything to be "faster than Fortran" for certain kinds of
numerical work. Cython has a certain overhead (larger than C and
Fortran), and since it compiles to ANSI C (C89, not C99) we cannot
declare pointers restrict. But still, roughly 60% of Fortran performance
(1/1.6) is often acceptable!
Another thing you need to look at is "scalability". How much of that
extra runtime is constant due to differences between Cython and f2py?
How much is variable due to the numerical kernel being faster in
Fortran? Will differently sized problems give you the same overhead from
using Cython? It often helps to plot a graph of the performance (mean
and error bars) for various problem sizes, rather than benchmarking at
one single point.
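A minimal sketch of that kind of benchmark, in plain Python so it runs without a compiler. The names rk4_steps, benchmark and fit_line are hypothetical, and the kernel (dy/dt = -y with classical RK4) is only a stand-in for the real integrator. The straight-line fit is one rough way to separate the constant part of the runtime from the part that grows with problem size:

```python
import statistics
import timeit


def rk4_steps(n_steps, dt=1e-3):
    """Hypothetical kernel: integrate dy/dt = -y with classical RK4."""
    f = lambda t, y: -y
    t, y = 0.0, 1.0
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
        k3 = f(t + 0.5 * dt, y + 0.5 * dt * k2)
        k4 = f(t + dt, y + dt * k3)
        y += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return y


def benchmark(func, sizes, repeats=5):
    """Return {size: (mean_seconds, stdev_seconds)} for func(size)."""
    results = {}
    for n in sizes:
        # One call per timing, repeated so we get a spread, not a single point.
        times = timeit.repeat(lambda: func(n), number=1, repeat=repeats)
        results[n] = (statistics.mean(times), statistics.stdev(times))
    return results


def fit_line(xs, ys):
    """Least-squares fit ys ~ a + b*xs.

    a estimates the fixed overhead, b the cost per unit of problem size.
    """
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


if __name__ == "__main__":
    sizes = [100, 1000, 10000]
    results = benchmark(rk4_steps, sizes)
    for n, (mean, err) in results.items():
        print(f"n={n:>6}: {mean:.6f} s +/- {err:.6f} s")
    a, b = fit_line(sizes, [results[n][0] for n in sizes])
    print(f"fixed overhead ~ {a:.2e} s, per-step cost ~ {b:.2e} s")
```

Running the same benchmark() over the Cython and f2py entry points and comparing the two fitted pairs (a, b) would show whether the gap is mostly call overhead or mostly the kernel. The means and standard deviations can go straight into an error-bar plot, e.g. matplotlib's errorbar.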
Correctness is always more important than speed. That is one thing to
consider too. With Cython we can begin with a tested Python prototype
and optimize along the way, using the Python profiler to pinpoint where
it matters the most. Python, NumPy and Cython will not win the world
championship of being "fastest on the CPU" for simple numerical kernels,
but that is not the idea either. Implementing complex algorithms in
Fortran can be a PITA compared to Python. But Cython gives us a
straightforward way to speed up Python code and/or interface with C or
C++. Fortran is mainly nice for helping us scientists avoid the pointer
arithmetic of C, but Cython's memoryviews do that too.
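As a rough illustration of the "profile the prototype first" step, here is a sketch using the standard library's cProfile and pstats. The functions rhs, rk4_step, integrate and profile_report are hypothetical stand-ins for a tested pure-Python prototype, not code from this thread:

```python
import cProfile
import io
import pstats


def rhs(t, y):
    """Hypothetical right-hand side: dy/dt = -y."""
    return -y


def rk4_step(f, t, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, y + 0.5 * dt * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(f, y0, dt, n_steps):
    """Advance y0 by n_steps RK4 steps of size dt."""
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = rk4_step(f, t, y, dt)
        t += dt
    return y


def profile_report(n_steps=20000, top=5):
    """Profile integrate() and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    integrate(rhs, 1.0, 1e-3, n_steps)
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

Whichever functions dominate the report (here most likely rk4_step and rhs) are the candidates to move into Cython first, and the unchanged Python version stays around as a reference to test the optimized one against.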
Sturla