[Python-ideas] Type Hinting - Performance booster ?
Sturla Molden
sturla.molden at gmail.com
Mon Dec 22 13:05:48 CET 2014
On 22/12/14 00:20, Chris Angelico wrote:
> There may be something to that. Most of the people I've heard moaning
> that "Python is slow" aren't backing that up with any actual facts.
One fact is that the fastest Random Forests classifier known to man is
written in Python (with the addition of some Cython in critical places). For
those who don't know, Random Forests is one of the strongest algorithms
(if not the strongest) for estimating or predicting a probability p as a
non-linear function of N variables. Here are some interesting slides:
http://www.slideshare.net/glouppe/accelerating-random-forests-in-scikitlearn
Just look at what they could make Python and Cython do, compared to e.g.
the pure C++ solution in OpenCV.
Even more interesting, the most commonly used implementation of Random
Forests is the version for R, which is written in Fortran. There is
actually a pure Python version which is faster...
So who says Python is slow? I think today, mostly people who don't know
what they are talking about.
We see Python running on the biggest HPC systems today. We see Python
algorithms beating anything we can throw at them with C++ or Fortran. It
is time to realize that the flexibility of Python is not just a
slowdown, it is also a speed boost because it makes it easier to write
flexible and complex algorithms. But yes, Python needs help from Numba
or Cython, and sometimes even Fortran (f2py), to achieve its full
potential.
The main take-home lessons from those slides, by the way, are the need to
(1) profile Python code to identify bottlenecks -- humans are very bad
at this kind of guesswork -- and (2) use code annotation in
Cython (compile with -a) to limit the Python overhead.
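As a minimal sketch of point (1), the standard-library cProfile and pstats
modules can show where the time actually goes. The function names below
(slow_sum_of_squares, cheap_setup, main) are made up for illustration and
are not taken from the slides or from scikit-learn:

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Hypothetical hotspot: a deliberately naive pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup(n):
    # Hypothetical cheap step, included only for contrast in the report.
    return list(range(n))


def main():
    cheap_setup(1000)
    return slow_sum_of_squares(200_000)


def profile_main():
    # Collect timings only while main() runs.
    profiler = cProfile.Profile()
    profiler.enable()
    result = main()
    profiler.disable()

    # Write the five most expensive entries (by cumulative time) to a string.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buf.getvalue()


if __name__ == "__main__":
    result, report = profile_main()
    print(report)
```

The report should list slow_sum_of_squares near the top, which is the kind
of function one would then consider moving to Cython. For point (2),
running "cython -a" on a .pyx file writes an HTML report alongside the
generated C file, in which lines that involve more interaction with the
Python runtime are highlighted more strongly.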
Sturla