[Python-ideas] Type Hinting - Performance booster ?

Sturla Molden sturla.molden at gmail.com
Mon Dec 22 13:05:48 CET 2014


On 22/12/14 00:20, Chris Angelico wrote:

> There may be something to that. Most of the people I've heard moaning
> that "Python is slow" aren't backing that up with any actual facts.

One fact is that the fastest Random Forests classifier known to man is 
written in Python (with the addition of some Cython in critical places). 
For those who don't know, Random Forests is one of the strongest 
algorithms (if not the strongest) for estimating or predicting a 
probability p as a non-linear function of N variables. Here are some 
interesting slides:

http://www.slideshare.net/glouppe/accelerating-random-forests-in-scikitlearn

Just look at what they could make Python and Cython do, compared to e.g. 
the pure C++ solution in OpenCV.

Even more interesting, the most commonly used implementation of Random 
Forests is the version for R, which is written in Fortran. There is 
actually a pure Python version which is faster...

So who says Python is slow? I think today, mostly people who don't know 
what they are talking about.

We see Python running on the biggest HPC systems today. We see Python 
algorithms beating anything we can throw at them with C++ or Fortran. It 
is time to realize that the flexibility of Python is not just a 
slowdown, it is also a speed boost because it makes it easier to write 
flexible and complex algorithms. But yes, Python needs help from Numba 
or Cython, and sometimes even Fortran (f2py), to achieve its full 
potential.

The main take-home lesson from those slides, by the way, is the need to 
(1) profile Python code to identify bottlenecks -- humans are very bad 
at this kind of guesswork -- and (2) use code annotation in Cython 
(compile with -a) to see where the Python overhead remains and limit it.
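
To illustrate point (1), here is a minimal sketch using the standard 
library's cProfile and pstats modules. The functions 
slow_sum_of_squares, cheap_setup, main and profile are made up for this 
example and are not taken from the slides or from scikit-learn; the idea 
is only to show how a profile points at the hot spot instead of leaving 
it to guesswork.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Hypothetical hot spot: a naive pure-Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_setup():
    # Hypothetical cheap step that should barely show up in the profile.
    return list(range(10))


def main():
    cheap_setup()
    return slow_sum_of_squares(200_000)


def profile(func):
    # Run func under cProfile and return (result, text report).
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile(main)
print(report)
```

In the printed report, slow_sum_of_squares should account for nearly all 
of the time, which marks it as the candidate for Cython or Numba. The 
same kind of report is available without touching the code by running 
"python -m cProfile -s cumulative script.py".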

Sturla


