Future of Pypy?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Feb 22 20:18:43 EST 2015


Paul Rubin wrote:

> Laura Creighton <lac at openend.se> writes:
>> Because one thing we do know is that people who are completely and
>> utterly ignorant about whether having multiple cores will improve
>> their code still want to use a language that lets them use the
>> multiple processors.  If the TM dream of having that just happen,
>> seamlessly (again, no promises) is proven to be true, well ....  we
>> think that the hordes will suddenly be interested in PyPy.
> 
> TM is a useful feature but it's unlikely to be the thing that attracts
> "the hordes".  More important is to eliminate the GIL 

*rolls eyes*

I'm sorry, but the instant somebody says "eliminate the GIL", they lose
credibility with me. Yes yes, I know that in *your* specific case you've
done your research and (1) multi-threaded code is the best solution for
your application and (2) alternatives aren't suitable.

Writing multithreaded code is *hard*. It is not a programming model which
comes naturally to most human beings. Very few programs are inherently
parallelizable, although many programs have *parts* which can be
successfully parallelized. 

I think that for many people, "the GIL" is just a bogeyman, or is being
blamed for their own shortcomings. To take an extreme case, if you're
running single-threaded code on a single-core machine and still complaining
about the GIL, you have no clue.

(That's not *you personally* Paul, it's a generic "you".)

There are numerous alternatives for those who are genuinely running into
GIL-related issues. Jeff Knupp has a good summary:

http://www.jeffknupp.com/blog/2013/06/30/pythons-hardest-problem-revisited/

One alternative that he misses is that for some programs, the simplest way
to speed them up is to vectorize the core parts of the code using numpy.
No threads needed.
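
As a rough sketch of what vectorizing means here (the function names and the
sample data are made up for illustration, and numpy is assumed to be
installed), compare a pure-Python loop with the same calculation written as a
single array expression:

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python version: the interpreter executes one iteration per element."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(values):
    """Same calculation as one array operation; the loop runs inside numpy."""
    arr = np.asarray(values, dtype=float)
    return float(np.dot(arr, arr))


# Hypothetical sample data, purely for illustration.
data = [0.5 * i for i in range(1000)]
loop_result = sum_of_squares_loop(data)
vec_result = sum_of_squares_vectorized(data)
```

Both functions compute the same value, up to floating-point rounding. How
much faster the vectorized form runs depends on the data size and the
operation, so measure before relying on it.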

For those who think that the GIL and the GIL alone is the problem, consider
that Jython is nearly as old as CPython; it goes back at least 15 years.
IronPython has been around for a long time too, and is possibly faster than
CPython even in single-threaded code. Neither has a GIL. Both are mature
implementations, built on well-known, powerful platforms with oodles of
business credibility (the JVM and .Net). IronPython even has the backing of
Microsoft; it is one of the few non-Microsoft languages with a privileged
position in the .Net ecosystem.

Where are the people flocking to use Jython and IronPython?

In fairness, there are good reasons why some people cannot use Jython or
IronPython, or one of the other alternatives. But that demonstrates that
the problem is more complex than just "the GIL".

For removal of the GIL to really make a difference:

- you must have at least two cores (that, at least, applies to most people
these days);

- you must be performing a task which is parallelizable and not inherently
sequential (no point using multiple threads if each thread spends all its
time waiting for the previous thread);

- the task must be one for which moving to some other multi-processing model
(such as greenlets, multiprocessing, etc.) is infeasible;

- you must actually use multiple threads, and use them properly (no busy
wait loops);

- your threading bottleneck must be primarily CPU-bound, not I/O bound
(CPython's threads are already very effective at parallelising I/O tasks);

- and you must be using libraries and tools which prevent you moving to
Jython or IronPython or some other alternative.
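
To illustrate the point about I/O-bound work, here is a minimal sketch using
only the standard library. The task is a stand-in: time.sleep plays the part
of a blocking network or disk call, and the function names and delays are
invented for the example. Blocking calls like this release the GIL while they
wait, so the threads overlap even on CPython:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(delay):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return delay


def run_sequential(delays):
    """Run the tasks one after another and report the elapsed wall-clock time."""
    start = time.perf_counter()
    results = [fake_io_task(d) for d in delays]
    return results, time.perf_counter() - start


def run_threaded(delays):
    """Run the tasks in a thread pool, one worker per task, and time it."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        results = list(pool.map(fake_io_task, delays))
    return results, time.perf_counter() - start


# Four hypothetical "requests", each blocking for 0.2 seconds.
delays = [0.2] * 4
seq_results, seq_elapsed = run_sequential(delays)
thr_results, thr_elapsed = run_threaded(delays)
```

The sequential run has to wait for each task in turn, while the threaded run
waits for all of them at once, so it should finish in roughly the time of the
slowest single task. Replace the sleep with a pure-Python number-crunching
loop and that advantage largely disappears on CPython, which is the CPU-bound
case the list above is talking about.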

I can't help but feel that the set of people for whom removal of the GIL
would actually help is much smaller than, and different to, the set of
people who complain about the GIL.



-- 
Steven



