Will Python 3.0 remove the global interpreter lock (GIL)

Thu Sep 20 10:58:03 EDT 2007

On 20 Sep 2007 07:43:18 -0700, Paul Rubin
<"http://phr.cx"@nospam.invalid> wrote:
> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
> > That's why your "comparatively wimpy site" preferred to throw extra web
> > servers at the job of serving webpages rather than investing in smarter,
> > harder-working programmers to pull the last skerricks of performance out
> > of the hardware you already had.
>
> The compute intensive stuff (image rendering and crunching) has
> already had most of those skerricks pulled out.  It is written in C
> and assembler (not by us).  Only a small part of our stuff is written
> in Python: it just happens to be the part I'm involved with.
>

That means that the compute-intensive part is also unaffected by the GIL.

> > But Python speed ups don't come for free. For instance, I'd *really*
> > object if Python ran twice as fast for users with a quad-core CPU, but
> > twice as slow for users like me with only a dual-core CPU.
>
> Hmm.  Well if the tradeoff were selectable at python configuration
> time, then this option would certainly be worth doing.  You might not
> have a 4-core cpu today but you WILL have one soon.
>
> > What on earth makes you think that would be anything more than a
> > temporary, VERY temporary, shutdown? My prediction is that the last of
> > the machines wouldn't have even been unplugged
>
> Of course that example was a reductio ad absurdum.  In reality they'd
> use the speedup to compute 2x as much stuff, rather than ever powering
> any servers down.  Getting the extra computation is more valuable than
> saving the electricity.  It's just easier to put a dollar value on
> electricity than on computation in an example like this.  It's also
> the case for our specific site that our server cluster is in large
> part a disk farm and not just a compute farm, so even if we sped up
> the software infinitely we'd still need a lot of boxes to bolt the
> disks into and keep them spinning.
>

I think this is instructive, because it's pretty typical of GIL
complaints. Someone gives an example where the GIL is limiting, but
upon inspection it turns out that the actual bottleneck is elsewhere,
that the GIL is being sidestepped anyway, and that the supposed
benefits of removing the GIL wouldn't materialize because the problem
space isn't really as described.

> > Now there's a thought... given that Google:
> >
> > (1) has lots of money;
> > (2) uses Python a lot;
> > (3) already employs both Guido and (I think...) Alex Martelli and
> > possibly other Python gurus;
> > (4) is not shy in investing in Open Source projects;
> > (5) and most importantly uses technologies that need to be used across
> > multiple processors and multiple machines
> >
> > one wonders if Google's opinion of where core Python development needs to
> > go is the same as your opinion?
>
> I think Google's approach has been to do cpu-intensive tasks in other
> languages, primarily C++.  It would still be great if they put some
> funding into PyPy development, since I think I saw something about the
> EU funding being interrupted.

At the really high levels of scalability, such as across a server
farm, threading is useless. The entire point of threads, rather than
processes, is that you've got shared, mutable state. A shared nothing
process (or Actor, if you will) model is the only one that makes sense
if you really want to scale because it's the only one that allows you
to distribute over machines. The fact that it also scales very well
over multiple cores (better than threads, in many cases) is just
gravy.
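
To make the shared-nothing point concrete, here is a minimal sketch on a
single box. It assumes the standard library's multiprocessing module (which
only arrived in Python 2.6, after this thread), and word_count and
total_words are made-up names for illustration. Each worker is a separate
process with its own interpreter and its own GIL; inputs are copied in and
results copied out, so nothing is shared or mutated across workers.

```python
from multiprocessing import Pool


def word_count(chunk):
    """Pure function: takes its input by value and returns a result.

    It reads and writes no state outside its own arguments, so it can
    run in any process (or, in principle, on any machine).
    """
    return len(chunk.split())


def total_words(chunks, workers=2):
    # Each chunk is pickled and sent to a worker process; only the
    # integer results come back. No memory is shared between workers.
    with Pool(processes=workers) as pool:
        return sum(pool.map(word_count, chunks))


if __name__ == "__main__":
    chunks = [
        "the GIL is not the bottleneck",
        "shared nothing scales",
        "across machines too",
    ]
    print(total_words(chunks))
```

Because word_count only ever sees a copy of its input, the same function
could just as well sit behind a job queue on another machine, which is the
property threads with shared mutable state can't give you.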

The only hard example I've seen given of the GIL actually limiting
scalability is on single-server, high-volume Django sites, and I don't
think that the architecture of those sites is very scalable anyway.