Parallelization on multi-CPU hardware?

Mon Oct 11 16:25:23 EDT 2004

aahz at pythoncraft.com (Aahz) wrote in message news:<ck3ls6$62e$1 at panix1.panix.com>...
> 
> One critical reason for the GIL is to support CPython's ability to call
> random C libraries with little effort.  Too many C libraries are not
> thread-safe, let alone thread-hot.  

Very true and I think that's one of the main reasons we still have the
GIL.

> Forcing libraries that wish to
> participate in threading to use Python's GIL-release mechanism is the
> only safe approach.

The ONLY one?  That's too strong a statement unless you qualify it. 
One problem is that CPython tries to make C library development as
easy as possible to the point of being condescending.  IOW, it seems
that CPython assumes that most C library writers are too stupid to
write thread-safe code.  However, one could say that if it had always
forced all C libraries to be written thread-safe from scratch, most
library developers would've practiced a lot by now.

Given enough time, familiarity is a good tool to tame apparently
complex issues.

Unfortunately, it's not that simple or maybe not even true.  I suspect
that the real reason C libraries are not required to be thread-safe is
that there are already too many C libraries out there that are NOT
thread-safe and too many people think (arguably mistakenly) that fixing
all those libraries to get rid of the GIL is not worth the trouble.

I speculate that the GIL "problem" might have grown to its current
state because CPython grew in parallel with thread programming. 
CPython existed long before POSIX 1003.1c was ratified or even before we
had good pthread libraries.  In addition, CPython always wanted to be
OS agnostic.  In 1995, it would've been possible to write a
CPosixPython with full support for the Posix thread library with
support for all those cool things like CPU affinity, options for
kernel or user threads, thread cancellation, thread local storage,
etc. but it would not have worked on any OS that lacked a good,
conformant pthreads library, and at that time there were many such OSes.

Therefore, fixing the GIL was always in competition with existing code
and libraries, and it always lost because it was (and still is)
considered "not worth the effort."

If we were going to write a CPython interpreter today, it would be a
lot easier to write the interpreter loop and its data in a manner that
would maximize the use of threads because today thread support is
widespread among supported OSes.

I'm a kernel developer and experienced pthreads programmer, and the
first time I saw CPython's main interpreter loop in ceval.c my jaw hit
the floor: I couldn't understand why anybody would write a threaded
interpreter that grabs ONE BIG mutex, runs 100 op codes (checkinterval
default), and then releases the mutex.  It took a while to finally
understand that the reason is not technical but mostly historical
(IMHO).
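
To make the pattern concrete, here is a toy model of that loop in
Python.  It is only a sketch: the real loop is C code in ceval.c, and
the names big_lock, CHECK_INTERVAL, run_thread and demo are made up
for illustration.  Each thread takes the single lock, runs at most 100
"op codes", and then lets go so another thread can take its turn.

```python
import threading

CHECK_INTERVAL = 100          # op codes per lock hold (the default cited above)
big_lock = threading.Lock()   # hypothetical stand-in for the GIL


def run_thread(opcodes, trace, name):
    """Toy interpreter loop: hold the one big lock for a batch of op codes."""
    pc = 0
    while pc < len(opcodes):
        with big_lock:                       # grab the ONE BIG mutex
            end = min(pc + CHECK_INTERVAL, len(opcodes))
            for op in opcodes[pc:end]:       # run up to CHECK_INTERVAL op codes
                op()
            trace.append((name, end - pc))   # record batch size while locked
            pc = end
        # lock released here, so another thread may run its batch


def demo(n_threads=2, n_opcodes=250):
    """Run several toy threads over a shared counter; return (count, trace)."""
    counter = [0]

    def incr():
        counter[0] += 1                      # only ever runs under big_lock

    trace = []
    threads = [
        threading.Thread(target=run_thread,
                         args=([incr] * n_opcodes, trace, "t%d" % i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0], trace
```

Calling demo() runs two toy threads of 250 op codes each.  However
many CPUs the machine has, only one of them executes op codes at any
moment, which is the serialization described above.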

I've said it before.  One day enough people will think that the GIL is
a problem big enough to warrant a solution, e.g., when the majority of
systems where CPython runs have more than one CPU.  Until then we have
to go back to early 90s programming and use IPC (interprocess
communication) to scale applications that want to run PURE python code
on more than one CPU.  That's probably the main disagreement I have
with those who think that the GIL is not a big problem: IPC is not a
solution but a workaround.
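
For reference, the IPC workaround looks roughly like the sketch below.
It assumes a Python with the subprocess module and its text=True
option; WORKER and parallel_sum_of_squares are hypothetical names
chosen for this example.  Each child process is a separate interpreter
with its own GIL, and the partial results travel back over pipes.

```python
import subprocess
import sys

# Source for the child interpreter: sum the squares in [lo, hi) and print it.
WORKER = (
    "import sys; lo, hi = int(sys.argv[1]), int(sys.argv[2]); "
    "print(sum(i * i for i in range(lo, hi)))"
)


def parallel_sum_of_squares(n, workers=2):
    """Split range(n) into chunks and compute each in its own process."""
    step = max(1, (n + workers - 1) // workers)   # chunk size, never zero
    procs = []
    for lo in range(0, n, step):
        hi = min(lo + step, n)
        procs.append(subprocess.Popen(
            [sys.executable, "-c", WORKER, str(lo), str(hi)],
            stdout=subprocess.PIPE, text=True))

    total = 0
    for p in procs:
        out, _ = p.communicate()                  # the IPC step: read the pipe
        if p.returncode != 0:
            raise RuntimeError("worker process failed")
        total += int(out)
    return total
```

Note how much plumbing is needed just to add up numbers: arguments are
serialized to strings, results are parsed back, and nothing is shared
between the workers.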

This is not a problem unique to Python either.  Most major UNIX OSes,
e.g., HPUX, Sun, and AIX, went through the same thing when adding
support for SMP in the early to mid 90s.  And that was a bigger
problem than CPython's GIL because they had a lot of drivers that were
not SMP safe.

I know the problem is complex and there are other non-technical issues
to consider.  However, and I don't mean to oversimplify, here are a
few ideas that might help with the GIL problem:

  - Detect multiple CPUs; on a single CPU, do not lock globals.  That's
what some OSes do to avoid unnecessary SMP penalties on single CPU
systems.

  - Create python objects in thread local storage by default, which
don't need locking.

  - Rewrite the interpreter loop so that it doesn't grab a BIG lock,
unless configured to do so.

  - Let users lock, not the interpreter.  If you use threads and you
access global objects without locking, you might get undefined
behaviors, just like in C/C++.

  - Allow legacy behavior with a command line option to enable the GIL
and create global objects (or vice versa).

  - Require C libraries to tell the interpreter if they are thread
safe or not.  If they are not, the interpreter would not load them
unless running in legacy-mode.

  - Augment the posix module to include full support for the pthreads
library, e.g., thread cancellation, CPU affinity.
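
As a rough illustration of the thread-local and "let users lock" ideas
in the list above, here is a sketch using the standard threading
module.  It only shows the programming model; today's CPython still
takes the GIL underneath, and the names local, shared, shared_lock,
worker and run are invented for this example.

```python
import threading

local = threading.local()        # per-thread namespace: no locking needed
shared = {"total": 0}            # global object: callers must lock it
shared_lock = threading.Lock()   # user-managed lock, not an interpreter-wide one


def worker(n):
    """Count privately in thread-local storage, then publish under a lock."""
    local.count = 0              # visible only to the current thread
    for _ in range(n):
        local.count += 1         # no lock: nobody else can see this object
    with shared_lock:            # explicit user-level locking for shared state
        shared["total"] += local.count


def run(n_threads=4, n=10000):
    """Start n_threads workers and return the combined total."""
    shared["total"] = 0
    threads = [threading.Thread(target=worker, args=(n,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["total"]
```

The hot loop touches only thread-local data, so the one lock is taken
once per thread rather than once per operation.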

Of course, that's easier said than done and I'm not saying that it can
or should be done now.  The point is that getting rid of the GIL is
straightforward; it's been done many times before, but it will not
happen until it's widely viewed as a problem.

Unfortunately, those changes are big enough that I don't think they'll
happen under the CPython source tree even if we all wanted them.  More
than likely it will require a separate project, PosixPython perhaps?

I hope one day I'll work for an employer that could afford to donate
some (or all) of my time for Python development so I could start working
on that.  Now, that would be cool.

With a <sigh> and a <wink>,

--
Luis P Caamano