Parallelization on multi-CPU hardware?

Bryan Olson fakeaddress at nowhere.org
Wed Oct 6 00:04:56 EDT 2004


Andreas Kostyrka wrote:
 > Aahz wrote:
 >>Alan Kennedy wrote:
 >>
 >>>I agree that it could potentially be a serious hindrance for cpython if
 >>>"multiple core" CPUs become commonplace. This is in contrast to jython
 >>>and ironpython, both of which support multiple-cpu parallelism.
 >>>
 >>>Although I completely accept the usual arguments offered in defense of
 >>>the GIL, i.e. that it isn't a problem in the great majority of use
 >>>cases, I think that position will become more difficult to defend as
 >>>desktop CPUs sprout more and more execution pipelines.

I have to agree with Alan.  The major chip vendors are telling
us that the future is multi-processor.

 >>Perhaps.  Then again, those pipelines will probably have their work cut
 >>out running firewalls, spam filters, voice recognition, and so on.  I
 >>doubt the GIL will make much difference, still.

I don't know of any quantitative demonstration either way, but
demands on computers tend to be bursty.  Frequently, a single
process has a lot of tasks that suddenly demand computation,
while all other processes are quiet.  A GIL is a killer.

 > Actually I doubt that losing the GIL would make that much performance
 > difference even on a real (not HT) 4-way box ->
 > What the GIL buys us is no lock contention and no locking overhead
 > with Python.
 >
 > And before somebody cries out, just think how the following trivial
 > statement would happen.
 >
 > a = b + c
 > Lock a
 > Lock b
 > Lock c
 > a = b + c
 > Unlock c
 > Unlock b
 > Unlock a

If that happens frequently, the program has blown it rather
badly.
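For the curious, here is roughly what that per-object locking would
look like if spelled out in Python itself; `LockedBox` and
`locked_add` are invented names, purely for illustration:

```python
import threading

class LockedBox:
    """A value paired with its own lock, mimicking a hypothetical
    per-object lock in the interpreter."""
    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()

def locked_add(a, b, c):
    """a.value = b.value + c.value, taking all three locks.

    Locks are acquired in a fixed global order (here, by id) --
    without such an order, two threads locking the same objects
    in different sequences can deadlock."""
    for box in sorted((a, b, c), key=id):
        box.lock.acquire()
    try:
        a.value = b.value + c.value
    finally:
        for box in (a, b, c):
            box.lock.release()

a, b, c = LockedBox(0), LockedBox(2), LockedBox(3)
locked_add(a, b, c)
print(a.value)  # 5
```

Three lock round-trips for one addition shows why the many-small-locks
design would slow down the common case.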

 > So basically you either get a really huge number of locks (one per
 > object) with enough potential for conflicts, deadlocks and all the other
 > stuff to really slow down the ceval.
 >
 > One could use less granularity, and lock say the class of the object
 > involved, but that wouldn't help that much either.

There's a significant point there in that threads do well with
coarse granularity, and badly with finely parallel algorithms.
But if we look at how sensible multi-threaded programs work, we
see that the threads share little writable data.  Thread
contention is surprisingly rare, outside of identifiable hot
spots around shared data structures.
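That pattern is easy to sketch in Python: each thread does its real
work on private data, and takes the one shared lock only for a brief
merge step:

```python
import threading

results = []                     # the one shared, writable structure
results_lock = threading.Lock()  # the identifiable hot spot

def worker(chunk):
    # All the real work is on private data -- no lock needed.
    local = sum(x * x for x in chunk)
    # Only this short merge touches shared state.
    with results_lock:
        results.append(local)

threads = [threading.Thread(target=worker, args=(range(i, i + 100),))
           for i in range(0, 400, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))  # 21253400
```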

 > So basically the GIL is a design decision that makes sense, perhaps it
 > shouldn't be just called the GIL, call it the "very large locking
 > granularity design decision".
 >
 > And before somebody points out that other languages can use locks too.
 > Well, other languages usually have a much lower-level execution model than
 > Python. And they usually force the developer to deal with
 > synchronization primitives. Python OTOH has always had the "no segmentation
 > fault" policy -> so locking would have to be "safe" as in "not producing
 > segfaults". That's not trivial to implement; for example, reference
 > counting isn't trivially implementable without locking (at least
 > portably).

It's not trivial to implement, but it is implementable, and is
so valuable that languages that do it are likely to antiquate
those that do not.  For example, Java strictly specifies what
the programmer can assume and what he cannot.  Updates to 32-bit
integers are always atomic; updates to 64-bit integers are not.
The Java VM (with the Java libraries) does not crash because of
application-program threading mistakes; or at least if it does
everyone recognizes it as a major bug.  As presently
implemented, Java both allows more thread-concurrency than
Python, and is more thread-safe.
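Python's version of that guarantee is weaker but real: racing threads
can lose updates, yet they never crash the interpreter.  A small
sketch (the iteration counts are arbitrary):

```python
import threading

counter = 0

def bump(n):
    global counter
    for _ in range(n):
        # Read-modify-write: several bytecodes, so even under the
        # GIL a context switch can land in the middle and an update
        # can be lost.
        counter += 1

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The process always finishes cleanly; the count may simply be low.
print(0 < counter <= 200_000)  # True
```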

 > So, IMHO, there are basically the following design decisions:
 > GIL: large granularity
 > MSL: (many small locks) would slow down the overall execution of Python
 >      programs.
 > MSLu: (many small locks, unsafe) unacceptable because it would change
 >       Python experience ;)

One-grain-at-a-time is quickly becoming unacceptable regardless
of the granularity.


-- 
--Bryan


