2.6, 3.0, and truly independent interpreters

Tue Nov 4 09:38:18 EST 2008

On Nov 3, 7:11 pm, "Andy O'Meara" <and... at gmail.com> wrote:

> My hope was that the increasing interest and value associated with
> flexible, multi-core/"free-thread" support is at a point where there's
> a critical mass of CPython developer interest (as indicated by various
> serious projects specifically meant to offer this support).
> Unfortunately, based on the posts in this thread, it's becoming clear
> that the scale of code changes, design changes, and testing that are
> necessary in order to offer this support is just too large unless the
> entire community is committed to the cause.

I've been watching this debate from the sidelines.

First, let me say that there are several solutions to the "multicore"
problem. Multiple independent interpreters embedded in a process is
one possibility, but not the only one. Unwillingness to implement this
in CPython does not imply unwillingness to exploit the next generation
of processors.

One thing that should be done is to make sure the Python interpreter
and standard libraries release the GIL wherever they can.

The multiprocessing package has almost the same API as you would get
from your suggestion, the only difference being that multiple
processes are involved. This is, however, hidden from the user, and
(almost) hidden from the programmer.

Let's see what multiprocessing can do:

- Independent interpreters? Yes.
- Shared memory? Yes.
- Shared (proxy) objects? Yes.
- Synchronization objects (locks, etc.)? Yes.
- IPC? Yes.
- Queues? Yes.
- API different from threads? Not really.
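
To make the list above concrete, here is a minimal sketch that touches
several of those points at once: separate interpreters (Process),
shared memory (Value), a lock, and a queue for IPC. The names worker
and run_demo are made up for this illustration, and the shared counter
is assumed to fit in a C int.

```python
import multiprocessing as mp


def worker(numbers, total, results):
    """Sum the squares of `numbers` and publish the result two ways."""
    partial = sum(n * n for n in numbers)
    # Shared memory: `total` is visible to every process.
    # get_lock() returns the lock that multiprocessing bundles with it.
    with total.get_lock():
        total.value += partial
    # IPC: the queue pickles the tuple and ships it to the parent.
    results.put((mp.current_process().name, partial))


def run_demo(chunks):
    """Hypothetical helper: one process per chunk, results gathered here."""
    total = mp.Value("i", 0)  # shared C int, initial value 0
    results = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(chunk, total, results),
                   name="worker-%d" % i)
        for i, chunk in enumerate(chunks)
    ]
    for p in procs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    partials = dict(results.get() for _ in procs)
    for p in procs:
        p.join()
    return total.value, partials


if __name__ == "__main__":
    grand_total, partials = run_demo([range(0, 5), range(5, 10)])
    print(grand_total, partials)
```

Swap mp.Process for threading.Thread and mp.Queue for queue.Queue and
the structure stays essentially the same, which is the point of the
last item in the list.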

Here is one example of what the multiprocessing package can do,
written by yours truly:

http://scipy.org/Cookbook/KDTree

Multicore programming is also more than using more than one thread or
process. There is something called 'load balancing'. If you want to
make efficient use of more than one core, not only must the serial
algorithm be expressed as parallel, you must also take care to
distribute the work evenly. Further, one should avoid as much resource
contention as possible, and avoid races, deadlocks and livelocks.
Java's concurrent package has sophisticated load balancers like the
work-stealing scheduler in ForkJoin. Efficient multicore programming
needs other abstractions than the 'thread' object (cf. what cilk++ is
trying to do). It would certainly be possible to make Python do
something similar. And whether threads or processes are responsible for
the concurrency is not at all important. Today it is easiest to
achieve multicore concurrency on CPython using multiple processes.
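
As a rough illustration of load balancing with processes, the sketch
below hands out tasks of very uneven cost one at a time, so an idle
worker always picks up the next pending task. This is plain dynamic
scheduling from a shared task queue, not a work-stealing scheduler
like ForkJoin's. The functions collatz_steps and balanced_map are
invented for this example.

```python
import multiprocessing as mp


def collatz_steps(n):
    """Return (n, steps needed to reach 1); the cost varies a lot with n."""
    start, steps = n, 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return start, steps


def balanced_map(numbers, workers=4):
    """Hypothetical helper: map collatz_steps over `numbers` in a pool."""
    with mp.Pool(processes=workers) as pool:
        # chunksize=1 means each idle worker fetches a single task at a
        # time, so a few expensive inputs cannot pile up behind one worker.
        # imap_unordered yields results as soon as any worker finishes.
        return dict(pool.imap_unordered(collatz_steps, numbers, chunksize=1))


if __name__ == "__main__":
    print(balanced_map([1, 6, 7, 27], workers=2))
```

A larger chunksize cuts IPC overhead but gives coarser balancing, so
the right value depends on how uneven the tasks are.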

The most 'advanced' language for multicore programming today is
Erlang. It uses a 'share-nothing' message-passing strategy. Python can
do the same as Erlang using the Candygram package
(candygram.sourceforge.net). Changing the Candygram package to use
multiprocessing instead of Python threads is not a major undertaking.

The GIL is not evil, by the way. SBCL also has a lock that protects the
compiler. Ruby is getting a GIL.

So all it comes down to is this:

Why do you want multiple independent interpreters in a process, as
opposed to multiple processes?

Even if you did manage to embed multiple interpreters in a process, it
would not give the programmer any benefit over the multiprocessing
package. If you have multiple embedded interpreters, they cannot share
anything. They must communicate serialized objects or use proxy
objects. That is the same thing the multiprocessing package does.
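
For what it's worth, here is a small sketch of those two mechanisms as
multiprocessing offers them: a Pipe, which pickles whatever is sent,
and a Manager dict, which is a proxy object. The names child and
run_demo are invented for the example.

```python
import multiprocessing as mp


def child(conn, shared):
    request = conn.recv()         # an unpickled copy of the parent's dict
    request["seen_by"] = "child"  # changes only the child's copy
    conn.send(request)            # pickled again on the way back
    shared["answer"] = 42         # proxy call, forwarded to the manager
    conn.close()


def run_demo():
    """Hypothetical helper: send a dict through a pipe and use a proxy."""
    with mp.Manager() as manager:
        shared = manager.dict()   # lives in the manager's own process
        parent_conn, child_conn = mp.Pipe()
        proc = mp.Process(target=child, args=(child_conn, shared))
        proc.start()
        original = {"payload": [1, 2, 3]}
        parent_conn.send(original)
        reply = parent_conn.recv()
        proc.join()
        # copy() returns an ordinary dict we can keep after shutdown.
        return original, reply, shared.copy()


if __name__ == "__main__":
    print(run_demo())
```

The parent's original dict is untouched after the round trip, since
only a serialized copy crossed the pipe. Independent interpreters
inside one process would presumably need the same kind of copying or
proxying.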

So why do you want this particular solution?

S.M.