2.6, 3.0, and truly independent interpreters

Adam Olsen rhamph at gmail.com
Fri Oct 24 20:59:52 EDT 2008


On Fri, Oct 24, 2008 at 4:48 PM, Glenn Linderman <v+python at g.nevcal.com> wrote:
> On approximately 10/24/2008 2:15 PM, came the following characters from the
> keyboard of Rhamphoryncus:
>>
>> On Oct 24, 2:59 pm, Glenn Linderman <gl... at nevcal.com> wrote:
>>
>>>
>>> On approximately 10/24/2008 1:09 PM, came the following characters from
>>> the keyboard of Rhamphoryncus:
>>>
>>>>
>>>> PyE: objects are reclassified as shareable or non-shareable, many
>>>> types are now only allowed to be shareable.  A module and its classes
>>>> become shareable with the use of a __future__ import, and their
>>>> shareddict uses a read-write lock for scalability.  Most other
>>>> shareable objects are immutable.  Each thread is run in its own
>>>> private monitor, and thus protected from the normal threading memory
>>>> model nasties.  Alas, this gives you all the semantics, but you still
>>>> need scalable garbage collection.. and CPython's refcounting needs the
>>>> GIL.
>>>>
>>>
>>> Hmm.  So I think your PyE is an attempt to be more explicit about
>>> what I said above about PyC: PyC threads do not share data between
>>> threads except by explicit interfaces.  I consider your
>>> definitions of shared data types somewhat orthogonal to the types of
>>> threads, in that both PyA and PyC threads could use these new shared
>>> data items.
>>>
>>
>> Unlike PyC, there's a *lot* shared by default (classes, modules,
>> functions), but it requires only minimal recoding.  It's as close to
>> "have your cake and eat it too" as you're gonna get.
>>
>
> Yes, but I like my cake frosted with performance; Guido's non-acceptance of
> granular locks in the blog entry someone referenced was due to the slowdown
> incurred with granular locking and shared objects.  Your PyE model, with
> highly granular sharing, will likely suffer the same fate.

No, my approach provides scalable performance.  Typical paths involve
*no* contention (i.e. no locking).  Classes and modules use shareddict,
which is based on a read-write lock built into the interpreter, so it's
uncontended for read-only usage patterns.  Pretty much everything else
is immutable.
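
To make shareddict concrete, here's a rough pure-Python sketch.  The
real thing is built into the interpreter; this RWLock and SharedDict
are stand-ins I'm inventing purely for illustration.  The point is
that readers never block each other:

    import threading

    class RWLock(object):
        # Reader-preferring read-write lock: any number of readers may
        # hold it at once; a writer waits until no readers remain.
        def __init__(self):
            self._cond = threading.Condition()
            self._readers = 0

        def acquire_read(self):
            with self._cond:
                self._readers += 1

        def release_read(self):
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

        def acquire_write(self):
            self._cond.acquire()
            while self._readers:
                self._cond.wait()

        def release_write(self):
            self._cond.release()

    class SharedDict(object):
        # Reads take only the read side, so read-mostly workloads (the
        # common case for module and class dicts) don't contend.
        def __init__(self):
            self._lock = RWLock()
            self._data = {}

        def __getitem__(self, key):
            self._lock.acquire_read()
            try:
                return self._data[key]
            finally:
                self._lock.release_read()

        def __setitem__(self, key, value):
            self._lock.acquire_write()
            try:
                self._data[key] = value
            finally:
                self._lock.release_write()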

Of course that doesn't include the cost of garbage collection.
CPython's refcounting can't scale: every incref/decref would have to
be atomic or locked, and those happen on virtually every object
access.


> The independent threads model, with only slight locking for a few explicitly
> shared objects, has a much better chance of getting better performance
> overall.  With one thread running, it would be the same as today; with
> multiple threads, it should scale at the same rate as the system... minus
> any locking done at the higher level.

So use processes with a little IPC for these expensive-yet-"shared"
objects.  multiprocessing does it already.
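
For example, with nothing beyond the stdlib (multiprocessing is new
in 2.6), something along these lines works today.  Note that the
"shared" dict is really a proxy object: every access is an IPC
round-trip to a manager process that owns the real dict.

    from multiprocessing import Manager, Process

    def worker(shared, key):
        # Proxied write: travels over IPC to the manager process.
        shared[key] = key * key

    if __name__ == '__main__':
        manager = Manager()
        shared = manager.dict()
        procs = [Process(target=worker, args=(shared, i))
                 for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared))    # e.g. {0: 0, 1: 1, 2: 4, 3: 9}
        manager.shutdown()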


>>> I think/hope that you meant that "many types are now only allowed to be
>>> non-shareable"?  At least, I think that should be the default; they
>>> should be within the context of a single, independent interpreter
>>> instance, so other interpreters don't even know they exist, much less
>>> how to share them.  If so, then I understand most of the rest of your
>>> paragraph, and it could be a way of providing shared objects, perhaps.
>>>
>>
>> There aren't multiple interpreters under my model.  You only need
>> one.  Instead, you create a monitor, and run a thread on it.  A list
>> is not shareable, so it can only be used within the monitor it's
>> created within, but the list type object is shareable.
>>
>
> The Python interpreter code should be shareable, having been written in C,
> and being/becoming reentrant.  So in that sense, there is only one
> interpreter.  Similarly, any other reentrant C extensions would be that way.
>  On the other hand, each thread of execution requires its own interpreter
> context, so that would have to be independent for the threads to be
> independent.  It is the combination of code+context that I call an
> interpreter, and there would be one per thread for PyC threads.  Bytecode
> for loaded modules could potentially be shared, if it is also immutable.
>  However, that could be in my mental "phase 2", as it would require an extra
> level of complexity in the interpreter as it creates shared bytecode...
> there would be a memory savings from avoiding multiple copies of shared
> bytecode, likely, and maybe also a compilation performance savings.  So it
> sounds like a win, but it is a win that can be deferred for initial simplicity,
> to prove the concept is or is not workable.
>
> A monitor allows a single thread to run at a time; that is the same
> situation as the present GIL.  I guess I don't fully understand your model.

To use your terminology, each monitor is a context.  Each thread
operates in a different monitor.  As you say, most C functions are
already thread-safe (reentrant).  All I need to do is avoid letting
multiple threads modify a single mutable object (such as a list) at a
time, which I do by containing it within a single monitor (context).
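
If it helps, here's a rough analogy built from today's threading
primitives rather than my actual implementation: the mutable list is
confined to one thread (its monitor), and other threads can only
reach it by sending messages.

    import threading
    try:
        import queue             # Python 3.0
    except ImportError:
        import Queue as queue    # Python 2.6

    def monitor_thread(mailbox):
        # The list never escapes this thread, so nothing else can
        # mutate it and the list itself needs no locking.
        confined = []
        while True:
            msg = mailbox.get()
            if msg is None:                # shutdown sentinel
                break
            op, arg, reply = msg
            if op == 'append':
                confined.append(arg)
                reply.put(None)
            elif op == 'snapshot':
                reply.put(list(confined))  # hand out a copy, not the list

    mailbox = queue.Queue()
    t = threading.Thread(target=monitor_thread, args=(mailbox,))
    t.start()

    # Another thread "enters" the monitor only by message, never by
    # touching the list directly.
    reply = queue.Queue()
    mailbox.put(('append', 42, reply))
    reply.get()
    mailbox.put(('snapshot', None, reply))
    print(reply.get())                     # [42]
    mailbox.put(None)
    t.join()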


-- 
Adam Olsen, aka Rhamphoryncus


