2.6, 3.0, and truly independent interpreters

Fri Oct 24 12:32:33 EDT 2008

On Fri, Oct 24, 2008 at 12:30 PM, Jesse Noller <jnoller at gmail.com> wrote:
> On Fri, Oct 24, 2008 at 10:40 AM, Andy O'Meara <andy55 at gmail.com> wrote:
>>> > 2) Barriers to "free threading".  As Jesse describes, this is simply
>>> > just the GIL being in place, but of course it's there for a reason.
>>> > It's there because (1) doesn't hold and there was never any specs/
>>> > guidance put forward about what should and shouldn't be done in multi-
>>> > threaded apps
>>>
>>> No, it's there because it's necessary for acceptable performance
>>> when multiple threads are running in one interpreter. Independent
>>> interpreters wouldn't mean the absence of a GIL; it would only
>>> mean each interpreter having its own GIL.
>>>
>>
>> I see what you're saying, but let's note that what you're talking
>> about at this point is an interpreter containing protection from the
>> client level violating (supposed) direction put forth in python
>> multithreaded guidelines.  Glenn Linderman's post really gets at
>> what's at hand here.  It's really important to consider that it's not
>> a given that python (or any framework) has to be designed against
>> hazardous use.  Again, I refer you to the diagrams and guidelines in
>> the QuickTime API:
>>
>> http://developer.apple.com/technotes/tn/tn2125.html
>>
>> They tell you point-blank what you can and can't do, and it's that
>> simple.  Their engineers can then simply create the implementation
>> around those specs and not weigh any of the implementation down with
>> sync mechanisms.  I'm in the camp that simplicity and convention wins
>> the day when it comes to an API.  It's safe to say that software
>> engineers expect and assume that a thread that doesn't have contact
>> with other threads (except for explicit, controlled message/object
>> passing) will run unhindered and safely, so I raise an eyebrow at the
>> GIL (or any internal "helper" sync stuff) holding up a thread's
>> performance when the app is designed to not need lower-level global
>> locks.
>>
>> Anyway, let's talk about solutions.  My company is looking to support
>> a python dev community endeavor that allows the following:
>>
>> - an app makes N worker threads (using the OS)
>>
>> - each worker thread makes its own interpreter, pops scripts off a
>> work queue, and manages exporting (and then importing) result data to
>> other parts of the app.  Generally, we're talking about CPU-bound work
>> here.
>>
>> - each interpreter has the essentials (e.g. math support, string
>> support, re support, and so on -- I realize this is open-ended, but
>> work with me here).
>>
>> Let's guesstimate about what kind of work we're talking about here and
>> if this is even in the realm of possibility.  If we find that it *is*
>> possible, let's figure out what level of work we're talking about.
>> From there, I can get serious about writing up a PEP/spec, paid
>> support, and so on.
>
> Point of order! Just for my own sanity if anything :) I think some
> minor clarifications are in order.
>
> What are "threads" within Python:
>
> Python has built-in support for POSIX lightweight threads. This is
> what most people are talking about when they see, hear and say
> "threads" - they mean POSIX pthreads
> (http://en.wikipedia.org/wiki/POSIX_Threads). This is not what you
> (Andy) seem to be asking for. Pthreads are attractive due to the fact
> that they exist within a single interpreter, can share memory all
> "willy nilly", etc.
>
> Python does, in fact, use OS-level pthreads when you request multiple threads.
>
> The Global Interpreter Lock is fundamentally designed to make the
> interpreter easier to maintain and safer: Developers do not need to
> worry about other code stepping on their namespace. This makes things
> thread-safe, inasmuch as having multiple pthreads within the same
> interpreter space modifying global state and variables at once is,
> well, bad. A C-level module, on the other hand, can sidestep/release
> the GIL at will, and go on its merry way and process away.
>
> POSIX Threads/pthreads/threads as we get from Java, allow unsafe
> programming styles. These programming styles are of the "shared
> everything deadlock lol" kind. The GIL *partially* protects against
> some of the pitfalls. You do not seem to be asking for pthreads :)
>
> http://www.python.org/doc/faq/library/#can-t-we-get-rid-of-the-global-interpreter-lock
> http://en.wikipedia.org/wiki/Multi-threading
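
A minimal sketch of the "partially protects" point above, assuming
CPython and the standard threading module. The function name
count_with_threads and its parameters are made up for illustration.
The GIL keeps the interpreter's own internals consistent, but an
update like value += 1 is a read-modify-write that is not guaranteed
to be atomic, so shared state still wants an explicit lock:

```python
import threading


def count_with_threads(n_threads=4, n_increments=10000):
    """Increment one shared counter from several threads in one interpreter."""
    counter = {"value": 0}      # shared state, visible to every thread
    lock = threading.Lock()     # explicit protection; the GIL alone is not enough

    def bump():
        for _ in range(n_increments):
            with lock:          # serialize the read-modify-write
                counter["value"] += 1

    threads = [threading.Thread(target=bump) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(count_with_threads())
```

With the lock in place the total is always n_threads * n_increments.
Drop the lock and the result is no longer something you can rely on,
GIL or no GIL.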
>
> However, then there are processes.
>
> The difference between threads and processes is that processes do
> *not share memory*, but they can share state via shared
> queues/pipes/message passing. What you seem to be asking for is the
> ability to completely fork independent Python interpreters, with their
> own namespaces, and coordinate work via a shared queue accessed with
> pipes or some other communications mechanism. Correct?
>
> Multiprocessing, as it exists within Python 2.6 today, actually forks
> (see trunk/Lib/multiprocessing/forking.py) a completely independent
> interpreter per process created and then constructs pipes to
> inter-communicate, and queues to do work coordination. I am not
> suggesting this is good for you - I'm trying to get to exactly what
> you're asking for.
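
For reference, a rough sketch of the process-per-worker model
described above, built on the multiprocessing module. The names
square, worker and run_jobs are hypothetical stand-ins, and square is
just a placeholder for real CPU-bound work. Asking for the "fork"
start method explicitly assumes a Unix-like platform, and get_context
needs Python 3.4 or later, so this is not literally 2.6 code:

```python
import multiprocessing as mp


def square(n):
    """Placeholder for a CPU-bound job popped off the work queue."""
    return n * n


def worker(jobs, results):
    # Runs in a child process: separate interpreter, separate GIL.
    for n in iter(jobs.get, None):      # None is the shutdown sentinel
        results.put((n, square(n)))


def run_jobs(numbers, n_workers=2):
    ctx = mp.get_context("fork")        # assumption: Unix-like platform
    jobs, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(jobs, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for n in numbers:
        jobs.put(n)
    for _ in procs:
        jobs.put(None)                  # one sentinel per worker
    # Drain the results before joining so no worker blocks on a full pipe.
    out = dict(results.get() for _ in numbers)
    for p in procs:
        p.join()
    return out


if __name__ == "__main__":
    print(run_jobs([1, 2, 3, 4]))
```

Nothing is shared between the workers except what goes through the
two queues, which is the trade being discussed: no shared-memory
surprises, at the cost of pickling everything that crosses a process
boundary.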
>
> Fundamentally, allowing total free-threading with Posix threads, using
> the same Java-Model for control is a recipe for pain - we're just
> repeating mistakes instead of solving a problem, ergo - Adam Olsen's
> work. Monitors, Actors, etc have all been discussed, proposed and are
> being worked on.
>
> So, just to clarify - Andy, do you want one interpreter, $N threads
> (e.g. PThreads) or the ability to fork multiple "heavyweight"
> processes?
>
> Other bits for reading:
> http://www.boddie.org.uk/python/pprocess.html (as an alternative to
> multiprocessing)
> http://smparkes.net/tag/dramatis/
> http://osl.cs.uiuc.edu/parley/
> http://candygram.sourceforge.net/
>

I almost forgot:

http://www.kamaelia.org/Home