2.6, 3.0, and truly independent interpreters

Jesse Noller jnoller at gmail.com
Thu Oct 30 13:00:06 EDT 2008


On Thu, Oct 30, 2008 at 12:05 PM, Andy O'Meara <andy55 at gmail.com> wrote:
> On Oct 28, 6:11 pm, "Martin v. Löwis" <mar... at v.loewis.de> wrote:
>> > Because then we're back into the GIL not permitting threads efficient
>> > core use on CPU bound scripts running on other threads (when they
>> > otherwise could).
>>
>> Why do you think so? For C code that is carefully written, the GIL
>> allows *very well* to write CPU bound scripts running on other threads.
>> (please do get back to Jesse's original remark in case you have lost
>> the thread :-)
>>
>
> I don't follow you there.  If you're referring to multiprocessing, our
> concerns are:
>
> - Maturity (am I willing to tell my partners and employees that I'm
> betting our future on a brand-new module that imposes significant
> restrictions as to how our app operates?)
> - Liability (am I ready to invest our resources into lots of new
> python module-specific code to find out that a platform that we want
> to target isn't supported or has problems?).  Like it or not, we're a
> company and we have to show sensitivity about new or fringe packages
> that make our codebase less agile -- C/C++ continues to win the day in
> that department.
> - Shared memory -- for the reasons listed in my other posts, IPC or a
> shared/mapped memory region doesn't work for our situation (and I
> venture to say, for many real world situations otherwise you'd see end-
> user/common apps use forking more often than threading).
>

FWIW (and again, I am not saying MP is good for your problem domain),
multiprocessing works quite well on Windows, OS X, Linux and Solaris.
The only platforms it has problems on right now are *BSD and AIX. It has
plenty of tests (I want more, more, more) and sees a decent amount of
usage, if my mailbox and bug list are any indication.

Multiprocessing is not *new* - it's a branch of the pyprocessing package.

Multiprocessing is written in C, so as for the "less agile" point - I don't
see how it's any less agile than what you've talked about. If you
wanted true platform insensitivity, then Java is a better bet :) As
for your final point:

> - Shared memory -- for the reasons listed in my other posts, IPC or a
> shared/mapped memory region doesn't work for our situation (and I
> venture to say, for many real world situations otherwise you'd see end-
> user/common apps use forking more often than threading).
>

I philosophically disagree with you here. PThreads and shared memory,
as they are today, are largely based on Java's influence on the world. I
would argue that the reason most people use threads as opposed to
processes is simply "ease of use and entry" (which is ironic, given how
many problems threads cause). It is not because they *need* the shared
memory aspects of threads, or because they could not decompose the problem
into Actors/message passing, but because threads:

A> are there (e.g. in Java, Python, etc.)
B> allow you to "share anything" (which allows you to take horrible shortcuts)
C> are what everyone "knows" at this point.
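
To make the Actors/message-passing alternative concrete, here is a
minimal sketch, not taken from any real application. The names
word_counter and STOP are hypothetical. The worker keeps its state
private and communicates only through two queues, so nothing is shared
implicitly between it and the parent:

```python
import multiprocessing

STOP = None  # hypothetical sentinel message telling the actor to shut down


def word_counter(inbox, outbox):
    """A tiny actor: owns its state and talks only through queues."""
    total = 0  # private state; no other worker can touch it
    while True:
        message = inbox.get()
        if message is STOP:
            break
        total += len(message.split())
    # Report the final count back to whoever is listening.
    outbox.put(total)


if __name__ == "__main__":
    inbox = multiprocessing.Queue()
    outbox = multiprocessing.Queue()
    actor = multiprocessing.Process(target=word_counter, args=(inbox, outbox))
    actor.start()
    for line in ["threads share everything", "actors share nothing"]:
        inbox.put(line)
    inbox.put(STOP)
    print(outbox.get())
    actor.join()
```

The same word_counter function would also run unchanged in a
threading.Thread; the point is only that the decomposition does not
depend on shared memory.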

Even luminaries such as Brian Goetz and many, many others have pointed
out that threading, as it exists today, is fundamentally difficult to
get right. Ergo the "renaissance" (read: echo chamber) of
Erlang-style concurrency.

For many "real world" applications - threading is just "simple". This
is why Multiprocessing exists at all - to attempt to make forking/IPC
as "simple" as the API to threading. It's not foolproof, but the goal
was to open the door to multiple cores with a familiar API:

Quoting PEP 371:

"The pyprocessing package offers a method to side-step the GIL
    allowing applications within CPython to take advantage of
    multi-core architectures without asking users to completely change
    their programming paradigm (i.e.: dropping threaded programming
    for another "concurrent" approach - Twisted, Actors, etc).

    The Processing package offers CPython a "known API" which mirrors
    albeit in a PEP 8 compliant manner, that of the threading API,
    with known semantics and easy scalability."
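
As a rough illustration of what "mirrors ... the threading API" means in
practice, the sketch below runs the same worker function first in
threads and then in processes. The helper names square and run_workers
are made up for this example. It assumes only that both
threading.Thread and multiprocessing.Process accept target=/args= and
provide start() and join():

```python
import multiprocessing
import threading


def square(n, results):
    # Put an (input, output) pair on the queue handed to this worker.
    results.put((n, n * n))


def run_workers(worker_cls, numbers):
    """Start one worker per number and collect the results.

    worker_cls is either threading.Thread or multiprocessing.Process.
    """
    results = multiprocessing.Queue()
    workers = [worker_cls(target=square, args=(n, results)) for n in numbers]
    for w in workers:
        w.start()
    # Drain the queue before joining so a full pipe cannot block a child.
    collected = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return collected


if __name__ == "__main__":
    print(run_workers(threading.Thread, [1, 2, 3]))
    print(run_workers(multiprocessing.Process, [1, 2, 3]))
```

Only the class passed to run_workers changes between the two calls.
With processes the work is not subject to a single GIL, at the cost of
pickling whatever crosses the queue.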

I would argue that most of the people taking part in this discussion
are working on "real world" applications - sure, multiprocessing as it
exists today, right now - may not support your use case, but it was
evaluated to fit *many* use cases.

Most of the people here are working in pure Python, or they're using a
few extension modules here and there (in C). Again, when you say
threads and processes, most people here are going to think "import
threading", "fork()" or "import multiprocessing".

Please correct me if I am wrong in understanding what you want: You
are making threads in another language (not via the threading API) and
embedding Python in those threads, but you want to be able to share
objects/state between those threads and independent interpreters. You
want to be able to pass state from one interpreter to another via
shared memory (e.g. pointers/contexts/etc).

Example:

ParentAppFoo makes 10 threads (in C)
Each thread gets an itty bitty Python interpreter
ParentAppFoo gets an object (video) to render
Rather than marshal that object, you pass a pointer to the object to
the children
You want to pass that pointer to an existing, or newly created, itty
bitty Python interpreter for mangling
Itty bitty Python interpreter passes the object back to a C module via
a pointer/context

If the above is wrong, I think outlining it in the above form
may help people conceptualize it - I really don't think you're talking
about Python-level processes or threads.

-jesse


