2.6, 3.0, and truly independent interpreters

Walter Overby walter.overby at gmail.com
Thu Nov 6 18:22:46 EST 2008


On Nov 6, 2:03 pm, sturlamolden <sturlamol... at yahoo.no> wrote:
> On Nov 6, 6:05 pm, Walter Overby <walter.ove... at gmail.com> wrote:
>
> > I don't understand how this would help.  If these large data
> > structures reside only in one remote process, then the overhead of
> > proxying the data into another process for manipulation requires too
> > much IPC, or at least so Andy stipulates.
>
> Perhaps it will, or perhaps not. Reading or writing to a pipe has
> slightly more overhead than a memcpy. There are things that Python
> needs to do that are slower than the IPC. In this case, the real
> constraint would probably be contention for the object in the server,
> not the IPC. (And don't blame it on the GIL, because putting a lock
> around the object would not be any better.)

(I'm not blaming anything on the GIL.)

I read Andy to stipulate that the pipe needs to transmit "hundreds of
megs of data and/or thousands of data structure instances."  I doubt
he'd be happy with memcpy either.  My instinct is that contention for
a lock could be the quicker option.
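
Just to put rough numbers on the cost we're arguing about, here is a
quick sketch (not Andy's code, just a made-up 100 MB payload) of
pushing a big blob through a multiprocessing Pipe:

# Rough sketch, not Andy's code: time the cost of pushing a large
# payload through a multiprocessing Pipe to another process.
import time
from multiprocessing import Process, Pipe

def consumer(conn):
    data = conn.recv()        # blocks until the whole payload arrives
    conn.send(len(data))      # tiny acknowledgement back to the parent

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    payload = 'x' * (100 * 1024 * 1024)   # ~100 MB, made up for illustration

    p = Process(target=consumer, args=(child_conn,))
    p.start()

    start = time.time()
    parent_conn.send(payload)             # pickled and written to the pipe
    print('consumer saw %d bytes in %.2f s'
          % (parent_conn.recv(), time.time() - start))
    p.join()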

And don't forget, he says he's got an "opaque OS object."  He asked
the group to explain how to send that via IPC to another process.  I
surely don't know how.
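
As a toy stand-in for whatever his real object is, even a plain
thread lock refuses to be serialized for IPC:

# Toy illustration, standing in for Andy's real OS object: anything
# wrapping process-local OS state generally can't be pickled, so it
# can't be shipped through a Pipe or Queue to another process.
import pickle
import threading

lock = threading.Lock()       # wraps an OS-level lock handle
try:
    pickle.dumps(lock)
except TypeError as e:
    print('cannot pickle it: %s' % e)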

> > > 3. Go to http://pyro.sourceforge.net, download the code and read the
> > > documentation.
>
> > I don't see how this solves the problem with 2.
>
> It puts Python objects in shared memory. Shared memory is the fastest
> form of IPC there is. The overhead is basically zero. The only
> constraint will be contention for the object.

I don't think he has Python objects to work with.  I'm persuaded when
he says: "when you're talking about large, intricate data structures
(which include opaque OS object refs that use process-associated
allocators), even a shared memory region between the child process and
the parent can't do the job."

Why aren't you persuaded?
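
The shared memory support I'm aware of in 2.6 looks roughly like the
sketch below: fine for flat ctypes data, but a long way from "large,
intricate data structures" full of OS handles.

# Sketch of 2.6-style shared memory via multiprocessing: a flat
# ctypes array both processes can touch.  Raw numbers work fine;
# intricate object graphs holding OS refs do not.
from multiprocessing import Process, Array

def scale(shared, factor):
    for i in range(len(shared)):
        shared[i] *= factor           # in-place update, visible to the parent

if __name__ == '__main__':
    shared = Array('d', [1.0, 2.0, 3.0])    # three doubles in shared memory
    p = Process(target=scale, args=(shared, 10.0))
    p.start()
    p.join()
    print(list(shared))                     # [10.0, 20.0, 30.0]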

<snip>

> Yes, callbacks to Python are expensive. But is the problem the GIL?
> Instead of contention for the GIL, he seems to prefer contention for a
> complex object. Is that any better? It too has to be protected by a
> lock.

At a couple of points, Andy has expressed a preference for a "single
high level sync object" to synchronize access to the data, at least
as I read him.  What he doesn't want is the slowdown from the Python
callbacks also having to acquire the GIL.  That amounts to a second
lock on top of his own, and I think it's near the heart of his
concern.
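
Purely my guess at what he means by that, sketched with ordinary
threads and a made-up structure:

# My guess at a "single high level sync object": one coarse lock
# guarding the whole shared structure, taken once per batch of work
# rather than a lock (or the GIL) taken per callback.
import threading

shared_state = {'frames': []}        # made-up structure for illustration
state_lock = threading.Lock()        # the single high-level sync object

def worker(batch):
    results = [item * 2 for item in batch]   # heavy work, done without the lock
    with state_lock:                         # one acquisition per batch
        shared_state['frames'].extend(results)

threads = [threading.Thread(target=worker, args=(range(i, i + 100),))
           for i in range(0, 1000, 100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared_state['frames']))           # 1000

(The GIL is obviously still underneath that sketch today; as I read
Andy, his point is that with truly independent interpreters, that one
lock would be the only contention left.)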

> > If I understand them correctly, none of these concerns are silly.
>
> No they are not. But I think he underestimates what multiple processes
> can do. The objects in 'multiprocessing' are already a lot faster than
> their 'threading' and 'Queue' counterparts.

Andy has complimented 'multiprocessing' as a "huge huge step."  He
just offers a scenario where multiprocessing might not be the best
solution, and so far, I see no evidence he is wrong.  That's not
underestimation, in my estimation!

Walter.


