2.6, 3.0, and truly independent interpreters

sturlamolden sturlamolden at yahoo.no
Thu Nov 6 08:25:14 EST 2008


On Nov 5, 8:44 pm, "Andy O'Meara" <and... at gmail.com> wrote:

> In a few earlier posts, I went into details what's meant there:
>
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/...
> http://groups.google.com/group/comp.lang.python/msg/edae2840ab432344
> http://groups.google.com/group/comp.lang.python/msg/5be213c31519217b
>

All this says is:

1. The cost of serialization and deserialization is too large.
2. Complex data structures cannot be placed in shared memory.

The first claim is unsubstantiated. It depends on how much and what
you serialize. If you use something like NumPy arrays, the cost of
pickling is tiny. Erlang is a language specifically designed for
concurrent programming, yet it does not allow anything to be shared.
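The point about cheap serialization can be sketched with the standard
library alone; here array.array stands in for a NumPy array (both pickle
their raw contiguous buffer, so the cost is roughly a memcpy):

```python
import array
import pickle
import time

# A contiguous numeric buffer -- a stand-in for a NumPy array, which
# pickles similarly cheaply because only the raw buffer is serialized.
data = array.array('d', range(1_000_000))  # ~8 MB of doubles

t0 = time.perf_counter()
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)
elapsed = time.perf_counter() - t0

print(f"round-trip of {len(blob)} bytes took {elapsed * 1000:.1f} ms")
```

On ordinary hardware the round trip takes on the order of milliseconds,
which is tiny next to any real computation over that much data.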

The second claim is plain wrong. You can put anything you want in
shared memory. The mapping address of the shared memory segment may
vary between processes, but that can be dealt with: store integer
offsets relative to the segment's base address instead of raw
pointers. Pyro is a Python
project that has investigated this. With Pyro you can put any Python
object in a shared memory region. You can also use NumPy record arrays
to put very complex data structures in shared memory.
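The offsets-instead-of-pointers idea can be shown with a toy example.
Here an ordinary bytearray stands in for a real shared memory segment;
the linked structure inside it never stores an absolute address, so the
buffer works no matter where each process maps it:

```python
import struct

# A toy singly linked list of ints stored in one flat buffer.
# Each node is 8 bytes: (value: int32, next_offset: int32), where
# next_offset is a byte offset into the buffer and -1 means "end".
# Nodes reference each other by offset, not by pointer, so the whole
# buffer is position-independent.
NODE = struct.Struct('ii')

def build(values):
    buf = bytearray(NODE.size * len(values))
    for i, v in enumerate(values):
        nxt = NODE.size * (i + 1) if i + 1 < len(values) else -1
        NODE.pack_into(buf, NODE.size * i, v, nxt)
    return buf

def walk(buf, start=0):
    out, off = [], start
    while off != -1:
        v, off = NODE.unpack_from(buf, off)
        out.append(v)
    return out

print(walk(build([10, 20, 30])))  # prints [10, 20, 30]
```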

What do you gain by placing multiple interpreters in the same process?
You will avoid the complication that the mapping address of the shared
memory region may be different. But this is a problem that has been
worked out and solved. Instead you get a lot of issues dealing with
DLL loading and unloading of Python extension modules.

The multiprocessing module has something called proxy objects, which
also address this issue. An object is hosted in a server process,
and client processes may access it through synchronized IPC calls.
Inside the client process the remote object looks like any other
Python object. The synchronized IPC is hidden away in an abstraction
layer. On Windows, you can also construct out-of-process ("outproc")
ActiveX objects, which are not that different from multiprocessing's
proxy objects.
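A minimal sketch of a proxy object: the real dict below lives in the
manager's server process (a separate process from the caller), and the
`d` we touch is a proxy whose every access is a synchronized IPC call.
Child processes handed the proxy would use it the same way:

```python
from multiprocessing import Manager

def run_demo():
    with Manager() as mgr:
        d = mgr.dict()           # proxy; the real dict lives in the
                                 # manager's server process
        d['answer'] = 42         # each access is a synchronized IPC call
        d['nested'] = [1, 2, 3]  # values are pickled across the boundary
        return dict(d)           # copy the contents back locally

if __name__ == '__main__':
    print(run_demo())  # {'answer': 42, 'nested': [1, 2, 3]}
```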

If you need to place a complex object in shared memory:

1. Check if a NumPy record array may suffice (dtypes may be nested).
It will if you don't have dynamically allocated pointers inside the
data structure.

2. Consider using multiprocessing's proxy objects or outproc ActiveX
objects.

3. Go to http://pyro.sourceforge.net, download the code and read the
documentation.
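For option 1, here is a small sketch of a nested record dtype (this
assumes NumPy is installed; the field names are made up for
illustration). Everything lives in one contiguous, pointer-free buffer,
which is what makes it shared-memory friendly:

```python
import numpy as np

# A "struct of structs": each record holds a 2-D position, a 2-D
# velocity, and an integer tag, packed into one contiguous buffer
# with no dynamically allocated pointers inside it.
point = np.dtype([('x', 'f8'), ('y', 'f8')])
particle = np.dtype([('pos', point), ('vel', point), ('tag', 'i4')])

a = np.zeros(3, dtype=particle)
a['pos']['x'] = [0.0, 1.0, 2.0]  # field access returns a view
a['tag'] = [7, 8, 9]

print(a.itemsize, a['pos']['x'][2], a['tag'][0])
```

Because the layout is fixed and packed (4 doubles + 1 int32 = 36 bytes
per record), the raw buffer can be dropped into a shared memory segment
as-is.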

Saying that "it can't be done" is silly before you have tried.
Programmers are not that good at guessing where the bottlenecks
reside, even if we think we are.
