problem in implementing multiprocessing

Aaron Brady castironpi at gmail.com
Mon Jan 19 06:52:47 EST 2009


On Jan 19, 3:09 am, Carl Banks <pavlovevide... at gmail.com> wrote:
snip
> Since multiprocessing serializes and deserializes the data while
> passing it from process to process, passing very large objects would
> have a very high latency and overhead.  IOW, gopal's diagnosis is
> correct.  It's just not practical to share very large objects among
> separate processes.

You could pass a composite object back and forth by passing its pieces
back and forth.  You'd have to structure it so that no single piece
needs access to the entire data structure; each piece should only need
access to a few other small pieces.
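
Something like this toy sketch, say, where the "pieces" are just
(key, chunk) pairs fed through a Queue one at a time (the names and
the sum() work here are made up for illustration):

from multiprocessing import Process, Queue

def worker(inbox, outbox):
    # Pull one small piece at a time until the None sentinel arrives;
    # the worker never holds the whole structure in memory.
    for key, chunk in iter(inbox.get, None):
        outbox.put((key, sum(chunk)))

if __name__ == '__main__':
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    big = {'a': list(range(1000)), 'b': list(range(1000, 2000))}
    for item in big.items():
        inbox.put(item)    # each put pickles only one small piece
    inbox.put(None)        # sentinel: no more pieces
    results = dict(outbox.get() for _ in big)
    p.join()
    print(results)

Each send only serializes one chunk, so the per-message cost stays
small no matter how large the whole structure grows.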

> For simple data like large arrays of floating point numbers, the
> data can be shared with an mmapped file or some other memory-sharing
> scheme, but actual Python objects can't be shared this way.  If you
> have complex data (networks and hierarchies and such) it's a lot
> harder to share this information among processes.
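
True for the flat floating-point case -- the stdlib already covers
it.  multiprocessing.Array allocates a block of C doubles in shared
memory, so a child process can modify the data in place with no
pickling at all.  A minimal sketch:

from multiprocessing import Process, Array

def scale(shared, factor):
    # The child writes straight into the shared block of C doubles;
    # only the tiny 'factor' argument gets pickled.
    for i in range(len(shared)):
        shared[i] *= factor

if __name__ == '__main__':
    data = Array('d', [0.0, 1.0, 2.0, 3.0])    # 'd' = C double
    p = Process(target=scale, args=(data, 10.0))
    p.start()
    p.join()
    print(list(data))    # [0.0, 10.0, 20.0, 30.0]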

It wouldn't hurt to have a minimal set of Python types that are
'persistent live', that is, stored outside the process's own memory
in their native form.  The only problem is, they couldn't contain
references to volatile in-memory objects.  (I don't believe POSH
addresses this.)
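
As a rough sketch of what I mean -- the class name and file layout
here are invented for illustration -- a sequence of doubles that
lives in an mmapped file in native binary form, using only mmap and
struct from the stdlib:

import mmap
import os
import struct

class PersistentDoubles(object):
    # Toy 'persistent live' sequence: the values live in the mmapped
    # file, not on the Python heap, and are read and written in native
    # binary form.  It can only hold plain numbers -- no references
    # to volatile objects.
    ITEM = struct.Struct('d')

    def __init__(self, path, length):
        size = length * self.ITEM.size
        fd = os.open(path, os.O_RDWR | os.O_CREAT)
        os.ftruncate(fd, size)
        self._buf = mmap.mmap(fd, size)
        os.close(fd)    # the mapping stays valid after closing the fd
        self._length = length

    def __len__(self):
        return self._length

    def __getitem__(self, i):
        return self.ITEM.unpack_from(self._buf, i * self.ITEM.size)[0]

    def __setitem__(self, i, value):
        self.ITEM.pack_into(self._buf, i * self.ITEM.size, value)

Any process that maps the same file sees the same values, with no
serialization step in between.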


