Interprocess communication and memory mapping

Paul Boddie paul at boddie.org.uk
Thu Dec 15 14:09:01 EST 2005


James Aguilar wrote:
> Suppose that I am writing a ray tracer in Python.  Well, perhaps not a
> ray tracer.  Suppose that I am writing a ray tracer that has to update
> sixty times a second (Ignore for now that this is impossible and silly.
> Ignore also that one would probably not choose Python to do such a
> thing.).

Someone doesn't agree with you there... ;-)

http://www.pawfal.org/index.php?page=PyGmy

> Ray tracing, as some of you may know, is an inherently parallelizable task.
> Hence, I would love to split the task across my quad-core CPU (Ignore also that
> such things do not exist yet.). Because of GIL, I need all of my work to be done in
> separate processes.

Right. I suppose you could just use one of the existing parallel
processing mechanisms for which Python interfaces exist. There has
also been plenty of talk about making multicore parallelism more
accessible to the average thread programmer, although most of it took
place on the python-dev mailing list [1], presumably because those
doing the talking don't think of discussing such issues with the
wider community (and probably wanted to petition for core language
changes as well).

[...]

> * Is there any way to have Python objects (Such as a light or a color)
> put themselves into a byte array and then pull themselves out of the
> same array without any extra work?

Unless you mean something very specific by "extra work", I would have
thought that the pickle module would cover this need.
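
For instance, a minimal sketch - the Color class here is just a
hypothetical stand-in for the light and colour objects you mention -
which round-trips an object through a byte string:

    import pickle

    class Color:
        # Hypothetical stand-in for the poster's scene objects.
        def __init__(self, r, g, b):
            self.r, self.g, self.b = r, g, b

    data = pickle.dumps(Color(255, 128, 0))   # object -> byte string
    restored = pickle.loads(data)             # byte string -> object
    print(restored.r, restored.g, restored.b)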

[Other interesting questions about memory mapped files, pipes, shared
memory.]

My idea was to make use of existing multiprocessing mechanisms and to
put communications facilities on top. I don't know how feasible or
interesting that is, but what I wanted to do with the pprocess module
[2] was to develop an API around the POSIX fork system call which
resembled the existing APIs for threading and communications. My
reasoning is that fork on modern POSIX systems employs copy-on-write,
so unmodified data is shared between processes - like threads, each
process shares the "context" of a computation with the other
computation units - whilst any modified data is held only by the
modifying process. With the process migration capabilities offered by
certain operating systems, it should then be possible to distribute
processes across CPUs and even across computation nodes.
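
To illustrate the idea - and this is only a bare sketch of fork plus
pipes, not the pprocess API itself - a parent can fork one child per
chunk of work; each child inherits the inputs cheaply via
copy-on-write and pickles its result back through a pipe:

    import os, pickle

    def compute(chunk):
        # Stand-in for real work, e.g. tracing one band of an image.
        return [x * x for x in chunk]

    def parallel_map(func, chunks):
        # Fork one child per chunk; each child inherits the inputs
        # via copy-on-write and pickles its result into a pipe.
        readers = []
        for chunk in chunks:
            r, w = os.pipe()
            if os.fork() == 0:            # child: compute and send
                os.close(r)
                os.write(w, pickle.dumps(func(chunk)))
                os.close(w)
                os._exit(0)
            os.close(w)                   # parent keeps the read end
            readers.append(r)
        results = []
        for r in readers:                 # collect in submission order
            data = b""
            while True:
                block = os.read(r, 4096)
                if not block:             # EOF: child closed its end
                    break
                data += block
            os.close(r)
            os.wait()                     # reap one finished child
            results.append(pickle.loads(data))
        return results

    print(parallel_map(compute, [range(0, 4), range(4, 8)]))

Being fork-based, all of this is POSIX-only, of course.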

The only drawback of such a scheme is that one cannot transparently
modify global variables in order to communicate with other processes.
However, I consider explicit communications channels to be somewhat
more desirable anyway, and it is arguably a matter of taste how one
then uses those channels: either by manipulating channel objects
directly, like streams, or by wrapping them so that a distributed
computation just looks like a normal function invocation.
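
As a sketch of the latter style, a hypothetical remote() wrapper
(again just plain fork and pipes, not pprocess) can hide the channel
behind an ordinary call:

    import os, pickle

    def remote(func):
        # Hypothetical wrapper: run func(*args) in a forked child
        # and return its pickled result through a pipe, so the call
        # site sees a normal function invocation.
        def wrapper(*args):
            r, w = os.pipe()
            if os.fork() == 0:            # child: compute and send
                os.close(r)
                os.write(w, pickle.dumps(func(*args)))
                os.close(w)
                os._exit(0)
            os.close(w)                   # parent: receive and reap
            data = b""
            while True:
                block = os.read(r, 4096)
                if not block:
                    break
                data += block
            os.close(r)
            os.wait()
            return pickle.loads(data)
        return wrapper

    def trace_row(y):
        return [y * 0.5] * 8              # stand-in for real work

    trace_row = remote(trace_row)
    print(trace_row(5))                   # looks like an ordinary call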

Anyway, I have no formal experience with multiprocessing, nor any
multiprocessor/multicore environments available to me, so what I've
written may be somewhat naive; but should anything like it prove
workable, it would be a gentler path to parallelism than hacking
Python's runtime to remove the global interpreter lock.

Paul

[1] http://mail.python.org/pipermail/python-dev/2005-September/056801.html
[2] http://www.python.org/pypi/parallel



