Erlang style processes for Python

Jacob Lee artdent at freeshell.org
Thu May 10 02:31:00 EDT 2007


On Wed, 09 May 2007 18:16:32 -0700, Kay Schluehr wrote:

> Every once in a while Erlang style [1] message passing concurrency [2]
> is discussed for Python which does not only imply Stackless tasklets [3]
> but also some process isolation semantics that lets the runtime easily
> distribute tasklets ( or logical 'processes' ) across physical
> processes. Syntactically a tasklet might grow out of a generator by
> reusing the yield keyword for sending messages:
> 
> yield_expr : 'yield' ([testlist] | testlist 'to' testlist)
> 
> where the second form is specific for tasklets ( one could also use  a
> new keyword like "emit" if this becomes confusing - the semantics is
> quite different ) and the addition of a new keyword for assigning the
> "mailbox" e.g:
> 
> required_stmt: 'required' ':' suite
> 
> So tasklets could be identified on a lexical level ( just like
> generators today ) and compiled accordingly. I just wonder about sharing
> semantics. Would copy-on-read / copy-on-write and new opcodes be needed?
> What would happen when sharing isn't dropped at all but when the runtime
> moves a tasklet around into another OS level thread / process it will be
> pickled and just separated on need? I think it would be cleaner to
> separate it completely but what are the costs?
> 
> What do you think?
> 
> [1] http://en.wikipedia.org/wiki/Erlang_programming_language
> [2] http://en.wikipedia.org/wiki/Actor_model
> [3] http://www.stackless.com/

Funnily enough, I'm working on a project right now that is designed for
exactly that: PARLEY, http://osl.cs.uiuc.edu/parley . (An announcement
should show up in clp-announce as soon as the moderators release it.) My
essential thesis is that syntactic sugar should not be necessary -- that a
nice library would be sufficient. I do admit that Erlang's pattern
matching would be nice, although you can get pretty far by using uniform
message formats that can easily be dispatched on -- the tuple
  (tag, sender, args, kwargs)
in the case of PARLEY, which maps nicely to instance methods of a
dispatcher class.
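
To make the dispatch idea concrete, here is a minimal sketch. It is not
PARLEY's actual API: the Dispatcher and Counter classes, the "on_" method
prefix, and the unhandled() hook are all made up for illustration. It only
assumes the (tag, sender, args, kwargs) message format described above.

```python
class Dispatcher:
    """Route (tag, sender, args, kwargs) messages to methods named after the tag.

    Hypothetical helper for illustration; not taken from PARLEY.
    """

    def dispatch(self, message):
        tag, sender, args, kwargs = message
        # Look up a method called on_<tag>; fall back to unhandled().
        handler = getattr(self, "on_" + tag, None)
        if handler is None:
            return self.unhandled(tag, sender, args, kwargs)
        return handler(sender, *args, **kwargs)

    def unhandled(self, tag, sender, args, kwargs):
        raise ValueError("no handler for tag %r" % (tag,))


class Counter(Dispatcher):
    """Example actor body: one instance method per message tag."""

    def __init__(self):
        self.total = 0

    def on_add(self, sender, amount, scale=1):
        self.total += amount * scale
        return self.total


counter = Counter()
result = counter.dispatch(("add", "some-sender", (5,), {"scale": 2}))
```

The uniform tuple stands in for Erlang's pattern matching on the message
tag: adding a new message type means adding one more on_<tag> method.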

The question of sharing among multiple physical processes is interesting.
Implicit distribution of actors may not even be necessary if it is easy
enough for two hosts to coordinate with each other. In terms of the
general question of assigning actors to tasklets, threads, and processes,
there are added complications in terms of the physical limitations of
Python and Stackless Python:
 - because of the GIL, actors in the same process do not gain the
 advantage of true parallel computation
 - all tasklet I/O has to be non-blocking
 - tasklets are cooperative, while threads are preemptive
 - communication across processes is slower, has to be serialized, etc.
 - using both threads and tasklets in a single process is tricky
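
As a small illustration of the serialization point, the sketch below
round-trips one message through the standard pickle module, which is one
plausible way (an assumption here, not something PARLEY is stated to do) to
move a message between OS processes. The tag, sender name, and payload are
invented for the example.

```python
import pickle

# A message in the (tag, sender, args, kwargs) format; contents are made up.
message = ("update", "actor-1", ([1, 2, 3],), {"replace": True})

# What would travel over a pipe or socket to another process.
wire_bytes = pickle.dumps(message)

# What the receiving side would reconstruct.
received = pickle.loads(wire_bytes)

# The receiver gets an equal but independent copy, so mutating its payload
# cannot affect the sender's objects.
received[2][0].append(4)
```

The copy is what gives process isolation for free, and the extra
dumps/loads step on every message is part of why cross-process
communication is slower than passing a reference within one process.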

PARLEY currently only works within a single process, though one can choose
to use either tasklets or threads. My next goal is to figure out I/O, at
which point I get to tackle the fun question of distribution.
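
For the thread-backed case, a single-process actor can be sketched with
nothing but the standard library. This is a rough illustration and not
PARLEY's implementation: spawn(), the "stop" tag, and the use of
queue.Queue as the mailbox are assumptions made for the example.

```python
import queue
import threading

# Hypothetical shutdown message in the (tag, sender, args, kwargs) format.
STOP = ("stop", None, (), {})


def spawn(handler):
    """Start a thread that feeds each mailbox message to handler.

    Returns the mailbox (used to send to the actor) and the thread.
    """
    mailbox = queue.Queue()

    def run():
        while True:
            message = mailbox.get()  # blocks until a message arrives
            if message[0] == "stop":
                break
            handler(message)

    thread = threading.Thread(target=run)
    thread.start()
    return mailbox, thread


received = []
mailbox, thread = spawn(received.append)
mailbox.put(("greet", None, ("hello",), {}))
mailbox.put(STOP)
thread.join()
```

Because the only shared object is the mailbox, the same actor body could in
principle run on a tasklet instead, as long as the blocking get() is
replaced by a cooperative receive.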

So far, I've not run into any cases where I've wanted to change the
interpreter, though I'd be interested in hearing ideas in this direction
(especially with PyPy as such a tantalizing platform!).

-- 
Jacob Lee <artdent at freeshell.org>


