Help with coroutine-based state machines?

Alan Kennedy alanmk at hotmail.com
Tue Jun 3 05:26:55 EDT 2003


[Alan spends the weekend upgrading an old laptop, so he can look at
this stuff in the comfort of the garden, which was bathed in glorious
sunshine for the entirety of the Irish Bank Holiday weekend B-). He
also prints and reads the writings of Christian Tismer[1], David
Mertz[2,3] and the Timbot[4,5], in relation to generators,
coroutines, continuations, pseudo/weightless/micro-threads, etc].

Steven Taschuk wrote:

[
 Lots of great stuff which illustrates the problems of calling from
 generator/coroutine code to functional code and vice versa.
]

I now understand. Thanks for taking the time to explain.

Thanks to your clear examples, I now picture coroutines in the
following way (which I hope is correct :-).

1. The "traditional" function/method call paradigm builds a stack
frame every time a function/method is called. Since this stack frame
holds the references to the local variables for the function/method,
it must be destroyed and collected if memory is not to be "leaked".
The only time the stack frame can be destroyed is when the
references it contains are no longer required: i.e. after that
invocation of the function/method has finished.

2. The only way in which a "traditional" function/method call can
invoke another "traditional" function/method is to call it
"recursively", thus causing the construction of another stack frame.
There is no mechanism for it to "substitute" the stack frame of the
called function "in place of" its own stack frame. (Although I
believe that Stackless can do this, after much rewriting and
reinvention of truth :-).

3. Because all "traditional" function/method calls involve the
creation and eventual destruction of a stack frame, I label the space
in which they operate "Linear stack space".

4. There is another space, which I call "Generator space". When a
call is made into "Generator space", a new stack frame is not
constructed on every call. Instead, a persistent, resumable stack
frame is "resumed". This resumable stack frame was created once, when
the generator function was first called, and is termed, in current
python terminology, the generator-iterator.

5. When the code in "Generator space" (i.e. the generator-iterator)
is called, it resumes immediately after the point where it last
'yield'ed, and continues until it 'yield's again, returns, or raises
an exception. When it 'yield's, two things happen: (a) the resumable
stack frame is "frozen", so that it can later be resumed again, and
(b) a python object, which may be None, is transferred back to the
caller.
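
A minimal sketch of this freeze/resume behaviour (the names used are
just for illustration):

def counter():
    print "starting"
    yield 1             # frame is frozen here; 1 goes back to the caller
    print "resumed"
    yield 2             # frozen again
    print "finishing"   # falling off the end raises StopIteration

it = counter()          # creates the resumable frame; runs no code yet
print it.next()         # prints "starting", then 1
print it.next()         # prints "resumed", then 2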

6. For any call from "Linear Stack space" into "Generator space",
there is a crossover point, P. When the called code in "Generator
space" finishes executing, it can only enter back into "Linear stack
space" through point P: it cannot exit through any other point.
(According to the current python model).

7. If any calls are made from "Generator space" into "Linear stack
space", this leads to the creation of a stack frame, which must be
destroyed. If the called function/method in "Linear stack space" then
calls back into "Generator space", and the "Linear space" function is
not allowed to exit naturally, this is essentially unbounded
recursion, which will eventually lead to a blown stack. (The
non-functioning Producer/Consumer example illustrates this: the code
in "Generator space" could be made to work by "traditionally" calling
the write and read functions, but this mutual recursion would
eventually blow the "Linear space" stack, as sketched below).

8. When a stack frame in "Generator space" 'yield's, it is possible
to adopt a convention for the 'yield'ed value: the "dispatcher"
convention. Under this convention, the value 'yield'ed can be a
reference to another resumable stack frame in Generator space. A
special "dispatch"-function is coded to recognise these references,
and to immediately call back into "Generator space". The newly
resumed stack frame is still limited by the same constraints as the
original call into "Generator space": it must eventually return to
"Linear stack space" through point P, in this case the dispatcher
function.
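
A minimal sketch of such a dispatcher, assuming the convention that a
yielded value which is itself a generator-iterator means "resume this
frame next" (any other yielded value is simply printed here):

from types import GeneratorType    # the type of generator-iterators

def dispatch(gen):
    # Point P: the single crossover between "Linear stack space"
    # and "Generator space".
    current = gen
    while 1:
        try:
            result = current.next()
        except StopIteration:
            break
        if isinstance(result, GeneratorType):
            current = result       # transfer control to the yielded frame
        else:
            print "yielded:", result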

9. A complex network of interacting "resumable stack frames", or
generators, can mutually 'yield' control to each other, assuming the
cooperation of the dispatcher function. Execution continually
"bounces" back and forth between the dispatch function (Point P, in
"Linear stack space") and various 'yield'ed resumable states in
"Generator space".

10. A network of interacting generator-iterators could potentially
execute much faster than an equivalent series of "traditional"
function/method calls in "Linear stack space", since there is no
overhead associated with the construction and destruction of stack
frames, no parsing of parameters, etc. The interacting generators in
such a network are called "coroutines", and they require the presence
of a dispatcher function which, as python currently exists, must be
explicitly created by the programmer.

10a: refinement: In fact python generators are really
"semi"-coroutines, because of the requirement to keep 'yield'ing to a
dispatcher function. In a language which implemented
"full"-coroutines (maybe a __future__ version of python?), the
dispatcher would not be required (at least explicitly), and resumable
states could transfer control directly to any other resumable state
for which they hold a reference.

11. The efficiency benefits of generators can also be realised by
*not* adopting the dispatcher convention. Instead, a generator can
simply 'yield' the series of values it has been coded to generate:
the nodes in a data structure, the numbers in a computed series, etc.
This can lead to more natural code, particularly in the calling
function which consumes the series of 'yield'ed values: it can remain
mostly unaware that a resumable state is involved in generating those
values. Simple example:

def squares(n):
    for i in xrange(n):
        yield i, i*i

# Instantiate a generator-iterator
sqgen = squares(100)
# We can do the following because the generator-iterator
# conveniently implements the iterator protocol.
for i, sq in sqgen:
    print "The square of %d is %d" % (i, sq)

12. It is possible to avoid the "mutual" recursion problem of calling
from "Generator space" into "Linear stack space" by the following
actions:

 o Moving all "traditional" function/method calls required from
   "Linear stack space" into "Generator space".
 o Expanding the "dispatcher" convention so that generators can yield
   special values representing a function call or a return value,
   with the dispatcher function explicitly managing a call stack (a
   minimal sketch of this follows below).
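
In this sketch, Call, Return and the mailbox are inventions of mine,
not an established API; and since current python generators cannot
receive values when resumed, the callee's result is passed back
through a shared mailbox which the calling generator reads when it is
resumed:

class Call:
    # Yielded to mean: push this generator frame and run it.
    def __init__(self, gen):
        self.gen = gen

class Return:
    # Yielded to mean: pop my frame and hand this value back.
    def __init__(self, value):
        self.value = value

mailbox = [None]    # holds the most recent Return value

def stack_dispatch(gen):
    stack = [gen]                     # the explicitly managed call stack
    while stack:
        try:
            result = stack[-1].next()
        except StopIteration:
            stack.pop()               # frame fell off its end
            continue
        if isinstance(result, Call):
            stack.append(result.gen)  # push: "call" the new frame
        elif isinstance(result, Return):
            mailbox[0] = result.value # pop: leave the value for the caller
            stack.pop()

def adder(a, b):
    yield Return(a + b)

def main():
    yield Call(adder(2, 3))
    print "2 + 3 =", mailbox[0]       # read the callee's result

stack_dispatch(main())                # prints: 2 + 3 = 5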

13. It is possible to model ultra-lightweight threads by representing
each thread as a simple generator which steps through the states
required to implement the functionality required of the "thread". The
"execution pointer" of such a thread is advanced simply by calling the
".next()" method of the "thread"'s generator-iterator. (Incidentally,
as well as being highly efficient in terms of resource consumption,
these ultra-lightweight threads offer much finer control of thread
prioritisation, creation, destruction, "collection", etc.)
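
A minimal sketch of such a scheduler, assuming each "thread" is a
generator that yields whenever it is willing to be suspended (names
are illustrative only):

def worker(name, steps):
    # One weightless "thread": each yield is a voluntary suspension.
    for i in xrange(steps):
        print "%s: step %d" % (name, i)
        yield None

def scheduler(threads):
    # Round-robin: advance each live thread one step per pass,
    # discarding threads whose generators have finished.
    while threads:
        for t in threads[:]:          # copy: we mutate the list below
            try:
                t.next()
            except StopIteration:
                threads.remove(t)

scheduler([worker("A", 3), worker("B", 2)])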

Now that I think I've got my head around it, I think I'm going to try
my hand at an example or two, which I will post to the newsgroup
(might be a few weeks, I'm kinda busy right now). The two main
interesting examples that come to mind are

1. An ultra-lightweight thread implementation of an
   asyncore/medusa/twisted style socket server.
2. A coroutine based version of something that is normally
   (relatively) resource-intensive: a series of SAX2 callbacks to a
   chain of ContentHandlers/Filters.

Lastly, I think I'm beginning to "get" continuations. Although the
examples given in Christian Tismer's paper "Continuations and
Stackless Python"[1] are difficult to follow without the definition
of a continuation object (which seems to require intimate familiarity
with the Stackless implementation), I think I understand the general
principle.

And it's a principle I'm not really interested in pursuing, because I
can't see that it will make me any more productive, or my programs
more efficient (than they could be using the "coroutines" and
"weightless-threads" described above). And I can imagine that
continuation code could be a little brain-bending to write (thus
making me *less* productive %-), as this example of a while loop
(which I sort of understand) shows:

# (Stackless-only code: this depends on Stackless python's
# continuation module, and will not run under standard CPython.)
def looptest(n):
    this = continuation.current()
    k = this.update(n)
    if k:
        this(k-1)
    else:
        del this.link

But I can see the power inherent in the continuations paradigm.

Again, many thanks to all who responded to my original post.

Reading material.

[1] http://www.stackless.com/spcpaper.pdf
[2] http://www-106.ibm.com/developerworks/linux/library/l-pygen.html
[3] http://www-106.ibm.com/developerworks/library/l-pythrd.html
[4] http://tinyurl.com/dbyn
[5] http://www.tismer.com/research/stackless/coroutines.tim.peters.html

kind regards,

--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan



