Microthreads without Stackless?

Sun Sep 19 12:42:23 EDT 2004

Michael Sparks wrote:

> Assuming a language has stackful coroutines, can you *guarantee* for me that
> in the following the filehandle is always closed? (Assume that processFile
> is called from inside the start point for a co-routine, and that function
> reference process passed through is where the co-routine switching occurs.

I haven't thought it through, so I don't know whether you can
guarantee that with coroutines themselves, which are, after all,
powerful building blocks.  But I wouldn't be too surprised if you
actually could guarantee such a thing with synchronous threads built
on top of coroutines.  (In fact, the PDF file I cite in the
grandparent post shows how to _use_ coroutines to implement
try/except.)

> 
> def processFile(filename, process):
>    try:
>       f = open(filename)
>       try:
>          for i in f:
>             process(i)
>          f.close()
>       except:
>          f.close()
>    except IOError:
>       print "Couldn't open file, didn't do anything"

IMO, this example does not demonstrate a reasonable use-case for
coroutines.

Two good use cases for coroutines are:

1) In filtering, the process() function can call subfunctions for
both getting and putting data, and the subfunctions can block (i.e.
let other coroutines run).  You do not need to decide in advance
whether process() is going to have data pushed at it (as in your
example) or whether process is going to pull the data -- you write
each process to pull its input data and push its output data, and glue
the processes together with coroutines.

2) Coroutines can implement synchronous threads with much less
overhead than asynchronous threads.  With synchronous threads, you
give up the ability to have the OS automagically task-switch for you,
but the return benefits are enormous.  Not only is the system much
easier to reason about, but an entire class of errors (and the whole
rationale for the existence of semaphores, locks, mutexes, monitors,
or whatever your favorite synchronization term is) completely
_disappears_, along with all the crufty workaround synchronization
primitives.
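
As a rough illustration of the first use case (filtering), here is a
minimal sketch in present-day Python 3 using generators.  Generators
are only a restricted, non-stackful form of coroutine: a stage can
yield from its own body, but not from inside a subfunction it calls,
so this approximates the idea rather than demonstrating full
coroutines.  The stage names are made up for the example.  Each stage
pulls its input and pushes its output, and the stages are glued
together by composition:

```python
# Sketch only: generators standing in for coroutines.
# All stage names below are hypothetical.

def read_lines(text):
    """Source: push one line at a time downstream."""
    for line in text.splitlines():
        yield line

def strip_comments(lines):
    """Filter: pull lines, push only those that are not comments."""
    for line in lines:
        if not line.lstrip().startswith("#"):
            yield line

def join_pairs(lines):
    """Filter: pull two lines for every one line pushed."""
    source = iter(lines)
    for first in source:
        second = next(source, "")  # pull again; "" if the input ran out
        yield first + second

# Glue the stages together; nothing runs until the consumer pulls.
pipeline = join_pairs(strip_comments(read_lines("a\n# skip\nb\nc")))
result = list(pipeline)
```

Note that join_pairs() decides for itself how much input to pull per
output, which is the point: no stage has to be written as a callback
that gets data pushed at it.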

(As an aside, IMO your example also fails to demonstrate good
try/except/finally hygiene, which would be a completely peripheral
argument, except that you're arguing that Python should always do the
"right thing" with this code, and you're not even using what I would
consider to be the right constructs for that.)
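
Which constructs are meant is not spelled out above, so take the
following as one plausible reading: a sketch, in present-day Python 3
syntax, that separates the "could not open" case from the "must
close" case and uses try/finally for the latter.  The name
process_file and the True/False return value are assumptions made for
the example.

```python
def process_file(filename, process):
    """Call process(line) for each line; close the file on every exit path."""
    try:
        f = open(filename)
    except IOError:
        print("Couldn't open file, didn't do anything")
        return False
    try:
        for line in f:
            process(line)
    finally:
        # Runs on normal completion and also when process() raises.
        f.close()
    return True
```

Unlike the quoted version, this lets an exception raised by process()
propagate instead of swallowing it in a bare except.  It does not by
itself settle what happens to a coroutine that is suspended inside
process() and never resumed.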

> I would argue that you cannot, unless you start placing restrictions on the
> coroutines in a way you (and others) haven't detailed. (Such as ensuring
> that when you throw something like exceptions.SystemExit that it propagates
> through all the co-routines and hits all the bare except clauses and isn't
> stopped by any single one of them.)

I guess this is the old "you'll put your eye out with that thing"
argument.  Yes, you could put your eye out with that thing, because as
I noted, coroutines are a powerful primitive concept.  Is it
completely unpythonic to allow things you can put your eye out with?
Let's look at the "caveats" at the bottom of the documentation for the
thread module:

"""
* When the main thread exits, it is system defined whether the other
  threads survive. On SGI IRIX using the native thread implementation,
  they survive. On most other systems, they are killed without executing
  try ... finally clauses or executing object destructors.

* When the main thread exits, it does not do any of its usual cleanup
  (except that try ... finally clauses are honored), and the standard
  I/O files are not flushed.
"""

So there you have it.  I would suspect that with coroutines, you could
build a synchronous threads mechanism which would in fact execute all
the try ... finally clauses, and (unless someone can show me
otherwise) I believe that such a synchronous thread package could
certainly make sure that the original main thread of execution
performs some cleanup at program termination.
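
To make that suspicion a little more concrete, here is a minimal
sketch of a cooperative scheduler that runs every pending finally
clause before it returns.  It is written in present-day Python 3 and
leans on generators and their close() method, which are a weaker
mechanism than stackful coroutines, so it only hints at what a real
package would do.  The names worker, run, log and max_switches are
invented for the example.

```python
from collections import deque

log = []  # records events so the ordering can be inspected

def worker(name, steps):
    """A cooperative task: each bare yield hands control to the scheduler."""
    try:
        for i in range(steps):
            log.append("%s step %d" % (name, i))
            yield
    finally:
        # Runs when the task finishes or when the scheduler closes it.
        log.append("%s cleanup" % name)

def run(tasks, max_switches):
    """Round-robin scheduler that closes unfinished tasks on the way out."""
    ready = deque(tasks)
    try:
        for _ in range(max_switches):
            if not ready:
                break
            task = ready.popleft()
            try:
                next(task)
            except StopIteration:
                continue  # task finished; its finally block already ran
            ready.append(task)
    finally:
        for task in ready:
            # Raises GeneratorExit at the task's yield, so its finally runs.
            task.close()

run([worker("a", 1), worker("b", 5)], max_switches=4)
```

Here task "b" is cut off after two steps, yet its cleanup still runs
because the scheduler closes it during its own shutdown.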

> The correctness of the code above is predicated on the assumption that
> all the calls it makes to other functions will resume immediately. With
> stackful coroutines you cannot guarantee this, and as a result reasoning
> accurately about code like this is near to impossible in the general case. 

There is a large application space where people use asynchronous
threads, as painful as they are to reason about, simply because it
would be _more_ painful to reason about supplying the same
functionality in a non-threaded fashion.  I submit that coroutines
would be a satisfactory solution for many of the applications which
currently use asynchronous threads, for some of the reasons I have
outlined above.

> Any implementation of coroutines or co-routine type behaviour has to take
> this sort of case into account IMO - or else reasoning about even simple
> cases and guaranteeing correctness becomes much more difficult. (Consider
> "process" being object and "object.process" in the use case above)

I partly disagree.  If the powers-that-be decide that coroutines are a
useful addition to the language, then, like the "thread" module, the
base "coroutines" module should come with caveats about how, yes, you
_will_ damage your foot with this tool.  However, I believe that such
a "coroutines" module could probably be used as a building block for
higher-level synchronization primitives (such as synchronous threads)
which could contain all or most of the Pythonic safety nets we all
know and love.

In thinking about it, I have to say that the current asynchronous
threads module could probably also be used as such a building block,
albeit with much higher overhead.  If anybody really wants coroutines
added to Python, they could probably do worse than to:

1) Code a synchronous threads module on top of the asynchronous
threads module, complete with good error handling and program
termination handling.

2) Publish and get feedback on the synchronous threads module, and get
people to start adopting it and using it in applications.

3) With enough users of a synchronous threading paradigm, the lure of
huge performance gains would probably be enough to get somebody to
incorporate coroutines or Stackless into the main Python build tree,
in order to reduce the time and space overhead of synchronous
threading.

One problem with this approach may be that the performance of
synchronous threads on top of async threads is so abysmal that nobody
who needs threads would even consider it in the first place.  On the
flip side, however, if such a "synchronous threads" module on top of
async threads became popular, it might remove the reluctance to add
coroutines, or continuations (a la Stackless), or something similar
to the language (or at least to the CPython interpreter).  The
_normal_ interface to this functionality (i.e. the synchronous
threads module) would still be available in Jython or other
non-CPython implementations, built on top of regular async threads.
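
For the curious, step 1 might start out something like the sketch
below: ordinary OS threads, but with a strict hand-off so that only
one of them ever runs at a time and switches happen only where a task
asks for one.  This is present-day Python 3 using the standard
threading module; the Scheduler class, its spawn() and run() methods,
and the convention of passing a switch callable as the first argument
are all assumptions made for the example, and there is no error or
termination handling to speak of.

```python
import threading

class _Task:
    def __init__(self, sched, func, args):
        self._sched = sched
        self.resume = threading.Semaphore(0)  # released to let this task run
        self.finished = False
        self.thread = threading.Thread(target=self._main, args=(func, args))

    def switch(self):
        """Give control back to the scheduler and wait for our next turn."""
        self._sched._control.release()
        self.resume.acquire()

    def _main(self, func, args):
        self.resume.acquire()  # do nothing until first scheduled
        try:
            func(self.switch, *args)
        finally:
            self.finished = True
            self._sched._control.release()

class Scheduler:
    """Hypothetical sketch: synchronous threads on top of OS threads."""

    def __init__(self):
        self._control = threading.Semaphore(0)  # released back to scheduler
        self._tasks = []

    def spawn(self, func, *args):
        self._tasks.append(_Task(self, func, args))

    def run(self):
        for task in self._tasks:
            task.thread.start()
        while self._tasks:
            for task in list(self._tasks):
                task.resume.release()    # let exactly this task run
                self._control.acquire()  # wait until it switches or ends
                if task.finished:
                    task.thread.join()
                    self._tasks.remove(task)

events = []

def worker(switch, name, steps):
    for i in range(steps):
        events.append((name, i))  # no lock: only one task runs at a time
        switch()

sched = Scheduler()
sched.spawn(worker, "a", 2)
sched.spawn(worker, "b", 3)
sched.run()
```

Every switch here costs two semaphore operations plus whatever the OS
charges for waking and parking threads, which is exactly the overhead
that coroutines would be expected to remove.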

Regards,
Pat