Microthreads without Stackless?

Michael Sparks michaels at rd.bbc.co.uk
Fri Sep 17 05:08:11 EDT 2004


Bryan Olson wrote:

> David Mertz, Ph.D. wrote:
>  > It's too bad you didn't bother to READ my article at:
>  >
>  >   http://gnosis.cx/publish/programming/charming_python_b5.html
>  >
>  > This is distinct from my other article that covers "weightless
>  > threads", though there is some overlap in the concepts.
> 
> I spent a couple hours going through the "weightless threads"
> paper, and looking up the background including that one.  I had
> previously concluded that Python generators could not implement
> what I wanted from real co-routines, so I was interested in
> seeing if there was a reasonable implementation.
> 
>  > While you do need a scheduler to control the branching, once you
>  > have this you get EXACTLY the same thing as coroutines in other
>  > languages. Specifically, you can branch from any generator, into
>  > the body of whatever other generator you wish.

David,

The real root of the 'problem' Bryan Olson is putting forward is that
you can only jump between yield points in simple generators, which are
inherently single level rather than nested (i.e. the traditional
"you can't wrap generators" question).
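
A minimal sketch of that limitation, in current Python syntax. The names
read_line, handler_broken and handler_wrapped are hypothetical and only
for illustration. A yield suspends only the function whose body contains
it, so a helper that yields becomes a generator of its own and the
caller has to re-yield its values by hand:

```python
def read_line(source):
    # Hypothetical helper: yields a "would block" marker, then the data.
    yield None
    yield source.pop(0)


def handler_broken(source):
    # Calling the helper only creates a generator object. Nothing inside
    # read_line runs here, and handler_broken is not suspended by it.
    line = read_line(source)
    yield line


def handler_wrapped(source):
    # The "for x in y: yield x" trick: pass each of the helper's yield
    # points up to whoever is driving this generator.
    for item in read_line(source):
        yield item


broken = list(handler_broken(["hello"]))
wrapped = list(handler_wrapped(["hello"]))
```

Here broken holds a single unstarted generator object, while wrapped
holds the marker followed by the data.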

Consider a TCP client using generators:
   * You have a connection, mainloop and shutdown phase.
   * You can separate that into a connection factory and a connection
     handler, and have the connection factory one generator and the
     connection handler another.
   * You can then either have the connection factory directly handle the
     scheduling of the connection handler by the "for x in y: yield x"
     trick or provide inter-generator communications. 

     Personally I prefer the latter - it means I can reuse the
     connection handler in a TCP server as well. This is also why I
     put 'problem' above - single level yield encourages this - I do
     not think this is a problem any more than python encouraging
     modular code elsewhere is a problem.
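
A rough sketch of the second option, with no real sockets involved. The
names connection_handler, connection_factory and run, and the use of a
deque as an inbox, are assumptions made for illustration rather than
anything from David's articles. The factory hands the handler to a
round-robin scheduler and then talks to it only through the inbox, so
the same handler could be driven by a server instead:

```python
from collections import deque


def connection_handler(inbox, outbox):
    # Reusable piece: it only knows about its inbox and outbox.
    while True:
        while inbox:
            msg = inbox.popleft()
            if msg is None:            # shutdown marker
                outbox.append("closed")
                return
            outbox.append(msg.upper())
        yield                          # nothing to do, give control back


def connection_factory(messages, tasks, outbox):
    # "Connects", schedules the handler separately, then feeds it data.
    inbox = deque()
    tasks.append(connection_handler(inbox, outbox))
    for msg in messages:
        inbox.append(msg)
        yield
    inbox.append(None)


def run(tasks):
    # Round-robin scheduler: step each generator until all have finished.
    while tasks:
        task = tasks.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        tasks.append(task)


tasks = deque()
outbox = []
tasks.append(connection_factory(["syn", "data"], tasks, outbox))
run(tasks)
```

After run() returns, outbox holds the two upper-cased messages followed
by the handler's "closed" marker.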

Consider a simple TCP client using what Bryan Olson wants (or appears to
want) - let's call them Greenlets.
   * You write your TCP client as if it were normally threaded.
   * You wrap any socket create/read/write code in a function that,
     whenever it needs to block, calls the equivalent of "suspend".
     (Let's call that greenlet.main.switch().)
   * You decorate your original function (the TCP client) as a
     greenlet, and leave your code essentially unchanged.
   * You would also need a greenlet to handle selects, and a scheduler,
     but gain the same benefits as if you'd transformed your code to use
     a reactor/proactor pattern.

At least I *think* this is where Bryan is coming from...
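
A hedged sketch of that shape, assuming the third-party greenlet package
is installed and using its greenlet(run), switch(), getcurrent(), parent
and dead API. The "greenlet.main.switch()" name above is only my
placeholder; here the suspend is spelled
greenlet.getcurrent().parent.switch(). The I/O is faked, and
blocking_read, read_header, client and run_clients are hypothetical
names. The point is that the suspend happens inside a plain nested
function call, without turning the callers into generators:

```python
try:
    from greenlet import greenlet    # third-party package, not stdlib
except ImportError:
    greenlet = None                  # the sketch is skipped without it


def blocking_read(label):
    # Called several frames deep. Suspends the whole call chain and
    # hands a fake I/O request to the scheduler (the parent greenlet).
    return greenlet.getcurrent().parent.switch(("read", label))


def read_header(name):
    # An ordinary function: no yield, no decoration needed.
    return blocking_read(name + ":header")


def client(name, log):
    # Written as if it were normally threaded.
    header = read_header(name)
    body = blocking_read(name + ":body")
    log.append((name, header, body))


def run_clients(names):
    if greenlet is None:
        return None
    log = []
    pending = []
    for name in names:
        g = greenlet(client)
        request = g.switch(name, log)      # runs until the first read
        pending.append((g, request))
    while pending:
        g, request = pending.pop(0)
        reply = "data for " + request[1]   # stand-in for select() + recv
        request = g.switch(reply)          # resume inside blocking_read
        if not g.dead:
            pending.append((g, request))
    return log


result = run_clients(["a", "b"])
```

A real version would replace the fake reply with a select-driven loop,
but the client and its helpers would stay as they are.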

> Great; I'd love to have the same thing as real co-routines.
> Here's the problem:  I have a server that currently handles
> multiple clients using one thread per connection.  When a
> client-handler needs to send or receive data, it simply reads or
> writes to a file-like thing.
> 
> Now I want to handle twenty thousand clients, so I want to
> replace my threads with co-routines.  With real co-routines, I
> can easily do that.  I write an asynchronous I/O handler routine
> that can wait on many files at once, perhaps with os.select().
> I over-ride the file read and write procedures to initiate the
> I/O, then yield.  When I/O is ready on a file, the I/O handler
> can switch back to the client-handler.
> 
> Importantly, I do *not* have to re-write all the functions in
> every call chain that leads to a read or write.  I just need the
> I/O handler (which includes a kind of scheduler) at the top, and
> I over-ride my file I/O calls at the bottom.

Bryan,

Out of interest, have you looked at the Greenlets package from
Stackless? I'm pretty certain it would take you *a lot* closer to where
you want to be, and it works with standard Python. Yes, you might need
to create a simple scheduler. Yes, you will need to decorate some
functions. But it will largely allow you to leave most of your code
looking pretty much unchanged, and certainly without the level of
change you'd need if you went for a traditional statemachine/reactor
(or proactor) pattern based server.

I've mentioned it here a few times and you're the most vocal person I
think might benefit from at least giving it a look. 

Personally, when I was in a similar situation to yours a while back,
rather than complain about a feature that is missing from many
languages, I decided to modularise my code differently, such that
approaches _like_ those David Mertz put forward became plausible.
(I personally found the article very useful in detailing generators'
potential.)

I've found the results of using generators in the way David Mertz puts
forward to be extremely useful. This is especially true if you allow
communication between generators with CSP-style semantics, which
encourages greater reuse than traditional co-routines, but now I'm way
off topic.

If you really think you need what you say you need, I really would
suggest Greenlets - it takes you a lot closer to where you say you want
to be.

Thinking-it-might-be-useful-ly,


Michael.
-- 
Michael.Sparks at rd.bbc.co.uk    
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This message (and any attachments) may contain personal views
which are not the views of the BBC unless specifically stated.




