threading

Wed Apr 9 09:47:04 EDT 2014

On Wed, Apr 9, 2014 at 11:23 PM, Frank Millman <frank at chagford.com> wrote:
> Can I ask a newbie question here?

You certainly can!

> I understand that, if one uses threading, each thread *can* block without
> affecting other threads, whereas if one uses the async approach, a request
> handler must *not* block, otherwise it will hold up the entire process and
> not allow other requests to be handled.

That would be correct.

> How does one distinguish between 'blocking' and 'non-blocking'? Is it
> either/or, or is it some arbitrary timeout - if a handler returns within
> that time it is non-blocking, but if it exceeds it it is blocking?

No; a blocking request is one that waits until it has a response, and
a non-blocking request is one that goes off and does something, and
then comes back to you when it's done. When you turn on the kettle,
you can either stay there and watch until it's ready to make your
coffee (or, in my case, hot chocolate), or you can go away and come
back when it whistles at you to say that it's boiling. A third option,
polling, is when you put a pot of water on the stove, turn it on, and
then come back periodically to see if it's boiling yet. As the old
saying tells us, blocking I/O is a bad idea with pots of water,
because it'll never return.
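
To make the three options concrete, here's a rough sketch. The
boil_kettle() helper is made up for illustration, and it uses a
background thread to stand in for the kettle, so take it as a toy
model rather than how a real async library does things:

```python
import threading
import time

def boil_kettle(seconds, on_boil=None):
    """Start a pretend kettle; return an Event that is set once it boils."""
    boiled = threading.Event()

    def heat():
        time.sleep(seconds)          # the water heating up
        boiled.set()
        if on_boil is not None:
            on_boil()                # the whistle

    threading.Thread(target=heat, daemon=True).start()
    return boiled

# Blocking: stand there and watch until it's ready.
boil_kettle(0.05).wait()
blocking_done = True

# Non-blocking: walk away, and get called back when it whistles.
whistled = threading.Event()
boil_kettle(0.05, on_boil=whistled.set)
# ... free to do other work here ...

# Polling: come back periodically to see if it's boiling yet.
pot = boil_kettle(0.05)
checks = 0
while not pot.is_set():
    checks += 1
    time.sleep(0.01)
```

In the polling case you're still spending time on the checks
themselves, which is why a callback (or an event loop that does the
waiting for you) is usually preferred.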

> In my environment, most requests involve a database lookup. I endeavour to
> ensure that a response is returned quickly (however one defines quickly) but
> I cannot guarantee it if the database server is under stress. Is this a good
> candidate for async, or not?

Not if the database call blocks; that's blocking I/O, and in an
asynchronous system it would hold up everything else. If you have
multiple threads, it's fine, because the thread that's waiting for the
database will be blocked, and other threads can run (you may need to
ensure that you have separate database connections for your separate
threads); but in an asynchronous system, you want to be able to go and
do something else while you're waiting. Something like this:

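# db is a hypothetical database handle: query() blocks until the
# result is ready, asyncquery() returns immediately.
def blocking_database_query(emp_id):
    print("Finding out who employee #%d is..." % emp_id)
    # Execution stops here until the database answers.
    res = db.query("select name from emp where id=%d" % emp_id)
    print("Employee #%d is %s." % (emp_id, res[0].name))

def nonblocking_query(emp_id):
    print("Finding out who employee #%d is..." % emp_id)
    def nextstep(res):
        # Called later, once the result has arrived.
        print("Employee #%d is %s." % (emp_id, res[0].name))
    db.asyncquery(nextstep, "select name from emp where id=%d" % emp_id)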

This is a common way to do asynchronous I/O. Instead of saying "Do
this and give me a result", you say "Do this, and when you have a
result, call this function". Then as soon as you've done that, you
return (to some main loop, probably). It's usually a bit more
complicated than this (e.g. you might need multiple callbacks or
additional arguments in case it times out or otherwise fails - there's
no way to throw an exception into a callback, the way the blocking
query could throw something instead of returning), but that's the
basic concept.
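
Here's a sketch of what the "multiple callbacks" version might look
like, with one callback for success and one for failure. FakeAsyncDB,
its asyncquery() signature, and the employee data are all invented for
this example; a real driver will have its own API. The run_pending()
method stands in for the main loop:

```python
from collections import deque

class FakeAsyncDB:
    """Hypothetical stand-in for an asynchronous database driver."""
    def __init__(self, rows):
        self.rows = rows
        self.pending = deque()

    def asyncquery(self, on_result, on_error, emp_id):
        # Just record the request and return immediately.
        self.pending.append((on_result, on_error, emp_id))

    def run_pending(self):
        # Stand-in for the main loop: deliver each answer via a callback.
        while self.pending:
            on_result, on_error, emp_id = self.pending.popleft()
            if emp_id in self.rows:
                on_result(self.rows[emp_id])
            else:
                # Can't raise into the caller (it returned long ago),
                # so hand the exception to the error callback instead.
                on_error(KeyError(emp_id))

log = []

def lookup(db, emp_id):
    log.append("Finding out who employee #%d is..." % emp_id)
    def nextstep(name):
        log.append("Employee #%d is %s." % (emp_id, name))
    def failed(exc):
        log.append("Lookup of #%d failed: %r" % (emp_id, exc))
    db.asyncquery(nextstep, failed, emp_id)

db = FakeAsyncDB({12345: "Fred"})
lookup(db, 12345)    # returns at once
lookup(db, 99999)    # so does this; no such employee
db.run_pending()     # now the callbacks fire
```

Both lookup() calls return before either answer arrives, and the
failure gets reported through its own callback rather than as an
exception.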

You may be able to get away with doing blocking operations in
asynchronous mode, if you're confident they'll be fairly fast. But you
have to be really REALLY confident, and it does create assumptions
that can be wrong. For instance, the above code assumes that print()
won't block. You might think "Duh, how can printing to the screen
block?!?", but if your program's output is being piped into something
else, it most certainly can :) If that were writing to a remote
socket, though, it'd be better to perform those operations
asynchronously too: attempt to write to the socket; once that's done,
start the database query; when the database result arrives, write the
response to the socket; when that's done, go back to some main loop.
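
That sequence of steps can be sketched with asyncio coroutines, which
are a different spelling of the same idea: each "await" is a point
where control goes back to the main loop until the operation is done.
This isn't the callback style shown earlier, and send() and
query_name() are placeholders I've made up, with asyncio.sleep()
standing in for the real socket and database I/O:

```python
import asyncio

events = []

async def send(client, text):
    # Placeholder for a non-blocking socket write.
    await asyncio.sleep(0)
    events.append((client, text))

async def query_name(emp_id):
    # Placeholder for a non-blocking database query.
    await asyncio.sleep(0.01)
    return {12345: "Fred"}.get(emp_id, "unknown")

async def handle(client, emp_id):
    # Write to the socket; once that's done, start the query; when the
    # result arrives, write the response; then return to the main loop.
    await send(client, "Finding out who employee #%d is..." % emp_id)
    name = await query_name(emp_id)
    await send(client, "Employee #%d is %s." % (emp_id, name))

async def main():
    # Two requests in flight at once, in a single thread.
    await asyncio.gather(handle("a", 12345), handle("b", 67890))

asyncio.run(main())
```

While one request is waiting on its query, the loop is free to make
progress on the other, which is the whole point of not blocking.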

ChrisA