Question about asyncio and blocking operations

Sat Jan 23 09:38:08 EST 2016

Hi all

I am developing a typical accounting/business application which involves a 
front-end allowing clients to access the system, a back-end connecting to a 
database, and a middle layer that glues it all together.

Some time ago I converted the front-end from a multi-threaded approach to an 
asyncio approach. It was surprisingly easy, and did not require me to delve 
into asyncio too deeply.

There was one aspect that I deliberately ignored at that stage. I did not 
change the database access to an asyncio approach, so all reading 
from/writing to the database involved a blocking operation. I am now ready 
to tackle that.

I find I am bumping my head more than I expected, so I thought I would try 
to get some feedback here to see if I have some flaw in my approach, or if 
it is just in the nature of writing an asynchronous-style application.

Here is the difficulty. The recommended way to handle a blocking operation 
is to run it as a task in a different thread, using run_in_executor(). The 
call has to be awaited. An implication of this is that any method that calls 
it must also be a coroutine, so I end up with a chain of coroutines 
stretching all the way back to the initial event that triggered it. I can 
understand why this is necessary, but it does lead to some awkward 
programming.
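
For illustration, here is a minimal sketch of the chain described above. The names blocking_query(), fetch() and handle_request() are hypothetical stand-ins, not taken from the application, and the sketch assumes a current Python (asyncio.run() and asyncio.get_running_loop() need 3.7 or later; on 3.5 one would use loop.run_until_complete() instead).

```python
import asyncio
import time


def blocking_query(company):
    # Hypothetical stand-in for a blocking database call.
    time.sleep(0.01)
    return {"company": company}


async def fetch(company):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_query, company)


async def handle_request(company):
    # Every caller up the chain has to be a coroutine so that it can await.
    return await fetch(company)


result = asyncio.run(handle_request("acme"))
```

Only the innermost function touches the executor, but each function above it still has to be declared with 'async def' and awaited.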

I use a cache to store frequently used objects, but I wait for the first 
request before I actually retrieve it from the database. This is how it 
worked -

# cache of database objects for each company
class DbObjects(dict):
    def __missing__(self, company):
        # first request for this company: read it from the database and cache it
        db_object = self[company] = get_db_object_from_database()
        return db_object
db_objects = DbObjects()

Any function could ask for db_cache.db_objects[company]. The first time, it 
would be read from the database; on subsequent requests it would be returned 
from the dictionary.

Now get_db_object_from_database() is a coroutine, so I have to change it to
        db_object = self[company] = await get_db_object_from_database()

But that is not allowed, because __missing__() is not a coroutine.

I fixed it by replacing the cache with a function -

# cache of database objects for each company
db_objects = {}
async def get_db_object(company):
    if company not in db_objects:
        db_object = db_objects[company] = await get_db_object_from_database()
    return db_objects[company]

Now the calling functions have to call 'await 
db_cache.get_db_object(company)'

Ok, once I had made the change it did not feel so bad.
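
One thing to watch with a cache function like this: if two requests for the same company arrive before the first database read has finished, both will miss the cache and both will hit the database. A possible variation, sketched below under assumptions, stores the pending task in the dictionary instead of the finished object. The body of get_db_object_from_database() and its 'company' argument are hypothetical stand-ins, and the 'calls' list exists only to count simulated reads.

```python
import asyncio

calls = []  # records each simulated database read


async def get_db_object_from_database(company):
    # Hypothetical stand-in for the real database read.
    calls.append(company)
    await asyncio.sleep(0.01)
    return {"company": company}


# cache of pending or finished tasks, one per company
db_objects = {}


async def get_db_object(company):
    if company not in db_objects:
        # Store the task before the first await, so that a concurrent
        # caller finds it in the cache instead of starting a second read.
        db_objects[company] = asyncio.ensure_future(
            get_db_object_from_database(company))
    return await db_objects[company]


async def main():
    # Two overlapping requests for the same company.
    return await asyncio.gather(get_db_object("acme"), get_db_object("acme"))


first, second = asyncio.run(main())
```

A caveat of this sketch is that a failed read stays in the cache as a failed task, so a real version would probably need to remove the entry when an exception occurs.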

Now I have another problem. I have some classes which retrieve some data 
from the database during their __init__() method. I find that it is not 
allowed to await a coroutine inside __init__(), and it is not allowed to turn 
__init__() into a coroutine.

I imagine that I will have to split __init__() into two parts, put the 
database functionality into a separately-callable method, and then go 
through my app to find all occurrences of instantiating the object and 
follow it with an explicit call to the new method.
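
A common variation on this split is to hide the two steps behind an async factory method, so that callers make one awaited call instead of instantiating and then calling a second method. The sketch below is only an illustration under assumptions: the class Ledger, the create() method and load_rows() are hypothetical names, not part of the application.

```python
import asyncio


async def load_rows(table_name):
    # Hypothetical stand-in for an awaitable database read.
    await asyncio.sleep(0)
    return ["row for " + table_name]


class Ledger:
    def __init__(self, table_name, rows):
        # __init__ stays synchronous and only stores data already loaded.
        self.table_name = table_name
        self.rows = rows

    @classmethod
    async def create(cls, table_name):
        # Do the awaiting here, then build the instance the normal way.
        rows = await load_rows(table_name)
        return cls(table_name, rows)


ledger = asyncio.run(Ledger.create("invoices"))
```

Inside a coroutine the call site becomes 'obj = await Ledger.create(...)', which still means visiting every place where the object is instantiated, but each of them changes in only one place.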

Again, I can handle that without too much difficulty. But at this stage I do 
not know what other problems I am going to face, and how easy they will be 
to fix.

So I thought I would ask here if anyone has been through a similar exercise, 
and if what I am going through sounds normal, or if I am doing something 
fundamentally wrong.

Thanks for any input

Frank Millman