Question about asyncio and blocking operations

Chris Angelico rosuav at gmail.com
Sat Jan 23 10:10:14 EST 2016


On Sun, Jan 24, 2016 at 1:38 AM, Frank Millman <frank at chagford.com> wrote:
> I find I am bumping my head more than I expected, so I thought I would try
> to get some feedback here to see if I have some flaw in my approach, or if
> it is just in the nature of writing an asynchronous-style application.

I don't have a lot of experience with Python's async/await as such,
but I've written asynchronous apps using a variety of systems (and
also written threaded ones many times), so I'll answer questions on
the basis of design principles that were passed down to me through the
generations.

> I use a cache to store frequently used objects, but I wait for the first
> request before I actually retrieve it from the database. This is how it
> worked -
>
> # cache of database objects for each company
> class DbObjects(dict):
>    def __missing__(self, company):
>        db_object = self[company] = get_db_object_from_database()
>        return db_object
> db_objects = DbObjects()
>
> Any function could ask for db_cache.db_objects[company]. The first time it
> would be read from the database, on subsequent requests it would be returned
> from the dictionary.
>
> Now get_db_object_from_database() is a coroutine, so I have to change it to
>        db_object = self[company] = await get_db_object_from_database()
>
> But that is not allowed, because __missing__() is not a coroutine.
>
> I fixed it by replacing the cache with a function -
>
> # cache of database objects for each company
> db_objects = {}
> async def get_db_object(company):
>    if company not in db_objects:
>        db_object = db_objects[company] = await get_db_object_from_database()
>    return db_objects[company]
>
> Now the calling functions have to call 'await
> db_cache.get_db_object(company)'
>
> Ok, once I had made the change it did not feel so bad.

I would prefer the function call anyway. Subscripting a dictionary is
fine for something that's fairly cheap, but if it's potentially hugely
expensive, I'd rather see it spelled as a function call. There's
plenty of precedent for caching function calls so only the first one
is expensive.

> Now I have another problem. I have some classes which retrieve some data
> from the database during their __init__() method. I find that it is not
> allowed to call a coroutine from __init__(), and it is not allowed to turn
> __init__() into a coroutine.
>
> I imagine that I will have to split __init__() into two parts, put the
> database functionality into a separately-callable method, and then go
> through my app to find all occurrences of instantiating the object and
> follow it with an explicit call to the new method.
>
> Again, I can handle that without too much difficulty. But at this stage I do
> not know what other problems I am going to face, and how easy they will be
> to fix.

The question here is: Until you get that data from the database, what
state would the object be in? There are two basic options:

1) If the object is somewhat usable and meaningful, divide
initialization into two parts - one that sets up the object itself
(__init__) and one that fetches stuff from the database. If you can,
trigger the database fetch in __init__ so it's potentially partly done
when you come to wait for it.

2) If the object would be completely useless, use an awaitable factory
function instead. Rather than constructing an object, you ask an
asynchronous procedure to give you an object. It's a subtle change,
and by carefully managing the naming, you could make it almost
transparent in your code:

# Old way:
class User:
    def __init__(self, domain, name):
        self.id = blocking_database_call("get user", domain, name)
# And used thus:
me = User("example.com", "rosuav")

# New way:
class User:
    def __init__(self, id):
        self.id = id
_User = User
async def User(domain, name):
    id = await async_database_call("get user", domain, name)
    return _User(id)
# And used thus:
me = await User("example.com", "rosuav")
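
Option 1 could be sketched along the same lines. This is only one way
to arrange it, under a few assumptions: async_database_call is a
stand-in for the real coroutine, the ready() method name is made up for
the example, and the object has to be constructed while an event loop
is running so that __init__ can schedule the fetch.

```python
import asyncio

# Hypothetical stand-in for the real database coroutine.
async def async_database_call(query, domain, name):
    await asyncio.sleep(0.01)  # simulate I/O latency
    return "{}@{}".format(name, domain)

class User:
    def __init__(self, domain, name):
        # These attributes are usable straight away, no database needed.
        self.domain = domain
        self.name = name
        self.id = None  # filled in once the fetch completes
        # Kick off the fetch now; this requires a running event loop.
        self._fetch = asyncio.ensure_future(
            async_database_call("get user", domain, name))

    async def ready(self):
        """Wait for the database fetch to finish, then return self."""
        self.id = await self._fetch
        return self

async def main():
    me = User("example.com", "rosuav")  # fetch starts here
    # ... other work can run here while the fetch is in flight ...
    await me.ready()
    return me
```

The trade-off against the factory function is that self.id is None
until ready() has been awaited, so this only makes sense if the rest of
the object is useful in the meantime.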

> So I thought I would ask here if anyone has been through a similar exercise,
> and if what I am going through sounds normal, or if I am doing something
> fundamentally wrong.

I think this looks pretty much right. There are some small things you
can do to make it look a bit easier, but it's minor.

ChrisA


