Using asyncio workers in a `concurrent.futures` interface

Marko Rauhamaa marko at pacujo.net
Tue Aug 12 14:31:02 EDT 2014


cool-RR <ram.rachum at gmail.com>:

> If I understand correctly [asyncio] would let me run multiple uploads
> and downloads efficiently in one thread, which would conserve more
> resources than using threads.

Asyncio does make it convenient to multiplex events on one or more
threads. Threads have their uses (exploiting multiple CPUs), but you
shouldn't associate a thread with every state machine, IMO. Asyncio
allows you to separate your state machines from your threads. For
example, you might have 1,000 state machines (for 1,000 connections) but
only 8 threads for 4 CPUs.
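
To make that concrete, here's a toy sketch (illustrative only, in the
generator-based coroutine style of Python 3.4; connection_state_machine
is a made-up stand-in for real per-connection logic):

    import asyncio

    @asyncio.coroutine
    def connection_state_machine(conn_id):
        # One "state machine" per connection; none of them owns a thread.
        yield from asyncio.sleep(0.01)    # stand-in for real network I/O
        return conn_id

    loop = asyncio.get_event_loop()
    # 1,000 state machines multiplexed on one thread's event loop; a bigger
    # setup could partition them across a few loops, one per thread.
    loop.run_until_complete(
        asyncio.wait([connection_state_machine(i) for i in range(1000)]))
    loop.close()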

> Now, I am a little clueless about the whole way it's built, using
> coroutines and tricky usage of `yield from`.

Asyncio actively supports (at least) *two* multiplexing models:
callbacks (aka listeners or handlers) and coroutines. Programming with
callbacks involves storing the state explicitly in a state machine
object. The callbacks must never block; they are supposed to return
immediately. That model is a long-time favorite of many, including me.
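
Schematically, the callback style looks something like this with
asyncio's Protocol interface (a bare sketch, not production code; the
state lives explicitly on the object and every callback returns right
away):

    import asyncio

    class EchoStateMachine(asyncio.Protocol):
        def __init__(self):
            self.state = "IDLE"               # explicit state

        def connection_made(self, transport):
            self.state = "CONNECTED"
            self.transport = transport

        def data_received(self, data):
            self.transport.write(data)        # react and return immediately

        def connection_lost(self, exc):
            self.state = "CLOSED"

The loop instantiates one such object per connection when the class is
passed as the protocol factory to loop.create_server().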

The coroutine model is highly analogous to the multithreading model in
that you store the state implicitly in the code. Multithreading marks
the state with blocking function calls. Coroutines mark the state with
"yield from" statements. Otherwise, a multithreading implementation has
very much the same shape as a coroutine implementation.

The funky aspect of the coroutines is the way they "abuse" the "yield
from" statement, whose original purpose is to pass a series of results
from a generator to the caller by chaining lower-level generators.
Coroutines employ a "trick": the "yield from" statement does not pass
any meaningful results. Instead, the statement is used to make the
generator/coroutine pseudoblock and switch context within the same
thread.
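
A toy illustration of that pseudoblocking (again in the generator-based
style; nothing here is specific to any real protocol):

    import asyncio

    @asyncio.coroutine
    def worker(name):
        # The state is implicit in the code position; the "yield from"
        # below pseudoblocks and lets the loop run other coroutines.
        yield from asyncio.sleep(1)
        return "%s woke up" % name

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(worker("a")))
    loop.close()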

There's one crucial advantage coroutines have over threads: you can
multiplex events. If your thread is blocked on, say, reading a byte, you
can't tell it to stop waiting and do something else instead. Coroutines
can be made to wait on alternative stimuli.
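
For instance, a coroutine can wake up on whichever of two stimuli fires
first (a made-up sketch; the futures stand in for real events):

    import asyncio

    @asyncio.coroutine
    def wait_for_either(loop):
        data = asyncio.Future()    # e.g. bytes arriving on a socket
        nag = asyncio.Future()     # e.g. a supervisor telling us to move on
        loop.call_later(0.1, nag.set_result, "nag")
        # Wake up on whichever stimulus fires first; a thread blocked in
        # a read() could not be redirected like this.
        done, pending = yield from asyncio.wait(
            [data, nag], return_when=asyncio.FIRST_COMPLETED)
        for future in pending:
            future.cancel()
        return [f.result() for f in done]

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(wait_for_either(loop)))
    loop.close()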

> I looked at the asyncio documentation page and saw that it does
> mention futures and executors, which is my favorite interface for
> doing concurrency.

They are there.

My favorite model is the Actor Model, where objects communicate with
each other and the outside world through asynchronous stimuli of sorts.
The thinking goes: something happened, so how do I react to it? The
actor model just needs a class with a member "self.state", which
contains the name of the internal state of the object. The object's
callback methods then receive the inputs, send out messages and adjust
the state.
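
In bare-bones form, the shape is something like this (purely schematic;
the names are invented):

    class Session:
        def __init__(self, send):
            self.state = "IDLE"    # the name of the internal state
            self.send = send       # outgoing messages go through this callable

        def handle_connect(self):
            # An input arrives: send messages out and adjust the state.
            self.state = "CONNECTED"
            self.send("hello")

        def handle_data(self, data):
            if self.state == "CONNECTED":
                self.send("ack " + data)

        def handle_close(self):
            self.state = "CLOSED"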

Both the actor model and the coroutines produce somewhat messy code
that is somewhat hard to get right. However, that's a reflection of how
messy reality is. Any attempt to pave it over will ultimately result in
more trouble.

>     download_file = lambda url: requests.get(url).content
>     urls = ['http://google.com/file1.jpg',
>             'http://google.com/file2.jpg',
>             'http://google.com/file3.jpg']  # etc.
>
>     with AsyncIOExecutor() as asyncio_executor:
>         files = asyncio_executor.map(download_file, urls)
>
> And that's it, no coroutines, no `yield from`.

I'm not quite following you. However, one feature of the coroutines is
that you must use "yield from" everywhere, all the way up the call
chain. You can't delegate it to a subroutine and forget about it.

That's my main problem with coroutines. The simple function call syntax:

    y = f(x)

is replaced with the weird:

    y = yield from f(x)
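
And it propagates: every caller up the chain has to be a coroutine and
repeat the "yield from". A contrived sketch:

    import asyncio

    @asyncio.coroutine
    def f(x):
        yield from asyncio.sleep(0)   # pseudoblocks somewhere down here
        return x + 1

    def g(x):
        return f(x)                   # wrong: returns a generator, not a value

    @asyncio.coroutine
    def h(x):
        y = yield from f(x)           # every caller needs its own "yield from"
        return y

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(h(1)))      # prints 2
    loop.close()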

> Since, if I understand correctly, asyncio requires a mainloop, it
> would make sense for the AsyncIOExecutor to have a thread of its own
> in which it could run its mainloop.
>
> Is this possible? Did someone implement this? 

Unfortunately, I have not. I have just implemented a toy example to
satisfy my curiosity. All of my code uses callbacks and select.epoll()
-- to great success.
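
The general shape of that callback/epoll style is roughly the following
(a minimal, Linux-only echo-server sketch for illustration; not taken
from any real code):

    import select
    import socket

    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 12345))
    listener.listen(5)
    listener.setblocking(False)

    epoll = select.epoll()
    epoll.register(listener.fileno(), select.EPOLLIN)
    handlers = {}                  # fd -> callback; explicit state lives here

    def accept_ready():
        conn, _ = listener.accept()
        conn.setblocking(False)
        fd = conn.fileno()
        epoll.register(fd, select.EPOLLIN)

        def echo_ready():
            data = conn.recv(4096)
            if data:
                conn.send(data)    # echo back and return immediately
            else:                  # peer closed: clean up
                epoll.unregister(fd)
                del handlers[fd]
                conn.close()

        handlers[fd] = echo_ready

    handlers[listener.fileno()] = accept_ready

    while True:
        for fd, _event in epoll.poll():
            handlers[fd]()         # callbacks never block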

Here's my exploration in the classic dining philosophers problem:

   <URL: http://pacujo.net/marko/philosophers.py>

I have "fixed" the faulty protocol with an assistant who breaks the
deadlock by occasionally nagging the philosophers to drop everything and
get back to thinking.

The main accomplishment of the exercise was convincing myself that
coroutines can be used for serious things, since coroutines can
multiplex stimuli with asyncio.wait(..., return_when=asyncio.FIRST_COMPLETED).


Marko


