[Python-ideas] Async API: some code to review

Thu Nov 1 17:44:45 CET 2012

Guido van Rossum wrote:
> On Wed, Oct 31, 2012 at 3:36 PM, Steve Dower <Steve.Dower at microsoft.com> wrote:
>> Guido van Rossum wrote:
>> There is only one reason to use 'yield from' and that is for the performance
>> optimisation, which I do acknowledge and did observe in my own benchmarks.
> 
> Actually, it is not just optimization. The logic of the scheduler also becomes
> much simpler.

I'd argue that it doesn't; it just happens that the interpreter's implementation of 'yield from' matches the most common case. In any case, the affected area of code (which I haven't been calling 'scheduler', which seems to have caused some confusion elsewhere) only has to be written once and never touched again. It could even be migrated into C, which should significantly improve the performance. (In wattle, this is the _Awaiter class.)

>> I know I've been vague about our intended application (deliberately so, to try
>> and keep the discussion neutral), but I'll lay out some details.
> 
> Actually I wish you'd written this sooner. I don't know about you, but my brain
> has a hard time understanding abstractions that are presented without concrete
> use cases and implementations alongside; OTOH I delight in taking a concrete
> mess and extract abstractions from it.
> (The Twisted guys are also masters at this.)
> 
> So far I didn't really "get" the reasons you brought up for some of
> complications you introduced (like multiple Future implementations).
> Now I think I'm glimpsing your reasons.

Part of the art of conversation is figuring out how the other participants need to hear something. My apologies for not figuring this out sooner :)

>> We're working on adding support for Windows 8 apps (formerly known as Metro)
>> written in Python. These will use the new API (WinRT) which is highly
>> asynchronous - even operations such as opening a file are only* available as an
>> asynchronous function. The intention is to never block on the UI thread.
> 
> Interesting. The lack of synchronous wrappers does seem a step back, but is
> probably useful as a forcing function given the desire to keep the UI responsive
> at all times.

Indeed. Based on the Win 8 apps I regularly use, it's worked well. On the other hand, updating CPython to avoid the synchronous ones (which I've done, and will be submitting for consideration soon, once I've been able to test on an ARM device) is less fun.

>> (* Some synchronous Win32 APIs are still available from C++, but these
>> are actively discouraged and restricted in many ways. Most of Win32 is
>> not usable.)
>>
>> The model used for these async APIs is future-based: every *Async() function
>> returns a future for a task that is already running. The caller is not allowed
>> to wait on this future - the only option is to attach a callback. C# and VB use
>> their async/await keywords (good 8 min intro video on those:
>> http://www.visualstudiolaunch.com/vs2012vle/Theater?sid=1778) while JavaScript
>> and C++ have multi-line lambda support.
> 
> Erik Meijer introduced me to async/await on Elba two months ago. I was very
> excited to recognize exactly what I'd done for NDB with @tasklet and yield,
> supported by the type checking.
>
>> For Python, we are aiming for closer to the async/await model (which is also
>> how we chose the names).
> 
> If we weren't so reluctant to introduce new keywords in Python we might
> introduce await as an alias for yield from in the future.

We discussed that internally and decided that it was unnecessary, or at least that it should be a proper keyword rather than an alias (as in, you can't use 'await' to delegate to a subgenerator). I'd rather see codef added first, since that (could) remove the need for the decorators.

>> Incidentally, our early designs used yield from exclusively. It was only when
>> we started discovering edge-cases where things broke, as well as the impact on
>> code 'cleanliness', that we switched to yield.
> 
> Very interesting. I'd love to see a much longer narrative on this.
> (You can send it to me directly if you feel it would distract the list or if you
> feel it's inappropriate to share widely. I'll keep it under my hat as long as
> you say so.)

If I get a chance to write something up then I will do that. I'll quite happily post it publicly, though it may go on my blog rather than here - this email is going to be long enough already. There is very little already written up since we discussed most of it at a whiteboard, though I do still have some early code iterations.

>> There are three aspects of this that work better and result in cleaner code
>> with wattle than with tulip:
>>
>> - event handlers can be "async-void", such that when the event is raised by
>> the OS/GUI/device/whatever the handler can use asynchronous tasks without
>> blocking the main thread.
> 
> I think this is "fire-and-forget"? I.e. you initiate an action and then just let
> it run until completion without ever checking the result? In tulip you currently
> do that by wrapping it in a Task and calling its start() method. (BTW I think
> I'm going to get rid of start() -- creating a Task should just start it.)

Yes, exactly. The only thing I dislike about tulip's current approach is that it requires two functions. If/when we support it, we'd provide a decorator that does the wrapping.

>> In this case, the caller receives a future but ignores it because it
>> does not care about the final result. (We could achieve this under
>> 'yield from' by requiring a decorator, which would then probably
>> prevent other Python code from calling the handler directly. There is
>> very limited opportunity for us to reliably intercept this case.)
> 
> Are you saying that this property (you don't wait for the result) is required by
> the operation rather than an option for the user? I'm only familiar with the
> latter -- e.g. I can imagine firing off an operation that writes a log entry
> somewhere but not caring about whether it succeeded -- but I would still make it
> *possible* to check on the operation if the caller cares (what if it's a very
> important log message?).
> 
> If there's no option for the caller, the API should present itself as a regular
> function/method and the task-spawning part should be hidden inside it -- I see
> no need for the caller to know about this.
>
> What exactly do you mean by "reliably intercept this case" ? A concrete example
> would help.

You're exactly right, there is no need for the original caller (for example, Windows itself) to know about the task. However, every incoming call initially comes through a COM interface that we provide (written in C) that will then invoke the Python function. This is our opportunity to intercept by looking at the returned value from the Python function before returning to the original caller.

Under wattle, we can type-check here for a Future (or compatible interface), which is only ever used for async functions. On the other hand, we cannot reliably type-check for a generator to determine whether it is supposed to be async or supposed to be an iterator.

If the interface we implement expects an iterator then we can assume that we should treat the generator like that. However, if the user intended their code to be async and used 'yield from' with no decorator, we cannot provide any useful feedback: they will simply return a sequence of null pointers that is executed as quickly as the caller wants to - there is no scheduler involved in this case.
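
To illustrate the ambiguity, here is a minimal sketch. It uses concurrent.futures.Future as a stand-in for a wattle Future, and classify_result is a hypothetical name for the kind of check the interop layer could perform on a returned value; none of this is the actual wattle or FFI code.

```python
import inspect
from concurrent.futures import Future  # stand-in for a wattle Future (assumption)


def classify_result(value):
    """Hypothetical boundary check on what a Python callee returned."""
    if isinstance(value, Future):
        return 'async'       # unambiguous: only async functions return a Future
    if inspect.isgenerator(value):
        return 'ambiguous'   # an iterator, or an undecorated 'yield from' coroutine?
    return 'plain'


def iterate_items():
    yield 'a'


def helper():
    yield                    # placeholder for "wait for something"
    return 'data'


def undecorated_async():
    data = yield from helper()
    return data


print(classify_result(Future()))             # async
print(classify_result(iterate_items()))      # ambiguous
print(classify_result(undecorated_async()))  # ambiguous
```

Both generator cases look identical from the outside, which is the whole problem: the type of the returned object carries no hint of the author's intent.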

>> - the event loop is implemented by the OS. Our Scheduler implementation does
>> not need to provide an event loop, since we can submit() calls to the OS-level
>> loop. This pattern also allows wattle to 'sit on top of' any other event loop,
>> probably including Twisted and 0MQ, though I have not tried it (except with
>> Tcl).
> 
> Ok, so what is the API offered by the OS event loop? I really want to make sure
> that tulip can interface with strange event loops, and this may be the most
> concrete example so far -- and it may be an important one.

There are three main APIs involved:

* Windows.UI.Core.CoreDispatcher.run_async() (and run_idle_async(), which uses a low priority)
* Windows.System.Threading.ThreadPool.run_async()
* any API that returns a future (==an object implementing IAsyncInfo)

Strictly, the third category covers the first two, since they both return a future, but they are also the APIs that allow the user/developer to schedule work on or off the UI thread (respectively).

For wattle, they equate directly to Scheduler.submit, Scheduler.thread_pool.submit (which wasn't in the code, but was suggested in the write-up) and Future. 

>> - Future objects can be marshalled directly from Python into Windows,
>> completing the interop story.
> 
> What do you mean by marshalled here? Surely not the stdlib marshal module.

No.

> Do you just mean that Future objects can be recognized by the foreign-function
> interface and wrapped by / copied into native Windows 8 datatypes?

Yes, this is exactly what we would do. The FFI creates a WinRT object that forwards calls between Python and Windows as necessary. (This is a general mechanism that we use for many types, so it doesn't matter how the Future is created. On a related note, returning a Future from Python code into Windows will not be a common occurrence - it is far more common for Python to consume Futures that are passed in.)

> I understand your event loop understands Futures? All of them? Or only the ones
> of the specific type that it also returns?

It's based on an interface, so as long as we can provide (equivalents of) add_done_callback() and result() then the FFI will do the rest.
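
As a rough sketch of what "based on an interface" means here: the two helper names below (looks_like_future, forward_result) are hypothetical, and concurrent.futures.Future is used only because it happens to provide both methods. The real FFI is written in C and will differ.

```python
from concurrent.futures import Future  # one object that fits the interface


def looks_like_future(obj):
    """Hypothetical check: does obj offer the two calls the FFI relies on?"""
    return (callable(getattr(obj, 'add_done_callback', None))
            and callable(getattr(obj, 'result', None)))


def forward_result(obj, on_done):
    """Call on_done(value) once obj completes, using only the interface."""
    if not looks_like_future(obj):
        raise TypeError('object does not implement the Future interface')
    # The callback receives the completed future; pull the value out of it.
    obj.add_done_callback(lambda f: on_done(f.result()))


received = []
fut = Future()
forward_result(fut, received.append)
fut.set_result('hello')
print(received)   # ['hello']
```

Nothing in forward_result depends on the concrete class, so any Future implementation with these two methods could be passed across.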

>> Even with tulip, we would probably still require a decorator for this case so
>> that we can marshal regular generators as iterables (for which there is a
>> specific type).
> 
> I can't quite follow you here, probably due to lack of imagination on my part.
> Can you help me with a (somewhat) concrete example?

Given a (Windows) prototype:

IIterable<String> GetItems();

We want to allow the Python function to be implemented as:

def get_items():
    for data in ['a', 'b', 'c']:
        yield data

This is a pretty straightforward mapping: Python returns a generator, which supports the same interface as IIterable, so we can marshal the object out and convert each element to a string.

The problem is when a (possibly too keen) user writes the following code:

def get_items():
    data = yield from get_data_async()
    return data

Now the returned generator is full of None, which we will happily convert to a sequence of empty strings (==null pointers in Win8). With wattle, the yielded objects would be Futures, which would still be converted to strings, but at least are obviously incorrect. Also, since the user should be in the habit of adding @async already, we can raise an error even earlier when the return value is a future and not a generator.
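
A small self-contained demonstration of the failure, assuming get_data_async() is a tulip-style coroutine that suspends with a bare 'yield' (the body here is a stand-in, not real I/O):

```python
def get_data_async():
    # Stand-in coroutine: the bare 'yield' is where a scheduler would
    # normally take over and resume us later.
    yield
    return ['a', 'b', 'c']


def get_items():
    data = yield from get_data_async()
    return data


# Treat the generator as a plain iterable, as a marshaller would:
items = list(get_items())
print(items)   # [None]
```

The real data ends up in the StopIteration value, which plain iteration discards; the only thing the consumer sees is the suspension point.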

Unfortunately, nothing can fix this code (except maybe a new keyword):

def get_items():
    data = yield from get_data_async()
    for item in data:
        yield item 
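
Iterating this version by hand shows why it cannot be rescued. Again get_data_async() is a stand-in that suspends once with a bare 'yield':

```python
def get_data_async():
    yield                      # stand-in suspension point (assumption)
    return ['a', 'b', 'c']


def get_items():
    data = yield from get_data_async()
    for item in data:
        yield item


print(list(get_items()))   # [None, 'a', 'b', 'c']
```

The suspension point and the real items travel through the same channel, so a consumer has no way to tell which yielded values are results and which are requests to wait.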

>> Without a decorator, we would probably have to ban both cases to prevent
>> subtly misbehaving programs.
> 
> Concrete example?

Given above. By banning both cases we would always raise TypeError when a generator is returned, even if an iterable or an async operation is expected, because we can't be sure which one we have.

>> At least with wattle, the user does not have to do anything different from any
>> of their other @async functions.
> 
> This is because you can put type checks inside @async, which sees the function
> object before it's called, rather than the scheduler, which only sees what it
> returned, right? That's a trick I use in NDB as well and I think tulip will end
> up requiring a decorator too -- but it will just "mark" the function rather than
> wrap it in another one, unless the function is not a generator (in which case it
> will probably have to wrap it in something that is a generator). I could imagine
> a debug version of the decorator that added wrappers in all cases though.

It's not so much the type checks inside @async - those are basically to support non-generator functions being wrapped (though there is little benefit to this apart from maintaining a consistent interface). The benefit is that the _returned object_ is always going to be some sort of Future. 

Because of the way that our FFI will work, a simple marker on the function would be sufficient for our interop purposes. However, I don't think it is a general enough solution (for example, if the caller is already in Python then they may not get to see the function before it is called - Twisted might be affected by this, though I'm not sure).
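
A rough sketch of the difference, with entirely hypothetical names (async_marker, call_from_ffi and the _is_async attribute are illustrations, not tulip or wattle API):

```python
import inspect


def async_marker(func):
    """Hypothetical marking decorator: tags the function, does not wrap it."""
    func._is_async = True
    return func


def call_from_ffi(func, *args):
    """Hypothetical boundary that holds the function object before calling it."""
    result = func(*args)
    if inspect.isgenerator(result):
        kind = 'coroutine' if getattr(func, '_is_async', False) else 'iterable'
        return kind, result
    return 'value', result


@async_marker
def load():
    yield
    return 'data'


def items():
    yield 'a'


print(call_from_ffi(load)[0])    # coroutine
print(call_from_ffi(items)[0])   # iterable

# A Python caller that is only handed the generator cannot see the marker:
gen = load()
print(hasattr(gen, '_is_async'))   # False
```

The marker works only for a caller that still has the function in hand; once the call has happened, the generator object carries no trace of it.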

What might work best is allowing the replacement scheduler/pollster to provide or override the decorator somehow, though I don't see any convenient way to do this.

>> Despite this intended application, I have tried to approach this design task
>> independently to produce an API that will work for many cases, especially given
>> the narrow focus on sockets. If people decide to get hung up on "the Microsoft
>> way" or similar rubbish then I will feel vindicated for not mentioning it
>> earlier :-) - it has not had any more influence on wattle than any of my other
>> past experience has.
> 
> No worries about that. I agree that we need concrete examples that takes us
> beyond the world of sockets; it's just that sockets are where most of the
> interest lies (Tornado is a webserver, Twisted is often admired because of its
> implementations of many internet protocols, people benchmark async frameworks on
> how many HTTP requests per second they can serve) and I haven't worked with any
> type of GUI framework in a very long time. (Kudos for trying your way Tk!)

I don't blame you for avoiding GUI frameworks... there are very few that work well. Hopefully when we fully support XAML-based GUIs that will change somewhat, at least for Windows developers.

Also, I didn't include the Tk scheduler in BitBucket, but just to illustrate the simplicity of wrapping an existing loop I've posted the full code below (it still has some old names in it):

import contexts

class TkContext(contexts.CallableContext):
    def __init__(self, app):
        self.app = app

    @staticmethod
    def invoke(callable, args, kwargs):
        callable(*args, **kwargs)

    def submit(self, callable, *args, **kwargs):
        '''Adds a callable to invoke within this context.'''
        self.app.after(0, TkContext.invoke, callable, args, kwargs)

Cheers,
Steve