PEP 492 -- Coroutines with async and await syntax

PEP: 492
Title: Coroutines with async and await syntax
Author: Yury Selivanov <yury at magic.io>
Discussions-To: < python-dev at python.org >
Status: Final
Type: Standards Track
Created: 09-Apr-2015
Python-Version: 3.5
Post-History: 17-Apr-2015, 21-Apr-2015, 27-Apr-2015, 29-Apr-2015, 05-May-2015

Abstract

The growth of Internet and general connectivity has triggered the proportionate need for responsive and scalable code. This proposal aims to answer that need by making writing explicitly asynchronous, concurrent Python code easier and more Pythonic.

It is proposed to make coroutines a proper standalone concept in Python, and introduce new supporting syntax. The ultimate goal is to help establish a common, easily approachable, mental model of asynchronous programming in Python and make it as close to synchronous programming as possible.

This PEP assumes that the asynchronous tasks are scheduled and coordinated by an Event Loop similar to that of the stdlib module asyncio.events.AbstractEventLoop. While the PEP is not tied to any specific Event Loop implementation, it is relevant only to the kind of coroutine that uses yield as a signal to the scheduler, indicating that the coroutine will be waiting until an event (such as IO) is completed.

We believe that the changes proposed here will help keep Python relevant and competitive in a quickly growing area of asynchronous programming, as many other languages have adopted, or are planning to adopt, similar features: [2] , [5] , [6] , [7] , [8] , [10] .

API Design and Implementation Revisions

  1. Feedback on the initial beta release of Python 3.5 resulted in a redesign of the object model supporting this PEP to more clearly separate native coroutines from generators - rather than being a new kind of generator, native coroutines are now their own completely distinct type (implemented in [17] ).

    This change was implemented primarily due to problems encountered attempting to integrate support for native coroutines into the Tornado web server (reported in [18]).

  2. In CPython 3.5.2, the __aiter__ protocol was updated.

    Before 3.5.2, __aiter__ was expected to return an awaitable resolving to an asynchronous iterator . Starting with 3.5.2, __aiter__ should return asynchronous iterators directly.

    If the old protocol is used in 3.5.2, Python will raise a PendingDeprecationWarning .

    In CPython 3.6, the old __aiter__ protocol will still be supported with a DeprecationWarning being raised.

    In CPython 3.7, the old __aiter__ protocol will no longer be supported: a RuntimeError will be raised if __aiter__ returns anything but an asynchronous iterator.

    See [19] and [20] for more details.

Rationale and Goals

Current Python supports implementing coroutines via generators ( PEP 342 ), further enhanced by the yield from syntax introduced in PEP 380 . This approach has a number of shortcomings:

  • It is easy to confuse coroutines with regular generators, since they share the same syntax; this is especially true for new developers.
  • Whether or not a function is a coroutine is determined by the presence of yield or yield from statements in its body, which can lead to non-obvious errors when such statements appear in or disappear from the function body during refactoring.
  • Support for asynchronous calls is limited to expressions where yield is allowed syntactically, limiting the usefulness of syntactic features, such as with and for statements.

This proposal makes coroutines a native Python language feature, and clearly separates them from generators. This removes generator/coroutine ambiguity, and makes it possible to reliably define coroutines without reliance on a specific library. This also enables linters and IDEs to improve static code analysis and refactoring.

Native coroutines and the associated new syntax features make it possible to define context manager and iteration protocols in asynchronous terms. As shown later in this proposal, the new async with statement lets Python programs perform asynchronous calls when entering and exiting a runtime context, and the new async for statement makes it possible to perform asynchronous calls in iterators.

Specification

This proposal introduces new syntax and semantics to enhance coroutine support in Python.

This specification presumes knowledge of the implementation of coroutines in Python ( PEP 342 and PEP 380 ). Motivation for the syntax changes proposed here comes from the asyncio framework ( PEP 3156 ) and the "Cofunctions" proposal ( PEP 3152 , now rejected in favor of this specification).

From this point in this document we use the term native coroutine to refer to functions declared using the new syntax. The term generator-based coroutine is used where necessary to refer to coroutines that are based on generator syntax. The term coroutine is used in contexts where both definitions are applicable.

New Coroutine Declaration Syntax

The following new syntax is used to declare a native coroutine :

async def read_data(db):
    pass

Key properties of coroutines :

  • async def functions are always coroutines, even if they do not contain await expressions.
  • It is a SyntaxError to have yield or yield from expressions in an async function.
  • Internally, two new code object flags were introduced:
    • CO_COROUTINE is used to mark native coroutines (defined with new syntax).
    • CO_ITERABLE_COROUTINE is used to make generator-based coroutines compatible with native coroutines (set by types.coroutine() function).
  • Regular generators, when called, return a generator object ; similarly, coroutines return a coroutine object.
  • StopIteration exceptions are not propagated out of coroutines, and are replaced with a RuntimeError . For regular generators such behavior requires a future import (see PEP 479 ).
  • When a coroutine is garbage collected, a RuntimeWarning is raised if it was never awaited on (see also Debugging Features ).
  • See also Coroutine objects section.
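Several of the properties above can be observed with a short sketch. The function names and the dictionary standing in for a database are illustrative assumptions, not part of this PEP; the coroutines are driven by hand with send() so that no event loop is needed.

```python
import inspect


async def read_data(db):
    # No await inside, yet this is still a native coroutine function.
    return db["rows"]


async def leaky():
    # StopIteration must not escape a coroutine (PEP 479 semantics).
    raise StopIteration("spam")


def run_to_completion(coro):
    """Drive a coroutine that never suspends and return its result."""
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    raise AssertionError("coroutine suspended unexpectedly")


def demo():
    coro = read_data({"rows": [1, 2, 3]})
    # Calling the function returns a coroutine object; the body has not run yet.
    is_coroutine = inspect.iscoroutine(coro)
    # The code object carries the CO_COROUTINE flag.
    has_flag = bool(read_data.__code__.co_flags & inspect.CO_COROUTINE)
    rows = run_to_completion(coro)
    try:
        run_to_completion(leaky())
    except RuntimeError:
        replaced = True   # the StopIteration was replaced with a RuntimeError
    else:
        replaced = False
    return is_coroutine, has_flag, rows, replaced
```

Calling demo() exercises each point in turn: the call returns a coroutine object, the flag is set on the code object, the return value travels out through StopIteration.value, and a StopIteration raised inside the body surfaces as a RuntimeError.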

types.coroutine()

A new function coroutine(fn) is added to the types module. It allows interoperability between existing generator-based coroutines in asyncio and native coroutines introduced by this PEP:

@types.coroutine
def process_data(db):
    data = yield from read_data(db)
    ...

The function applies the CO_ITERABLE_COROUTINE flag to the generator function's code object, making it return a coroutine object.

If fn is not a generator function, it is wrapped. If it returns a generator, it will be wrapped in an awaitable proxy object (see the definition of awaitable objects below).

Note that the CO_COROUTINE flag is not applied by types.coroutine(), so that native coroutines defined with the new syntax can be told apart from generator-based coroutines.
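As a minimal sketch of this interoperability, the following code awaits a generator-based coroutine from a native one. The names suspend, native and drive are hypothetical, and the caller plays the role of the scheduler by calling send() directly.

```python
import inspect
import types


@types.coroutine
def suspend(value):
    # A bare yield hands `value` to whoever drives the coroutine chain
    # and receives whatever is sent back in.
    reply = yield value
    return reply


async def native():
    # The decorated generator function can be awaited like any awaitable.
    reply = await suspend("ping")
    return reply.upper()


def drive():
    coro = native()
    first = coro.send(None)      # runs until the yield inside suspend()
    try:
        coro.send("pong")        # resumes suspend(), then finishes native()
    except StopIteration as stop:
        return first, stop.value
    raise AssertionError("coroutine did not finish")


def flags():
    code_flags = suspend.__code__.co_flags
    return (
        bool(code_flags & inspect.CO_ITERABLE_COROUTINE),
        bool(code_flags & inspect.CO_COROUTINE),
    )
```

Here flags() reports that only CO_ITERABLE_COROUTINE is set on the decorated function, which is what keeps it distinguishable from a native coroutine.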

Await Expression

The following new await expression is used to obtain a result of coroutine execution:

async def read_data(db):
    data = await db.fetch('SELECT ...')
    ...

await, similarly to yield from, suspends execution of the read_data coroutine until the db.fetch awaitable completes and returns the result data.

It uses the yield from implementation with an extra step of validating its argument. await only accepts an awaitable , which can be one of:

  • A native coroutine object returned from a native coroutine function .

  • A generator-based coroutine object returned from a function decorated with types.coroutine() .

  • An object with an __await__ method returning an iterator.

    Any yield from chain of calls ends with a yield . This is a fundamental mechanism of how Futures are implemented. Since, internally, coroutines are a special kind of generators, every await is suspended by a yield somewhere down the chain of await calls (please refer to PEP 3156 for a detailed explanation).

    To enable this behavior for coroutines, a new magic method called __await__ is added. In asyncio, for instance, to enable Future objects in await statements, the only change is to add the line __await__ = __iter__ to the asyncio.Future class.

    Objects with __await__ method are called Future-like objects in the rest of this PEP.

    It is a TypeError if __await__ returns anything but an iterator.

  • Objects defined with CPython C API with a tp_as_async.am_await function, returning an iterator (similar to __await__ method).

It is a SyntaxError to use await outside of an async def function (just as it is a SyntaxError to use yield outside of a def function).

It is a TypeError to pass anything other than an awaitable object to an await expression.
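The rules for awaitables can be illustrated with a small sketch. The class Ready and both coroutine names are assumptions made up for this example; asyncio.run() is used only as a convenient way to drive the coroutines.

```python
import asyncio


class Ready:
    """A Future-like object: its __await__ method returns an iterator."""

    def __init__(self, value):
        self._value = value

    def __await__(self):
        # Delegating to an empty iterable yields nothing, so the awaiting
        # coroutine is never suspended; the return value becomes the result
        # of the await expression.
        yield from ()
        return self._value


async def fetch():
    return await Ready("data")


async def await_non_awaitable():
    try:
        await 42  # an int has no __await__ and is not a coroutine
    except TypeError:
        return "TypeError"
    return "no error"


def demo():
    return asyncio.run(fetch()), asyncio.run(await_non_awaitable())
```

Because __await__ is written as a generator function, calling it returns a generator, which satisfies the requirement that __await__ return an iterator.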

Updated operator precedence table

The await keyword is defined as follows:

power ::=  await ["**" u_expr]
await ::=  ["await"] primary

where "primary" represents the most tightly bound operations of the language. Its syntax is:

primary ::=  atom | attributeref | subscription | slicing | call

See Python Documentation [12] and Grammar Updates section of this proposal for details.

The key difference between await and the yield and yield from operators is that await expressions do not require parentheses around them most of the time.

Also, yield from allows any expression as its argument, including expressions like yield from a() + b(), which would be parsed as yield from (a() + b()), which is almost always a bug. In general, the result of any arithmetic operation is not an awaitable object. To avoid this kind of mistake, it was decided to make await precedence lower than [], (), and ., but higher than the ** operator.

Operator                                                                Description
yield x, yield from x                                                   Yield expression
lambda                                                                  Lambda expression
if -- else                                                              Conditional expression
or                                                                      Boolean OR
and                                                                     Boolean AND
not x                                                                   Boolean NOT
in, not in, is, is not, <, <=, >, >=, !=, ==                            Comparisons, including membership tests and identity tests
|                                                                       Bitwise OR
^                                                                       Bitwise XOR
&                                                                       Bitwise AND
<<, >>                                                                  Shifts
+, -                                                                    Addition and subtraction
*, @, /, //, %                                                          Multiplication, matrix multiplication, division, remainder
+x, -x, ~x                                                              Positive, negative, bitwise NOT
**                                                                      Exponentiation
await x                                                                 Await expression
x[index], x[index:index], x(arguments...), x.attribute                  Subscription, slicing, call, attribute reference
(expressions...), [expressions...], {key: value...}, {expressions...}   Binding or tuple display, list display, dictionary display, set display

Examples of "await" expressions

Valid syntax examples:

Expression                      Will be parsed as
if await fut: pass              if (await fut): pass
if await fut + 1: pass          if (await fut) + 1: pass
pair = await fut, 'spam'        pair = (await fut), 'spam'
with await fut, open(): pass    with (await fut), open(): pass
await foo()['spam'].baz()()     await ( foo()['spam'].baz()() )
return await coro()             return ( await coro() )
res = await coro() ** 2         res = (await coro()) ** 2
func(a1=await coro(), a2=0)     func(a1=(await coro()), a2=0)
await foo() + await bar()       (await foo()) + (await bar())
-await foo()                    -(await foo())

Invalid syntax examples:

Expression            Should be written as
await await coro()    await (await coro())
await -coro()         await (-coro())
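The precedence rules can be checked directly. The sketch below uses a hypothetical coroutine three() and a helper compiles() that only asks the compiler whether a statement is accepted inside an async def body; nothing in the compiled snippets is executed.

```python
import asyncio


async def three():
    return 3


async def precedence_demo():
    squared = await three() ** 2            # parsed as (await three()) ** 2
    negated = -await three()                # parsed as -(await three())
    total = await three() + await three()   # parsed as (await three()) + (await three())
    return squared, negated, total


def compiles(body):
    """Return True if `body` is accepted as a statement inside async def."""
    source = "async def f():\n    " + body + "\n"
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

With these definitions, compiles("await -coro()") and compiles("await await coro()") report the two invalid forms from the table, while the parenthesized spellings are accepted.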

Asynchronous Context Managers and "async with"

An asynchronous context manager is a context manager that is able to suspend execution in its enter and exit methods.

To make this possible, a new protocol for asynchronous context managers is proposed. Two new magic methods are added: __aenter__ and __aexit__ . Both must return an awaitable .

An example of an asynchronous context manager:

class AsyncContextManager:
    async def __aenter__(self):
        await log('entering context')

    async def __aexit__(self, exc_type, exc, tb):
        await log('exiting context')

New Syntax

A new statement for asynchronous context managers is proposed:

async with EXPR as VAR:
    BLOCK

which is semantically equivalent to:

mgr = (EXPR)
aexit = type(mgr).__aexit__
aenter = type(mgr).__aenter__(mgr)
exc = True

VAR = await aenter
try:
    BLOCK
except:
    if not await aexit(mgr, *sys.exc_info()):
        raise
else:
    await aexit(mgr, None, None, None)

As with regular with statements, it is possible to specify multiple context managers in a single async with statement.

It is an error to pass a regular context manager without __aenter__ and __aexit__ methods to async with . It is a SyntaxError to use async with outside of an async def function.
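A runnable sketch of the expansion above follows. The Transaction class and the event names are assumptions chosen for illustration; asyncio.sleep(0) stands in for real asynchronous work in the enter and exit methods.

```python
import asyncio


class Transaction:
    """Hypothetical asynchronous context manager that records what it does."""

    def __init__(self, events):
        self.events = events

    async def __aenter__(self):
        await asyncio.sleep(0)           # may suspend while entering
        self.events.append("begin")
        return self                      # bound to VAR in "async with ... as VAR"

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)           # may suspend while exiting
        self.events.append("rollback" if exc_type else "commit")
        return False                     # a false value re-raises the exception


async def main():
    events = []
    async with Transaction(events):
        events.append("update")
    try:
        async with Transaction(events):
            raise ValueError("boom")
    except ValueError:
        events.append("propagated")
    return events
```

The first block exits through the else branch of the expansion (aexit called with three None arguments); the second exits through the except branch, and because __aexit__ returns a false value the exception is re-raised.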

Example

With asynchronous context managers it is easy to implement proper database transaction managers for coroutines:

async def commit(session, data):
    ...

    async with session.transaction():
        ...
        await session.update(data)
        ...

Code that needs locking also looks lighter:

async with lock:
    ...

instead of:

with (yield from lock):
    ...

Asynchronous Iterators and "async for"

An asynchronous iterable is able to call asynchronous code in its iter implementation, and an asynchronous iterator can call asynchronous code in its next method. To support asynchronous iteration:

  1. An object must implement an __aiter__ method (or, if defined with CPython C API, tp_as_async.am_aiter slot) returning an asynchronous iterator object .
  2. An asynchronous iterator object must implement an __anext__ method (or, if defined with CPython C API, tp_as_async.am_anext slot) returning an awaitable .
  3. To stop iteration __anext__ must raise a StopAsyncIteration exception.

An example of asynchronous iterable:

class AsyncIterable:
    def __aiter__(self):
        return self

    async def __anext__(self):
        data = await self.fetch_data()
        if data:
            return data
        else:
            raise StopAsyncIteration

    async def fetch_data(self):
        ...

New Syntax

A new statement for iterating through asynchronous iterators is proposed:

async for TARGET in ITER:
    BLOCK
else:
    BLOCK2

which is semantically equivalent to:

iter = (ITER)
iter = type(iter).__aiter__(iter)
running = True
while running:
    try:
        TARGET = await type(iter).__anext__(iter)
    except StopAsyncIteration:
        running = False
    else:
        BLOCK
else:
    BLOCK2

It is a TypeError to pass a regular iterable without an __aiter__ method to async for. It is a SyntaxError to use async for outside of an async def function.

As with the regular for statement, async for has an optional else clause.
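The following sketch exercises both the else clause and the TypeError case. Countdown is a hypothetical asynchronous iterator written for this example, and asyncio.sleep(0) stands in for real asynchronous work.

```python
import asyncio


class Countdown:
    """Hypothetical asynchronous iterator counting down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)           # a suspension point between items
        self.current -= 1
        return self.current + 1


async def collect():
    seen = []
    async for number in Countdown(3):
        seen.append(number)
    else:
        # Runs because the loop ended without a break.
        seen.append("done")
    return seen


async def iterate_plain_list():
    try:
        async for _ in [1, 2, 3]:        # list has no __aiter__ method
            pass
    except TypeError:
        return "TypeError"
    return "no error"
```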

Example 1

With asynchronous iteration protocol it is possible to asynchronously buffer data during iteration:

async for data in cursor:
    ...

Where cursor is an asynchronous iterator that prefetches N rows of data from a database after every N iterations.

The following code illustrates new asynchronous iteration protocol:

import collections


class Cursor:
    def __init__(self):
        self.buffer = collections.deque()

    async def _prefetch(self):
        ...

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self.buffer:
            self.buffer = await self._prefetch()
            if not self.buffer:
                raise StopAsyncIteration
        return self.buffer.popleft()

then the Cursor class can be used as follows:

async for row in Cursor():
    print(row)

which would be equivalent to the following code:

i = Cursor().__aiter__()
while True:
    try:
        row = await i.__anext__()
    except StopAsyncIteration:
        break
    else:
        print(row)

Example 2

The following is a utility class that transforms a regular iterable to an asynchronous one. While this is not a very useful thing to do, the code illustrates the relationship between regular and asynchronous iterators.

class AsyncIteratorWrapper:
    def __init__(self, obj):
        self._it = iter(obj)

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            value = next(self._it)
        except StopIteration:
            raise StopAsyncIteration
        return value

async for letter in AsyncIteratorWrapper("abc"):
    print(letter)

Why StopAsyncIteration?

Coroutines are still based on generators internally. So, before PEP 479 , there was no fundamental difference between

def g1():
    yield from fut
    return 'spam'

and

def g2():
    yield from fut
    raise StopIteration('spam')

And since PEP 479 is accepted and enabled by default for coroutines, the following example will have its StopIteration wrapped into a RuntimeError:

async def a1():
    await fut
    raise StopIteration('spam')

The only way to tell the outside code that the iteration has ended is to raise something other than StopIteration . Therefore, a new built-in exception class StopAsyncIteration was added.

Moreover, with semantics from PEP 479 , all StopIteration exceptions raised in coroutines are wrapped in RuntimeError .

Coroutine objects

Differences from generators

This section applies only to native coroutines with CO_COROUTINE flag, i.e. defined with the new async def syntax.

The behavior of existing generator-based coroutines in asyncio remains unchanged.

Great effort has been made to make sure that coroutines and generators are treated as distinct concepts:

  1. Native coroutine objects do not implement __iter__ and __next__ methods. Therefore, they cannot be iterated over or passed to iter() , list() , tuple() and other built-ins. They also cannot be used in a for..in loop.

    An attempt to use __iter__ or __next__ on a native coroutine object will result in a TypeError .

  2. Plain generators cannot yield from native coroutines : doing so will result in a TypeError .

  3. Generator-based coroutines (which in asyncio code must be decorated with @asyncio.coroutine) can yield from native coroutine objects.

  4. inspect.isgenerator() and inspect.isgeneratorfunction() return False for native coroutine objects and native coroutine functions .
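Points 1, 2 and 4 can be verified with a few lines. The helper names below are assumptions for this sketch, and the coroutine is closed at the end only to avoid the "never awaited" warning.

```python
import inspect


async def native():
    return 1


def plain_generator(coro):
    # A plain (undecorated) generator that tries to delegate to a coroutine.
    yield from coro


def raises_type_error(func, *args):
    try:
        func(*args)
    except TypeError:
        return True
    return False


def checks():
    coro = native()
    results = {
        "iter() fails": raises_type_error(iter, coro),
        "yield from fails": raises_type_error(next, plain_generator(coro)),
        "isgenerator": inspect.isgenerator(coro),
        "isgeneratorfunction": inspect.isgeneratorfunction(native),
    }
    coro.close()  # the coroutine was never started; close it explicitly
    return results
```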

Coroutine object methods

Coroutines are based on generators internally, thus they share the implementation. Similarly to generator objects, coroutines have throw() , send() and close() methods. StopIteration and GeneratorExit play the same role for coroutines (although PEP 479 is enabled by default for coroutines). See PEP 342 , PEP 380 , and Python Documentation [11] for details.

throw() , send() methods for coroutines are used to push values and raise errors into Future-like objects.
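As a sketch of how a scheduler might use these methods, the code below drives one coroutine with send() and another with throw(). All names are hypothetical; wait_for_event() plays the role of a Future-like suspension point.

```python
import types


@types.coroutine
def wait_for_event():
    # Suspend and hand "waiting" to the driver; resume with the sent value.
    payload = yield "waiting"
    return payload


async def worker(log):
    try:
        log.append(await wait_for_event())
    except ValueError as exc:
        log.append(f"recovered from {exc}")
    return "finished"


def drive_with_send():
    log = []
    coro = worker(log)
    coro.send(None)                      # run to the first suspension point
    try:
        coro.send("payload")             # push a value into the await
    except StopIteration as stop:
        return log, stop.value


def drive_with_throw():
    log = []
    coro = worker(log)
    coro.send(None)
    try:
        coro.throw(ValueError("bad input"))  # raise an error at the await
    except StopIteration as stop:
        return log, stop.value
```

In both cases the coroutine's return value arrives as StopIteration.value, exactly as it would for a generator.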

Debugging Features

A common beginner mistake is forgetting to use yield from on coroutines:

@asyncio.coroutine
def useful():
    asyncio.sleep(1) # this will do nothing without 'yield from'

To debug this kind of mistake, asyncio has a special debug mode in which the @coroutine decorator wraps all functions with a special object whose destructor logs a warning. Whenever a wrapped generator gets garbage collected, a detailed logging message is generated with information about where exactly the decorator function was defined, a stack trace of where it was collected, etc. The wrapper object also provides a convenient __repr__ function with detailed information about the generator.

The only problem is how to enable these debug capabilities. Since debug facilities should be a no-op in production mode, @coroutine decorator makes the decision of whether to wrap or not to wrap based on an OS environment variable PYTHONASYNCIODEBUG . This way it is possible to run asyncio programs with asyncio's own functions instrumented. EventLoop.set_debug , a different debug facility, has no impact on @coroutine decorator's behavior.

With this proposal, coroutines are a native concept, distinct from generators. In addition to a RuntimeWarning being raised on coroutines that were never awaited, it is proposed to add two new functions to the sys module: set_coroutine_wrapper and get_coroutine_wrapper. This is to enable advanced debugging facilities in asyncio and other frameworks (such as displaying where exactly a coroutine was created, and a more detailed stack trace of where it was garbage collected).
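The RuntimeWarning for a coroutine that was never awaited can be observed with the warnings module. This is a minimal sketch with hypothetical names; it assumes the unreferenced coroutine object is finalized promptly, which holds under CPython's reference counting, and calls gc.collect() as a precaution for other cases.

```python
import gc
import warnings


async def useful():
    return 42


def forget_to_await():
    """Create a coroutine object, drop it, and return the warnings seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        useful()        # the coroutine object is created and discarded
        gc.collect()    # make sure the object has been finalized
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, RuntimeWarning)
    ]
```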

New Standard Library Functions

  • types.coroutine(gen) . See types.coroutine() section for details.
  • inspect.iscoroutine(obj) returns True if obj is a native coroutine object.
  • inspect.iscoroutinefunction(obj) returns True if obj is a native coroutine function .
  • inspect.isawaitable(obj) returns True if obj is an awaitable .
  • inspect.getcoroutinestate(coro) returns the current state of a native coroutine object (mirrors inspect.getgeneratorstate(gen)).
  • inspect.getcoroutinelocals(coro) returns the mapping of a native coroutine object's local variables to their values (mirrors inspect.getgeneratorlocals(gen) ).
  • sys.set_coroutine_wrapper(wrapper) allows intercepting the creation of native coroutine objects. wrapper must be either a callable that accepts one argument (a coroutine object), or None. None resets the wrapper. If called twice, the new wrapper replaces the previous one. The function is thread-specific. See Debugging Features for more details.
  • sys.get_coroutine_wrapper() returns the current wrapper object. Returns None if no wrapper was set. The function is thread-specific. See Debugging Features for more details.
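The inspect functions listed above can be tried out as follows. The functions native and plain are hypothetical, and the sketch deliberately leaves out the sys wrapper functions.

```python
import inspect


async def native():
    return 1


def plain():
    return 1


def introspect():
    coro = native()
    report = {
        "iscoroutinefunction(native)": inspect.iscoroutinefunction(native),
        "iscoroutinefunction(plain)": inspect.iscoroutinefunction(plain),
        "iscoroutine": inspect.iscoroutine(coro),
        "isawaitable": inspect.isawaitable(coro),
        "state before": inspect.getcoroutinestate(coro),
    }
    try:
        coro.send(None)                  # run the coroutine to completion
    except StopIteration:
        pass
    report["state after"] = inspect.getcoroutinestate(coro)
    return report
```

The two state entries correspond to the constants inspect.CORO_CREATED and inspect.CORO_CLOSED, mirroring the GEN_* constants used for generators.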

New Abstract Base Classes

In order to allow better integration with existing frameworks (such as Tornado, see [13] ) and compilers (such as Cython, see [16] ), two new Abstract Base Classes (ABC) are added:

  • collections.abc.Awaitable ABC for Future-like classes, that implement __await__ method.

  • collections.abc.Coroutine ABC for coroutine objects, that implement send(value) , throw(type, exc, tb) , close() and __await__() methods.

    Note that generator-based coroutines with CO_ITERABLE_COROUTINE flag do not implement __await__ method, and therefore are not instances of collections.abc.Coroutine and collections.abc.Awaitable ABCs:

    @types.coroutine
    def gencoro():
        yield
    
    assert not isinstance(gencoro(), collections.abc.Coroutine)
    
    # however:
    assert inspect.isawaitable(gencoro())
    

To allow easy testing if objects support asynchronous iteration, two more ABCs are added:

  • collections.abc.AsyncIterable -- tests for __aiter__ method.
  • collections.abc.AsyncIterator -- tests for __aiter__ and __anext__ methods.
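A short sketch of how these ABCs behave with classes that merely define the right methods, without inheriting from the ABCs. Deferred and Ticker are hypothetical classes written for this example.

```python
import collections.abc


class Deferred:
    """Defines __await__, so it is recognized as Awaitable."""

    def __await__(self):
        yield


class Ticker:
    """Defines __aiter__ and __anext__, so it is an AsyncIterator."""

    def __aiter__(self):
        return self

    async def __anext__(self):
        raise StopAsyncIteration


def abc_report():
    return {
        "Deferred is Awaitable": isinstance(Deferred(), collections.abc.Awaitable),
        "Ticker is AsyncIterable": isinstance(Ticker(), collections.abc.AsyncIterable),
        "Ticker is AsyncIterator": isinstance(Ticker(), collections.abc.AsyncIterator),
        "list is AsyncIterable": isinstance([], collections.abc.AsyncIterable),
    }
```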

Glossary

Native coroutine function
A coroutine function is declared with async def . It uses await and return value ; see New Coroutine Declaration Syntax for details.
Native coroutine
Returned from a native coroutine function. See Await Expression for details.
Generator-based coroutine function
Coroutines based on generator syntax. The most common examples are functions decorated with @asyncio.coroutine.
Generator-based coroutine
Returned from a generator-based coroutine function.
Coroutine
Either native coroutine or generator-based coroutine .
Coroutine object
Either native coroutine object or generator-based coroutine object.
Future-like object
An object with an __await__ method, or a C object with tp_as_async->am_await function, returning an iterator . Can be consumed by an await expression in a coroutine. A coroutine waiting for a Future-like object is suspended until the Future-like object's __await__ completes, and returns the result. See Await Expression for details.
Awaitable
A Future-like object or a coroutine object. See Await Expression for details.
Asynchronous context manager
An asynchronous context manager has __aenter__ and __aexit__ methods and can be used with async with . See Asynchronous Context Managers and "async with" for details.
Asynchronous iterable
An object with an __aiter__ method, which must return an asynchronous iterator object. Can be used with async for . See Asynchronous Iterators and "async for" for details.
Asynchronous iterator
An asynchronous iterator has an __anext__ method. See Asynchronous Iterators and "async for" for details.

Transition Plan

To avoid backwards compatibility issues with the async and await keywords, it was decided to modify tokenizer.c in such a way that it:

  • recognizes async def NAME tokens combination;
  • while tokenizing async def block, it replaces 'async' NAME token with ASYNC , and 'await' NAME token with AWAIT ;
  • while tokenizing def block, it yields 'async' and 'await' NAME tokens as is.

This approach allows for seamless combination of new syntax features (all of them available only in async functions) with any existing code.

An example of having "async def" and "async" attribute in one piece of code:

class Spam:
    async = 42

async def ham():
    print(getattr(Spam, 'async'))

# The coroutine can be executed and will print '42'

Backwards Compatibility

This proposal preserves 100% backwards compatibility.

asyncio

asyncio module was adapted and tested to work with coroutines and new statements. Backwards compatibility is 100% preserved, i.e. all existing code will work as-is.

The required changes are mainly:

  1. Modify @asyncio.coroutine decorator to use new types.coroutine() function.
  2. Add __await__ = __iter__ line to asyncio.Future class.
  3. Add ensure_future() as an alias for async() function. Deprecate async() function.

asyncio migration strategy

Because plain generators cannot yield from native coroutine objects (see Differences from generators section for more details), it is advised to make sure that all generator-based coroutines are decorated with @asyncio.coroutine before starting to use the new syntax.

async/await in CPython code base

There is no use of await names in CPython.

async is mostly used by asyncio. We are addressing this by renaming async() function to ensure_future() (see asyncio section for details).

Another use of the async keyword is in Lib/xml/dom/xmlbuilder.py, to define an async = False attribute for the DocumentLS class. There are no documentation or tests for it, and it is not used anywhere else in CPython. It is replaced with a getter that raises a DeprecationWarning, advising the use of the async_ attribute instead.

Grammar Updates

Grammar changes are fairly minimal:

decorated: decorators (classdef | funcdef | async_funcdef)
async_funcdef: ASYNC funcdef

compound_stmt: (if_stmt | while_stmt | for_stmt | try_stmt | with_stmt
                | funcdef | classdef | decorated | async_stmt)

async_stmt: ASYNC (funcdef | with_stmt | for_stmt)

power: atom_expr ['**' factor]
atom_expr: [AWAIT] atom trailer*

Deprecation Plans

async and await names will be softly deprecated in CPython 3.5 and 3.6. In 3.7 we will transform them to proper keywords. Making async and await proper keywords before 3.7 might make it harder for people to port their code to Python 3.
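On an interpreter where this plan has been carried out (assumed here to be Python 3.7 or later), the effect can be checked with the keyword module and the compiler. The helper names are hypothetical.

```python
import keyword


def reserved(name):
    """Return True if `name` is a reserved keyword on this interpreter."""
    return keyword.iskeyword(name)


def assignable(name):
    """Return True if `name = 42` compiles, i.e. `name` is a usable identifier."""
    try:
        compile(name + " = 42", "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

On such an interpreter, reserved("async") and reserved("await") are true and neither name can be assigned to, whereas an ordinary name such as spam remains assignable.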

Design Considerations

PEP 3152

PEP 3152 by Gregory Ewing proposes a different mechanism for coroutines (called "cofunctions"). Some key points:

  1. A new keyword codef to declare a cofunction. A cofunction is always a generator, even if there are no cocall expressions inside it. Maps to async def in this proposal.

  2. A new keyword cocall to call a cofunction . Can only be used inside a cofunction . Maps to await in this proposal (with some differences, see below).

  3. It is not possible to call a cofunction without a cocall keyword.

  4. cocall grammatically requires parentheses after it:

    atom: cocall | <existing alternatives for atom>
    cocall: 'cocall' atom cotrailer* '(' [arglist] ')'
    cotrailer: '[' subscriptlist ']' | '.' NAME
    
  5. cocall f(*args, **kwds) is semantically equivalent to yield from f.__cocall__(*args, **kwds) .

Differences from this proposal:

  1. There is no equivalent of __cocall__ in this PEP, which is called and its result is passed to yield from in the cocall expression. The await keyword expects an awaitable object, validates the type, and executes yield from on it. The __await__ method is similar to __cocall__, but is only used to define Future-like objects.

  2. await is defined in almost the same way as yield from in the grammar (it is later enforced that await can only be inside async def ). It is possible to simply write await future , whereas cocall always requires parentheses.

  3. To make asyncio work with PEP 3152 it would be required to modify @asyncio.coroutine decorator to wrap all functions in an object with a __cocall__ method, or to implement __cocall__ on generators. To call cofunctions from existing generator-based coroutines it would be required to use costart(cofunc, *args, **kwargs) built-in.

  4. Since it is impossible to call a cofunction without a cocall keyword, it automatically prevents the common mistake of forgetting to use yield from on generator-based coroutines. This proposal addresses this problem with a different approach, see Debugging Features .

  5. A shortcoming of requiring a cocall keyword to call a coroutine is that if it is decided to implement coroutine-generators -- coroutines with yield or async yield expressions -- we wouldn't need a cocall keyword to call them. So we'll end up having __cocall__ and no __call__ for regular coroutines, and having __call__ and no __cocall__ for coroutine-generators.

  6. Requiring parentheses grammatically also introduces a whole lot of new problems.

    The following code:

    await fut
    await function_returning_future()
    await asyncio.gather(coro1(arg1, arg2), coro2(arg1, arg2))
    

    would look like:

    cocall fut()  # or cocall costart(fut)
    cocall (function_returning_future())()
    cocall asyncio.gather(costart(coro1, arg1, arg2),
                          costart(coro2, arg1, arg2))
    
  7. There are no equivalents of async for and async with in PEP 3152 .

Coroutine-generators

With the async for statement it is desirable to have a concept of a coroutine-generator -- a coroutine with yield and yield from expressions. To avoid any ambiguity with regular generators, we would likely require an async keyword before yield, and async yield from would raise a StopAsyncIteration exception.

While it is possible to implement coroutine-generators, we believe that they are out of scope of this proposal. It is an advanced concept that should be carefully considered and balanced, with non-trivial changes in the implementation of current generator objects. This is a matter for a separate PEP.

Why "async" and "await" keywords

async/await is not a new concept in programming languages:

  • C# has had it for a long time [5];
  • proposal to add async/await in ECMAScript 7 [2] ; see also Traceur project [9] ;
  • Facebook's Hack/HHVM [6] ;
  • Google's Dart language [7] ;
  • Scala [8] ;
  • proposal to add async/await to C++ [10] ;
  • and many other less popular languages.

This is a huge benefit, as some users already have experience with async/await, and because it makes working with many languages in one project easier (Python with ECMAScript 7 for instance).

Why "__aiter__" does not return an awaitable

PEP 492 was accepted in CPython 3.5.0 with __aiter__ defined as a method, that was expected to return an awaitable resolving to an asynchronous iterator.

In 3.5.2 (as PEP 492 was accepted on a provisional basis) the __aiter__ protocol was updated to return asynchronous iterators directly.

The motivation behind this change is to make it possible to implement asynchronous generators in Python. See [19] and [20] for more details.

Importance of "async" keyword

While it is possible to just implement the await expression and treat all functions with at least one await as coroutines, this approach makes API design, code refactoring and long-term support harder.

Let's pretend that Python only has await keyword:

def useful():
    ...
    await log(...)
    ...

def important():
    await useful()

If the useful() function is refactored and someone removes all await expressions from it, it would become a regular Python function, and all code that depends on it, including important(), would be broken. To mitigate this issue a decorator similar to @asyncio.coroutine has to be introduced.

Why "async def"

For some people bare async name(): pass syntax might look more appealing than async def name(): pass . It is certainly easier to type. But on the other hand, it breaks the symmetry between async def , async with and async for , where async is a modifier, stating that the statement is asynchronous. It is also more consistent with the existing grammar.

Why not "await for" and "await with"

async is an adjective, and hence it is a better choice for a statement qualifier keyword. await for/with would imply that something is awaiting the completion of a for or with statement.

Why "async def" and not "def async"

The async keyword is a statement qualifier . Good analogies are the "static", "public", and "unsafe" keywords from other languages. "async for" is an asynchronous "for" statement, "async with" is an asynchronous "with" statement, "async def" is an asynchronous function.

Having "async" after the main statement keyword might introduce some confusion: for example, "for async item in iterator" can be read as "for each asynchronous item in iterator".

Having the async keyword before def , with and for also makes the language grammar simpler. And "async def" better separates coroutines from regular functions visually.

Why not a __future__ import

The Transition Plan section explains how the tokenizer is modified to treat async and await as keywords only in async def blocks. Hence async def fills the role that a module-level compiler declaration like from __future__ import async_await would otherwise fill.

Why magic methods start with "a"

New asynchronous magic methods __aiter__ , __anext__ , __aenter__ , and __aexit__ all start with the same prefix "a". An alternative proposal is to use an "async" prefix, so that __anext__ becomes __async_next__ . However, to align the new magic methods with existing ones, such as __radd__ and __iadd__ , it was decided to use the shorter version.

Why not reuse existing magic names

An alternative idea about new asynchronous iterators and context managers was to reuse existing magic methods, by adding an async keyword to their declarations:

class CM:
    async def __enter__(self): # instead of __aenter__
        ...

This approach has the following downsides:

  • it would not be possible to create an object that works in both with and async with statements;
  • it would break backwards compatibility, as nothing prohibits returning Future-like objects from __enter__ and/or __exit__ in Python <= 3.4;
  • one of the main points of this proposal is to make native coroutines as simple and foolproof as possible, hence the clear separation of the protocols.
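With separate protocols, a single class can support both statements. The following is a minimal sketch under stated assumptions: the Resource class, its events list, and the two helper functions are hypothetical, and asyncio.run (added after this PEP) is assumed to be available.

```python
import asyncio


class Resource:
    """Hypothetical resource usable with both `with` and `async with`."""

    def __init__(self):
        self.events = []

    # Synchronous protocol, used by `with`.
    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not suppress exceptions

    # Asynchronous protocol, used by `async with`.
    async def __aenter__(self):
        await asyncio.sleep(0)
        self.events.append("aenter")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)
        self.events.append("aexit")
        return False  # do not suppress exceptions


def use_sync():
    with Resource() as resource:
        pass
    return resource.events


async def use_async():
    async with Resource() as resource:
        pass
    return resource.events
```

Had the asynchronous protocol reused __enter__ and __exit__ , the two pairs of methods above could not coexist on one class.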

Why not reuse existing "for" and "with" statements

The vision behind existing generator-based coroutines and this proposal is to make it easy for users to see where the code might be suspended. Making the existing "for" and "with" statements recognize asynchronous iterators and context managers would inevitably create implicit suspend points, making it harder to reason about the code.
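A short sketch of the resulting behaviour (the Ticker class and both helper functions are hypothetical, and asyncio.run is assumed to be available): a regular "for" refuses an object that implements only the asynchronous protocol, so every suspension point has to be spelled out with "async for".

```python
import asyncio


class Ticker:
    """Hypothetical asynchronous iterable producing 0, 1, ..., n - 1."""

    def __init__(self, n):
        self.n = n
        self.i = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.n:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # an explicit suspension point
        self.i += 1
        return self.i - 1


def plain_for_is_rejected():
    # A regular "for" never looks at __aiter__, so it cannot suspend implicitly.
    try:
        for _ in Ticker(3):
            pass
    except TypeError:
        return True
    return False


async def explicit_async_for():
    seen = []
    async for value in Ticker(3):  # the suspension is visible in the source
        seen.append(value)
    return seen
```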

Comprehensions

Syntax for asynchronous comprehensions could be provided, but this construct is outside of the scope of this PEP.

Async lambda functions

Syntax for asynchronous lambda functions could be provided, but this construct is outside of the scope of this PEP.

Performance

Overall Impact

This proposal introduces no observable performance impact. Here is the output of Python's official set of benchmarks [4] :

python perf.py -r -b default ../cpython/python.exe ../cpython-aw/python.exe

[skipped]

Report on Darwin ysmac 14.3.0 Darwin Kernel Version 14.3.0:
Mon Mar 23 11:59:05 PDT 2015; root:xnu-2782.20.48~5/RELEASE_X86_64
x86_64 i386

Total CPU cores: 8

### etree_iterparse ###
Min: 0.365359 -> 0.349168: 1.05x faster
Avg: 0.396924 -> 0.379735: 1.05x faster
Significant (t=9.71)
Stddev: 0.01225 -> 0.01277: 1.0423x larger

The following not significant results are hidden, use -v to show them:
django_v2, 2to3, etree_generate, etree_parse, etree_process, fastpickle,
fastunpickle, json_dump_v2, json_load, nbody, regex_v8, tornado_http.

Tokenizer modifications

There is no observable slowdown in parsing Python files with the modified tokenizer: parsing one 12 MB file ( Lib/test/test_binop.py repeated 1000 times) takes the same amount of time.

async/await

The following micro-benchmark was used to determine performance difference between "async" functions and generators:

import sys
import time

def binary(n):
    if n <= 0:
        return 1
    l = yield from binary(n - 1)
    r = yield from binary(n - 1)
    return l + 1 + r

async def abinary(n):
    if n <= 0:
        return 1
    l = await abinary(n - 1)
    r = await abinary(n - 1)
    return l + 1 + r

def timeit(func, depth, repeat):
    t0 = time.time()
    for _ in range(repeat):
        o = func(depth)
        try:
            while True:
                o.send(None)
        except StopIteration:
            pass
    t1 = time.time()
    print('{}({}) * {}: total {:.3f}s'.format(
        func.__name__, depth, repeat, t1-t0))

The result is that there is no observable performance difference:

binary(19) * 30: total 53.321s
abinary(19) * 30: total 55.073s

binary(19) * 30: total 53.361s
abinary(19) * 30: total 51.360s

binary(19) * 30: total 49.438s
abinary(19) * 30: total 51.047s

Note that a depth of 19 means 1,048,575 calls.
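The call count follows from the recurrence in the benchmark: each call returns 1 plus the results of its two children, so the value returned by binary(n) equals the number of calls made, which is 2**(n + 1) - 1. A small sketch that checks this at a shallow depth (the drive helper is a hypothetical name; binary is copied from the benchmark above):

```python
def binary(n):
    if n <= 0:
        return 1
    l = yield from binary(n - 1)
    r = yield from binary(n - 1)
    return l + 1 + r


def drive(gen):
    # Run the generator to completion and return its result,
    # which for binary() is also the total number of calls.
    try:
        while True:
            gen.send(None)
    except StopIteration as stop:
        return stop.value


def expected_calls(depth):
    # Closed form of calls(n) = 2 * calls(n - 1) + 1 with calls(0) = 1.
    return 2 ** (depth + 1) - 1


if __name__ == "__main__":
    print(drive(binary(10)), expected_calls(10))
```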

Reference Implementation

The reference implementation can be found here: [3] .

List of high-level changes and new protocols

  1. New syntax for defining coroutines: async def and new await keyword.
  2. New __await__ method for Future-like objects, and new tp_as_async.am_await slot in PyTypeObject .
  3. New syntax for asynchronous context managers: async with . And associated protocol with __aenter__ and __aexit__ methods.
  4. New syntax for asynchronous iteration: async for . And associated protocol with __aiter__ , __anext__ and new built-in exception StopAsyncIteration . New tp_as_async.am_aiter and tp_as_async.am_anext slots in PyTypeObject .
  5. New AST nodes: AsyncFunctionDef , AsyncFor , AsyncWith , Await .
  6. New functions: sys.set_coroutine_wrapper(callback) , sys.get_coroutine_wrapper() , types.coroutine(gen) , inspect.iscoroutinefunction(func) , inspect.iscoroutine(obj) , inspect.isawaitable(obj) , inspect.getcoroutinestate(coro) , and inspect.getcoroutinelocals(coro) .
  7. New CO_COROUTINE and CO_ITERABLE_COROUTINE bit flags for code objects.
  8. New ABCs: collections.abc.Awaitable , collections.abc.Coroutine , collections.abc.AsyncIterable , and collections.abc.AsyncIterator .
  9. C API changes: new PyCoro_Type (exposed to Python as types.CoroutineType ) and PyCoroObject . PyCoro_CheckExact(*o) to test if o is a native coroutine .

While the list of changes and new things is not short, it is important to understand that most users will not use these features directly. They are intended to be used in frameworks and libraries to provide users with convenient and unambiguous APIs built on the async def , await , async for and async with syntax.
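For framework authors, a few of the introspection helpers from the list can be exercised as follows. This is a sketch only: the native, generator_based, caller and describe names are hypothetical, and asyncio.run (added after this PEP) is assumed to be available.

```python
import asyncio
import collections.abc
import inspect
import types


async def native():
    return "native"


@types.coroutine
def generator_based():
    # Generator-based coroutine; the bare yield hands control
    # back to the scheduler once before finishing.
    yield
    return "generator-based"


async def caller():
    # A native coroutine may await a generator-based one.
    return await generator_based()


def describe(obj):
    return {
        "iscoroutine": inspect.iscoroutine(obj),
        "isawaitable": inspect.isawaitable(obj),
        "abc_awaitable": isinstance(obj, collections.abc.Awaitable),
    }


if __name__ == "__main__":
    coro = native()
    print(describe(coro), inspect.getcoroutinestate(coro))
    coro.close()  # avoid a "never awaited" warning

    gen = generator_based()
    print(describe(gen))
    gen.close()

    print(asyncio.run(caller()))
```

Note that the generator-based coroutine object is accepted by inspect.isawaitable but is not an instance of collections.abc.Awaitable , since it does not define __await__ .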

Working example

All concepts proposed in this PEP are implemented [3] and can be tested.

import asyncio

async def echo_server():
    print('Serving on localhost:8000')
    await asyncio.start_server(handle_connection,
                               'localhost', 8000)

async def handle_connection(reader, writer):
    print('New connection...')

    while True:
        data = await reader.read(8192)

        if not data:
            break

        print('Sending {:.10}... back'.format(repr(data)))
        writer.write(data)

loop = asyncio.get_event_loop()
loop.run_until_complete(echo_server())
try:
    loop.run_forever()
finally:
    loop.close()
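To try the echo logic without a separate client program, a self-contained round trip can be sketched as below. The assumptions are stated loudly: round_trip is a hypothetical helper, port 0 asks the operating system for any free port, the handler adds an explicit drain and close that the example above omits, and asyncio.run (added after this PEP) is assumed to be available.

```python
import asyncio


async def handle_connection(reader, writer):
    # Same echo loop as the handler above, plus an explicit drain and close.
    while True:
        data = await reader.read(8192)
        if not data:
            break
        writer.write(data)
        await writer.drain()
    writer.close()


async def round_trip(payload):
    # Port 0 lets the OS pick a free port (an assumption made for testing).
    server = await asyncio.start_server(handle_connection, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    try:
        writer.write(payload)
        await writer.drain()
        echoed = await reader.readexactly(len(payload))
    finally:
        writer.close()   # sends EOF, which ends the handler's loop
        server.close()   # stop accepting new connections
        await server.wait_closed()
    return echoed


if __name__ == "__main__":
    print(asyncio.run(round_trip(b'hello')))
```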

Acceptance

PEP 492 was accepted by Guido, Tuesday, May 5, 2015 [14] .

Implementation

The implementation is tracked in issue 24017 [15] . It was committed on May 11, 2015.

Acknowledgments

I thank Guido van Rossum, Victor Stinner, Elvis Pranskevichus, Andrew Svetlov, Łukasz Langa, Greg Ewing, Stephen J. Turnbull, Jim J. Jewett, Brett Cannon, Nick Coghlan, Steven D'Aprano, Paul Moore, Nathaniel Smith, Ethan Furman, Stefan Behnel, Paul Sokolovsky, Victor Petrovykh, and many others for their feedback, ideas, edits, criticism, code reviews, and discussions around this PEP.

Source: https://github.com/python/peps/blob/master/pep-0492.txt