[Python-ideas] async/await in Python

Yann Kaiser kaiser.yann at gmail.com
Tue Apr 21 09:28:20 CEST 2015


On Tue, 21 Apr 2015 at 00:07 Yury Selivanov <yselivanov.ml at gmail.com> wrote:

> Hi Yann,
>
>
> Thanks for the feedback! Answers below.
>

Thanks for hearing it!


> On 2015-04-20 5:39 PM, Yann Kaiser wrote:
> > Browsing the PEP and this thread, I can't help but notice there is little
> > mention of the possibility (or rather, the lack thereof in the current
> > version of the proposal) of writing async generators (i.e. usable in
> > async for/await for) within the newly proposed coroutines, other than
> > "yield [from] is not allowed within an async function".
> >
> > I think this should be given more thought, despite the technical
> > difficulties involved (namely, new-style coroutines are still implemented
> > as generators).
> >
> > While the proposal already makes it less of a burden to write
> > asynchronous iterables through "async for", __aiter__ and __anext__, in
> > synchronous systems writing a generator function (or having one as
> > __iter__) is most often preferred over writing a class with __iter__ and
> > __next__ methods.
> >
> > A use case I've come across using asyncio, assuming yield could somehow
> > be allowed within a new-style coroutine:
> >
> > A web API provides paginated content with few filtering options. You have
> > something that fetches those pages:
> >
> >      async def fetch_pages(url, params):
> >          while True:
> >              page = await request(url, params)
> >              feed = page.decode_json()
> >              if not feed['items']: # no items when we reach the last page
> >                  return
> >              yield feed
> >
> > It would be sorta okay to have this as a class with __aiter__ and
> > __anext__ methods. It would be necessary anyway.
> >
> > Now we want an iterator for every item. We won't really want to iterate
> > on pages anyway, we'll want to iterate on items. With yield available,
> > this is really easy!
> >
> >      async def fetch_items(pages):
> >          async for page in pages:
> >              yield from page['items']
> >
> > Without it, this becomes more difficult:
> >
> >      class ItemFetcher:
> >          def __init__(self, pages):
> >              self.pages = pages
> >
> >          async def __aiter__(self):
> >              self.pages_itor = await self.pages.__aiter__()
> >              self.page = iter(())
> >              return self
> >
> >          async def __anext__(self):
> >              try:
> >                  try:
> >                      return next(self.page)
> >                  except StopIteration:
> >                      page = await self.pages_itor.__anext__()
> >                      self.page = iter(page['items'])
> >                      return next(self.page)
> >              except StopIteration as exc:
> >                  raise StopAsyncIteration(exc.value) from exc
> >
> > Whoa there! This is complicated and difficult to understand. But we
> > really want to iterate on items, so we leave it in anyway.
>
> While I understand what you're trying to achieve here, the
> code doesn't look correct.  Could you please clone the ref
> implementation and make it work first?
>

Hadn't realized there was one. Will do.
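For what it's worth, here is roughly what I was getting at, reworked as a
sketch that does not await __aiter__ -- the names, and the page shape (a
dict with an 'items' list), are just assumptions carried over from the
hypothetical fetch_pages example above:

```python
import asyncio

class ItemFetcher:
    """Flatten an async iterable of pages into an async iterator of items.

    Sketch only: assumes each page is a dict with an 'items' list, as in
    the hypothetical fetch_pages example above.
    """
    def __init__(self, pages):
        self.pages = pages
        self.page = iter(())  # exhausted until the first page arrives

    def __aiter__(self):
        return self

    async def __anext__(self):
        while True:
            try:
                return next(self.page)
            except StopIteration:
                # Current page exhausted: fetch the next one.
                # StopAsyncIteration from the source ends our iteration too.
                page = await self.pages.__anext__()
                self.page = iter(page['items'])

async def _demo():
    async def pages():
        for p in [{'items': [1, 2]}, {'items': []}, {'items': [3]}]:
            yield p
    out = []
    async for item in ItemFetcher(pages()):
        out.append(item)
    return out

items = asyncio.run(_demo())
```

Note the empty-page case: the while loop simply fetches the next page, so
an empty 'items' list doesn't terminate iteration early.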


> >
> > Here the language failed already. It made a monster out of what could
> > have been expressed simply.
> >
> > What if we only want new items?
> >
> >      async def new_items(items):
> >          async for item in items:
> >              if is_new_item(item):
> >                  yield item
> >
> > Without yield:
> >
> >      class NewItems:
> >          def __init__(self, items):
> >              self.items = items
> >
> >          async def __aiter__(self):
> >              self.itor = await self.items.__aiter__()
> >              return self
> >
> >          async def __anext__(self):
> >              async for item in self.itor:
> >                  if is_new_item(item):
> >                      return item
> >              raise StopAsyncIteration
> >
> > This isn't as bad as the previous example, but I'll be the first to admit
> > I'll use "if not is_new_item(item): continue" in client code instead.
> >
> > In bullet point form, skipping the possibility of having yield within
> > coroutine functions causes the following:
> >
> > * To write async iterators, you have to use the full async-iterable class
> > form.
> > * Iterators written in class form can't have a for loop over their
> > arguments, because their behavior is spread over multiple methods.
> > * Iterators are unwieldy when used manually
> > * Async iterators even more so
> > => If you want to make an async iterator, it's complicated
> > => If you want to make an async iterator that iterates over an iterator,
> > it's more complicated
> > => If you want to make an async iterator that iterates over an async
> > iterator, it's even more complicated
> >
> > I therefore think a way to have await and yield coexist should be looked
> > into.
> >
> > Solutions include adding new bytecodes or adding a flag to the
> > YIELD_VALUE and YIELD_FROM bytecodes.
> >
> > -- Yann
>
>
> All in all, I understand your case.
>
> But also, I know that you have this case because you're
> trying to think in generators and how to combine them. Which is
> a good pattern, but unfortunately, this pattern had never
> worked with generator-based coroutines either.
>

It was, sort of. I wrote a decorator that did it using a wrapper around
results to be iterated on, thus multiplexing the use of the "yield channel"
between Futures and Results. Without some form of "asynchronous for loop"
like the one now being proposed, the looping side has to yield each value it
gets from the iterable, and that yield can potentially raise an alternate
StopIteration (e.g. when you only know you've run out of items after making
a request).

Here's an example that used both looping and yielding values:

    @asyncio_generator
    def __iter__(self, Result):
        for f in self.iter_pages():
            listing = yield f
            for link in listing:
                yield Result(link)

Tangentially, this could be:

    async def __aiter__(self):
        async for listing in self.iter_pages():
            for link in listing:
                yield link

Anyway, if you want to look behind the curtains...

    import asyncio
    import sys

    def asyncio_generator(func):
        def wrapped(*args, **kwargs):
            class Result(object):
                def __init__(self, val):
                    self.val = val
            gen = func(*args, Result=Result, **kwargs)
            return _AsyncioGeneratorIterator(gen, Result)
        return wrapped


    class LateStopIteration(RuntimeError):
        def __str__(self):
            return "Attempted to stop iteration after non-result"


    class _AsyncioGeneratorIterator(object):
        def __init__(self, gen, result_cls):
            self.gen = gen
            self.result_cls = result_cls

        def __iter__(self):
            return self

        def __next__(self):
            val = next(self.gen)
            return self.process_val(val)

        @asyncio.coroutine
        def process_val(self, val):
            while not isinstance(val, self.result_cls):
                try:
                    try:
                        data = yield from val
                    except BaseException as exc:
                        val = self.gen.throw(*sys.exc_info())
                    else:
                        val = self.gen.send(data)
                except StopIteration as exc:
                    raise LateStopIteration from exc
            return val.val

        @asyncio.coroutine
        def as_list(self):
            lis = []
            for fut in self:
                try:
                    val = yield from fut
                except LateStopIteration:
                    break
                lis.append(val)
            return lis

It looks like a hack, and that's because it is one :-)
It passes a "Result" class as a keyword argument to the decorated function
when called, and expects it to yield either coroutines or Result instances.
I'm pretty sure there was a reason I couldn't just yield from those coros
inside the decorated function. It's been a while...
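The core multiplexing trick can be shown without asyncio at all: a driver
loop distinguishes wrapped Result values (items to emit) from everything
else (work to resolve and send back in). This is a simplified, synchronous
stand-in for what the decorator above does with Futures -- the resolve
callback here is a made-up substitute for the event loop:

```python
class Result:
    """Marker wrapper: values the generator wants emitted to the consumer."""
    def __init__(self, val):
        self.val = val

def drive(gen, resolve):
    """Drain `gen`, emitting Result payloads and resolving everything else.

    `resolve` stands in for the event loop: it turns a yielded "request"
    into the value sent back into the generator.
    """
    out = []
    try:
        val = next(gen)
        while True:
            if isinstance(val, Result):
                out.append(val.val)  # item meant for the consumer
                val = next(gen)
            else:
                val = gen.send(resolve(val))  # work meant for the "loop"
    except StopIteration:
        pass
    return out

def paginated():
    listing = yield "fetch-page-1"  # pretend this is a Future
    for link in listing:
        yield Result(link)

links = drive(paginated(), lambda request: ["a", "b"])
```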

> Imagine that there were no PEP 492, and that you could only use
> asyncio and 'yield from'.  You would design your code
> in a different way then.
>
> I spent a great deal of time thinking if it's possible to
> combine yields and awaits in one coroutine function.
>
> Unfortunately, even if it is possible, it will require horrible
> hacks and will complicate the implementation tremendously.
>
> Moreover, I think that combining 'yield' and 'yield from'
> expressions with 'await' will only create confusion and contradict
> the main intent of the PEP, which is to *remove that confusion*.
>
> I'd prefer to make PEP 492 maximally minimal in this regard.
> Since it's prohibited to use 'yield' in coroutines, we may
> allow it in later Python versions.  (Although I'm certain, that
> we'll need a new keyword for 'yield' in coroutines.)
>

I think all in all it could be upgraded to work with the proposal, thus
eliminating the need to yield each value we get from the iterator.


> At this point, PEP 492 is created *specifically* to address
> existing pain-points that asyncio developers have, no more.

> I don't want to create new programming patterns or new concepts
> of generator-coroutine hybrid.  I think that can and should be
> covered in future PEPs in future Pythons.
>

Well... I didn't make this up out of thin air :-)


> If someone smarter than me can figure out a way to do this
> in a non-confusing way that won't require duplicating genobject.c
> and a fourth of ceval.c, I'd be glad to update the PEP.


Well, the above strategy could be imitated, multiplexing the yield channel
between yields meant for the event loop and those meant for the
asynchronous iteration protocol. It could be done with an additional
parameter to YIELD_VALUE (I'm assuming await compiles down to it or to
YIELD_FROM?), although it would be redundant for functions not using both
concepts at the same time. Still, it's just one byte.
As for YIELD_FROM... my head-explosion warnings are going off as well.

As Greg pointed out, the presence of yield in an async function would have
it return an object also supporting __aiter__ and __anext__.
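To make that concrete, here is a hypothetical sketch of the new_items
filter written as if yield were allowed inside an async function, with the
returned object driven through __aiter__/__anext__. This is not legal under
the PEP as currently proposed; it only illustrates the intended surface
(is_new_item is passed in rather than assumed global):

```python
import asyncio

# Hypothetical: yield inside an 'async def', mirroring the new_items
# example quoted above.
async def new_items(items, is_new_item):
    async for item in items:
        if is_new_item(item):
            yield item

async def _demo():
    async def source():
        for x in [1, 2, 3, 4]:
            yield x
    # Keep only the "new" items, here faked as the even numbers.
    return [x async for x in new_items(source(), lambda n: n % 2 == 0)]

evens = asyncio.run(_demo())
```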



> Thanks,
> Yury
>
>
Yann


> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>

