[Python-ideas] PEP 525: Asynchronous Generators

Yury Selivanov yselivanov.ml at gmail.com
Sat Aug 6 12:39:01 EDT 2016


On 2016-08-06 4:29 AM, Stefan Behnel wrote:

> Yury Selivanov schrieb am 03.08.2016 um 17:32:
>> On 2016-08-03 2:45 AM, Stefan Behnel wrote:
>>
>> The 'agen' generator, at the lowest level of the generator
>> implementation, will yield two things -- 'spam', and a wrapped
>> 123 value.  Because 123 is wrapped, the async generator
>> machinery can distinguish async yields from normal yields.
> This is actually going to be tricky to backport for Cython (which supports
> Py2.6+) since it seems to depend on a globally known C implemented wrapper
> object type. We'd have to find a way to share that across different
> packages and also across different Cython versions (types are only shared
> within the same Cython version). I guess we'd have to store a reference to
> that type in some well hidden global place somewhere, and then never touch
> its implementation again...

I don't think you need to care about these details.  You can
implement async generators using the async iteration protocol
as defined in PEP 492.

You can compile a Cython async generator into an object that
implements the `am_aiter` slot (which should return `self`) and
the `am_anext` slot (which should return a coroutine-like object,
similar to what Cython compiles 'async def' into).  The `am_anext`
coroutine should "return" (i.e. raise `StopIteration`) when it
reaches an "async yield" point, or raise `StopAsyncIteration`
when the generator is exhausted.

Essentially, because async iterators implemented in pure Python
work fine with 'async for' and will continue to do so in 3.6, I
don't think there should be any technical obstacle to adding
asynchronous generators in Cython.
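
For illustration, here is a minimal pure-Python sketch of such an
object (the `Ticker` class, its delay, and the driver code are made
up for this example).  It implements the PEP 492 protocol directly,
which is essentially what a compiled Cython object would do via the
`am_aiter`/`am_anext` slots:

    import asyncio

    class Ticker:
        """Yield numbers from 0 to `to`, with a delay between them."""

        def __init__(self, delay, to):
            self._delay = delay
            self._it = iter(range(to))

        def __aiter__(self):              # the am_aiter slot: return self
            return self

        async def __anext__(self):        # the am_anext slot: a coroutine
            try:
                value = next(self._it)
            except StopIteration:
                # The iterator is exhausted -- stop the 'async for' loop.
                raise StopAsyncIteration
            await asyncio.sleep(self._delay)
            # Returning a value from the coroutine raises
            # StopIteration(value) under the hood -- the "async yield".
            return value

    async def main():
        async for i in Ticker(0.1, 3):    # prints 0, 1, 2
            print(i)

    asyncio.get_event_loop().run_until_complete(main())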

>
> Is that wrapper type going to be exposed anywhere in the Python visible
> world, or is it purely internal? (Not that I see a use case for making it
> visible to Python code...)

Yes, it's a purely internal C-level thing.  Python code will never
see it.

> BTW, why wouldn't "async yield from" work if the only distinction point is
> whether a yielded object is wrapped or not? That should work at any level
> of delegation, shouldn't it?

I can try ;)  Although when I was working on the PEP I had a
feeling that this wouldn't work without a serious refactoring of
genobject.c.

>> ^^ In the above example, when the 'foo()' is instantiated, there
>> is no loop or finalizers set up at all.  BUT since a loop (or
>> coroutine wrapper) is required to iterate async generators, there
>> is a strong guarantee that it *will* be present on the first iteration.
> Correct. And it also wouldn't help to generally extend the Async-Iterator
> protocol with an aclose() method because ending an (async-)for loop doesn't
> mean we are done with the async iterator, so this would just burden the
> users with unnecessary cleanup handling. That's unfortunate...

Well, we have the *exact same thing* with regular (synchronous)
iterators.

Let's say you have an iterable object (with `__iter__` and
`__next__`) which cleans up resources at the end of its
iteration.  If you don't implement its `__del__` properly
(or at all), those resources won't be cleaned up when the
object is partially iterated and then GCed.
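
As a sketch (the `Countdown` class and its stand-in "resource" are
invented for this example), here is a synchronous iterator that only
releases its resource when fully exhausted:

    OPEN_RESOURCES = set()   # stand-in for real resources (files, sockets)

    class Countdown:
        """Iterate n..1, releasing its "resource" only on exhaustion."""

        def __init__(self, n):
            self._n = n
            OPEN_RESOURCES.add(id(self))          # acquire the resource

        def __iter__(self):
            return self

        def __next__(self):
            if self._n <= 0:
                OPEN_RESOURCES.discard(id(self))  # full-iteration cleanup
                raise StopIteration
            self._n -= 1
            return self._n + 1

    it = Countdown(3)
    next(it)                  # partially consume the iterator...
    del it                    # ...then drop it: there is no __del__,
    print(OPEN_RESOURCES)     # so the "resource" is never released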

>> Regarding "async gen itself should know how to cleanup" -- that's not
>> possible.  An async gen could have an 'async with' block and then be
>> GCed (after being partially consumed).  Users won't expect to have to
>> do anything besides using try..finally or async with, so it's the
>> responsibility of the coroutine runner to clean up the async gen.
>> Hence 'aclose' is a coroutine, and hence this set_asyncgen_finalizer
>> API for coroutine runners.
>>
>> This is indeed the most magical part of the proposal.  Although it's
>> important to understand that the regular Python users will likely
>> never encounter this in their life -- finalizers will be set up
>> by the framework they use (asyncio, Tornado, Twisted, you name it).
> I think my main problem is that you keep speaking about event loops (of
> which There Can Be Only One, by design), whereas coroutines are a much more
> general concept and I cannot overlook all possible forms of using them in
> the future. What I would like to avoid is the case where we globally
> require setting up one finalizer handler (or context), and then prevent
> users from doing their own cleanup handling in some module context
> somewhere. It feels to me like there should be some kind of stacking for
> this (which in turn feels like a context manager) in order to support
> adapters and the like that need to do their own cleanup handling (or
> somehow intercept the global handling), regardless of what else is running.

We can call the thing that runs coroutines a "coroutine runner".
I'll update the PEP.

Regardless of the naming issues, I don't see any potential problem
with how `set_asyncgen_finalizer` is currently defined:

1. Because a finalizer (set with `set_asyncgen_finalizer`) is
assigned to an async generator on its first iteration, each async
generator is guaranteed to have the correct finalizer attached to it.

2. It's extremely unlikely that somebody will design a system that
switches coroutine runners *while async/awaiting a coroutine*.
There is no good reason to do this, and I doubt that it's even
possible.  But even in that unlikely case, you can easily stack
finalizers following this pattern:

    old_finalizer = sys.get_asyncgen_finalizer()
    sys.set_asyncgen_finalizer(my_finalizer)
    try:
        ...  # do my thing
    finally:
        sys.set_asyncgen_finalizer(old_finalizer)
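
If that pattern comes up often, a coroutine runner could wrap it in a
context manager.  This is just a sketch assuming the PEP's proposed
`sys.get_asyncgen_finalizer()` / `sys.set_asyncgen_finalizer()` API;
the helper's name is made up:

    import contextlib
    import sys

    @contextlib.contextmanager
    def scoped_asyncgen_finalizer(finalizer):
        # Install `finalizer` for the duration of the block, then
        # restore whatever finalizer was set before.
        old_finalizer = sys.get_asyncgen_finalizer()
        sys.set_asyncgen_finalizer(finalizer)
        try:
            yield
        finally:
            sys.set_asyncgen_finalizer(old_finalizer)

    # with scoped_asyncgen_finalizer(my_finalizer):
    #     ...  # do my thing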


Thanks,
Yury

