[Python-Dev] PEP 550 v4: coroutine policy

Tue Aug 29 16:06:44 EDT 2017

On Tue, Aug 29, 2017 at 12:32 PM, Antoine Pitrou <antoine at python.org> wrote:
>
>
> On 29/08/2017 at 21:18, Yury Selivanov wrote:
>> On Tue, Aug 29, 2017 at 2:40 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>>> On Mon, 28 Aug 2017 17:24:29 -0400
>>> Yury Selivanov <yselivanov.ml at gmail.com> wrote:
>>>> Long story short, I think we need to rollback our last decision to
>>>> prohibit context propagation up the call stack in coroutines.  In PEP
>>>> 550 v3 and earlier, the following snippet would work just fine:
>>>>
>>>>    var = new_context_var()
>>>>
>>>>    async def bar():
>>>>        var.set(42)
>>>>
>>>>    async def foo():
>>>>        await bar()
>>>>        assert var.get() == 42   # with previous PEP 550 semantics
>>>>
>>>>    run_until_complete(foo())
>>>>
>>>> But it would break if a user wrapped "await bar()" with "wait_for()":
>>>>
>>>>    var = new_context_var()
>>>>
>>>>    async def bar():
>>>>        var.set(42)
>>>>
>>>>    async def foo():
>>>>        await wait_for(bar(), 1)
>>>>        assert var.get() == 42  # AssertionError !!!
>>>>
>>>>    run_until_complete(foo())
>>>>
>>> [...]
>>
>>> Why wouldn't the bar() coroutine inherit
>>> the LC at the point it's instantiated (i.e. where the synchronous bar()
>>> call is done)?
>>
>> We want tasks to have their own isolated contexts.  When a task
>> is started, it runs its code in parallel with its "parent" task.
>
> I'm sorry, but I don't understand what it all means.
>
> To pose the question differently: why is example #1 supposed to be
> different, philosophically, than example #2?  Both spawn a coroutine,
> both wait for its execution to end.  There is no reason that adding a
> wait_for() intermediary (presumably because the user wants to add a
> timeout) would significantly change the execution semantics of bar().
>
>> wait_for() in the above example creates an asyncio.Task implicitly,
>> and that's why we don't see 'var' changed to '42' in foo().
>
> I don't understand why a non-obvious behaviour detail (the fact that
> wait_for() creates an asyncio.Task implicitly) should translate into a
> fundamental difference in observable behaviour.  I find it
> counter-intuitive and error-prone.

For better or worse, asyncio users generally need to be aware of the
distinction between coroutines/Tasks/Futures and which functions
create or return which -- it's essentially the same as the distinction
between running some code in the current thread versus spawning a new
thread to run it (and then possibly waiting for the result).
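
To make the thread analogy concrete, here is a minimal sketch using
thread-local storage. The names `local`, `bar`, `same_thread` and
`other_thread` are made up for illustration; only `threading.local`
and `threading.Thread` are real stdlib APIs.

```python
import threading

# Thread-local storage: each thread sees its own copy of the attributes.
local = threading.local()
local.value = None


def bar():
    # Sets the value only for whichever thread is running bar().
    local.value = 42


# Case 1: run bar() in the current thread. The change is visible here.
bar()
same_thread = local.value

# Case 2: reset, then run bar() in a new thread and wait for it.
local.value = None
worker = threading.Thread(target=bar)
worker.start()
worker.join()
# The worker changed its own copy, so the current thread still sees None.
other_thread = local.value

print(same_thread, other_thread)
```

Waiting for the thread with join() does not change the outcome: the
write happened in a different execution context.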

Mostly the docs tell you when a function converts a coroutine into a
Task, e.g. if you look at the docs for 'ensure_future' or 'wait_for'
or 'wait' they all say this explicitly. Or in some cases like 'gather'
and 'shield', it's implicit because they take arbitrary futures, and
creating a task is how you convert a coroutine into a future.

As a rule of thumb, I think it's accurate to say that any asyncio
function that takes a coroutine object as an argument converts it
into a Task.
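
A small sketch of that conversion, assuming a Python version with
asyncio.run() (3.7 or later). The coroutine `bar` and the helper
`main` are hypothetical names used only for this illustration.

```python
import asyncio


async def bar():
    return 42


async def main():
    coro = bar()
    # A bare coroutine object is not a Task.
    is_task_before = isinstance(coro, asyncio.Task)

    # ensure_future() wraps the coroutine in a Task and schedules it.
    task = asyncio.ensure_future(coro)
    is_task = isinstance(task, asyncio.Task)

    # gather() accepts coroutines too and hands back a future.
    gathered = asyncio.gather(bar())
    is_future = asyncio.isfuture(gathered)

    results = await gathered
    value = await task
    return is_task_before, is_task, is_future, value, results


outcome = asyncio.run(main())
print(outcome)
```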

>> This is a slightly complicated case, but it's addressable with a good
>> documentation and recommended best practices.
>
> It would be better addressed with consistent behaviour that doesn't rely
> on specialist knowledge, though :-/

This is the core of the Curio/Trio critique of asyncio: in asyncio,
operations that implicitly initiate concurrent execution are all over
the API. This is the root cause of asyncio's problems with buffering
and backpressure; it makes it hard to shut down properly (it's hard to
know when everything has finished running); it's related to the
"spooky cancellation at a distance" issue, where cancelling one Task
can cause another Task to get a cancelled exception; and so on. If you
use the recommended "high level" API for streams, then AFAIK it's
still impossible to close your streams properly at shutdown (you can
request that a close happen "sometime soon", but you can't tell when
it's finished) [1].

Obviously asyncio isn't going anywhere, so we should try to
solve/mitigate these issues where we can, but asyncio's API
fundamentally assumes that users will be very aware and careful about
which operations create which kinds of concurrency. So I sort of feel
like, if you can use asyncio at all, then you can handle wait_for
creating a new LC.
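
For readers who want to try the behaviour under discussion, here is a
rough sketch. It rests on two assumptions that are not part of the
PEP 550 draft: it uses the stdlib `contextvars` module (from the later
PEP 567) as a stand-in for new_context_var(), and it uses
ensure_future() instead of wait_for(), because ensure_future() is
documented to wrap a coroutine in a Task, while whether wait_for()
does so may depend on the Python version. The names `direct` and
`via_task` are made up.

```python
import asyncio
import contextvars

# Stand-in for new_context_var() from the PEP 550 examples (assumption).
var = contextvars.ContextVar("var", default=None)


async def bar():
    var.set(42)


async def direct():
    # bar() runs inside the same Task, so its change is visible here.
    await bar()
    return var.get()


async def via_task():
    # ensure_future() creates a new Task, which runs bar() in a copy
    # of the current context; the change stays in that copy.
    await asyncio.ensure_future(bar())
    return var.get()


direct_result = asyncio.run(direct())
task_result = asyncio.run(via_task())
print(direct_result, task_result)
```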

-n

[1] https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/#bug-3-closing-time

-- 
Nathaniel J. Smith -- https://vorpus.org