[Python-Dev] PEP 550 v4

Nathaniel Smith njs at pobox.com
Sat Aug 26 02:34:29 EDT 2017


On Fri, Aug 25, 2017 at 3:32 PM, Yury Selivanov <yselivanov.ml at gmail.com> wrote:
> Coroutines and Asynchronous Tasks
> ---------------------------------
>
> In coroutines, like in generators, context variable changes are local
> and are not visible to the caller::
>
>     import asyncio
>
>     var = new_context_var()
>
>     async def sub():
>         assert var.lookup() == 'main'
>         var.set('sub')
>         assert var.lookup() == 'sub'
>
>     async def main():
>         var.set('main')
>         await sub()
>         assert var.lookup() == 'main'
>
>     loop = asyncio.get_event_loop()
>     loop.run_until_complete(main())

I think this change is a bad idea. I think that generally, an async
call like 'await async_sub()' should have semantics equivalent to a
synchronous call like 'sync_sub()', except for the part where the
former is able to contain yields. Giving every coroutine an LC breaks
that equivalence. It also means that in async code, you can't
necessarily refactor by moving code in and out of subroutines. Like,
if we inline 'sub' into 'main', that shouldn't change the semantics,
but...

async def main():
    var.set('main')
    # inlined copy of sub()
    assert var.lookup() == 'main'
    var.set('sub')
    assert var.lookup() == 'sub'
    # end of inlined copy
    assert var.lookup() == 'main'   # fails

It also adds non-trivial overhead, because now lookup() is O(depth of
async callstack), instead of O(depth of (async) generator nesting),
which is generally much smaller.
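
To make the cost concrete, here's a rough sketch of what lookup() has
to do under this design (the names and the representation are
illustrative, not the PEP's actual implementation):

def lookup(ec_stack, var, default=None):
    # The execution context is a stack of logical contexts (one per
    # generator today; one per coroutine frame under this change).
    # We scan from innermost to outermost, so the cost is linear in
    # the number of LCs pushed.
    for lc in reversed(ec_stack):
        if var in lc:
            return lc[var]
    return default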

I think I see the motivation: you want to make

   await sub()

and

   await ensure_future(sub())

have the same semantics, right? And the latter has to create a Task
and split it off into a new execution context, so you want the former
to do so as well? But to me this is like saying that we want

   sync_sub()

and

   thread_pool_executor.submit(sync_sub).result()

to have the same semantics: they mostly do, but if sync_sub()
accesses thread-locals then they won't. Oh well. That's perhaps a bit
unfortunate, but it doesn't mean we should give every synchronous
frame its own thread-locals.
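
To make the analogy concrete, here's the thread-local version (plain
stdlib code, nothing PEP-specific):

import threading
from concurrent.futures import ThreadPoolExecutor

tls = threading.local()

def sync_sub():
    # Reads a thread-local; returns None if unset in this thread.
    return getattr(tls, 'value', None)

tls.value = 'main'
print(sync_sub())  # 'main': direct call, same thread

with ThreadPoolExecutor() as pool:
    # None: the worker thread has its own (empty) thread-locals
    print(pool.submit(sync_sub).result())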

(And fwiw I'm still not convinced we should give up on 'yield from' as
a mechanism for refactoring generators.)

> To establish the full semantics of execution context in coroutines,
> we must also consider *tasks*.  A task is the abstraction used by
> *asyncio*, and other similar libraries, to manage the concurrent
> execution of coroutines.  In the example above, a task is created
> implicitly by the ``run_until_complete()`` function.
> ``asyncio.wait_for()`` is another example of implicit task creation::
>
>     async def sub():
>         await asyncio.sleep(1)
>         assert var.lookup() == 'main'
>
>     async def main():
>         var.set('main')
>
>         # waiting for sub() directly
>         await sub()
>
>         # waiting for sub() with a timeout
>         await asyncio.wait_for(sub(), timeout=2)
>
>         var.set('main changed')
>
> Intuitively, we expect the assertion in ``sub()`` to hold true in both
> invocations, even though the ``wait_for()`` implementation actually
> spawns a task, which runs ``sub()`` concurrently with ``main()``.

I found this example confusing -- you talk about sub() and main()
running concurrently, but ``wait_for`` blocks main() until sub() has
finished running, right? Is this just supposed to show that there
should be some sort of inheritance across tasks, and then the next
example is to show that it has to be a copy rather than sharing the
actual object? (This is just an issue of phrasing/readability.)

> The ``sys.run_with_logical_context()`` function performs the following
> steps:
>
> 1. Push *lc* onto the current execution context stack.
> 2. Run ``func(*args, **kwargs)``.
> 3. Pop *lc* from the execution context stack.
> 4. Return or raise the ``func()`` result.
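
Presumably steps 3 and 4 together mean that the pop happens even if
func() raises. A minimal sketch under that reading (the accessor name
is hypothetical):

def run_with_logical_context(lc, func, *args, **kwargs):
    ec_stack = _current_ec_stack()    # hypothetical accessor
    ec_stack.append(lc)               # step 1: push lc
    try:
        return func(*args, **kwargs)  # steps 2 and 4
    finally:
        ec_stack.pop()                # step 3: pop lc, even on error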

It occurs to me that both this and the way generators/coroutines
expose their logical context mean that logical context objects are
semantically mutable. This could create weird effects if someone
attaches the same LC to two different generators, or tries to use it
simultaneously in two different threads, etc. We should have a little
interlock like a generator's ag_running, where an LC keeps track of
whether it's currently in use, and if you try to push the same LC
onto two ECs simultaneously then it errors out.
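
Roughly, with hypothetical names ("already in use" here playing the
same role as "ValueError: generator already executing"):

class LogicalContext:
    def __init__(self):
        self._vars = {}
        self._in_use = False  # the interlock flag

def run_with_logical_context(lc, func, *args, **kwargs):
    if lc._in_use:
        raise RuntimeError('logical context is already in use')
    lc._in_use = True
    ec_stack = _current_ec_stack()  # hypothetical accessor
    ec_stack.append(lc)
    try:
        return func(*args, **kwargs)
    finally:
        ec_stack.pop()
        lc._in_use = False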

> For efficient access in performance-sensitive code paths, such as in
> ``numpy`` and ``decimal``, we add a cache to ``ContextVar.get()``,
> making it an O(1) operation when the cache is hit.  The cache key is
> composed from the following:
>
> * The new ``uint64_t PyThreadState->unique_id``, which is a globally
>   unique thread state identifier.  It is computed from the new
>   ``uint64_t PyInterpreterState->ts_counter``, which is incremented
>   whenever a new thread state is created.
>
> * The ``uint64_t ContextVar->version`` counter, which is incremented
>   whenever the context variable value is changed in any logical context
>   in any thread.

I'm pretty sure you need to also invalidate on context push/pop. Consider:

var = new_context_var()

def gen():
    var.set("gen")
    var.lookup()  # cache now holds "gen"
    yield
    print(var.lookup())

def main():
    var.set("main")
    g = gen()
    next(g)
    # This should print "main", but it's the same thread and the last
    # call to set() was the one inside gen(), so we get the cached
    # "gen" instead
    print(var.lookup())
    var.set("no really main")
    var.lookup()  # cache now holds "no really main"
    next(g)  # should print "gen" but instead prints "no really main"

> The cache is then implemented as follows::
>
>     class ContextVar:
>
>         def set(self, value):
>             ...  # implementation
>             self.version += 1
>
>
>         def get(self):

I think you missed a s/get/lookup/ here :-)
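
One way to handle the invalidation problem above (sketch only, with
hypothetical field names): also key the cache on a per-thread counter
that's bumped on every LC push or pop, so that crossing a generator
boundary invalidates any cached value:

class ContextVar:
    def __init__(self):
        self.version = 0
        self._cache_key = None
        self._cached_value = None

    def set(self, value):
        ...  # store value in the topmost logical context
        self.version += 1

    def lookup(self):
        ts = _current_thread_state()  # hypothetical accessor
        # ts.ec_changes: hypothetical counter incremented on every
        # LC push/pop in this thread, including generator suspend
        # and resume.
        key = (ts.unique_id, ts.ec_changes, self.version)
        if self._cache_key == key:
            return self._cached_value
        value = _uncached_lookup(self)  # the O(depth) stack walk
        self._cache_key, self._cached_value = key, value
        return value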

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

