Notice: While Javascript is not essential for this website, your interaction with the content will be limited. Please turn Javascript on for the full experience.

PEP 567 -- Context Variables

PEP:567
Title:Context Variables
Author:Yury Selivanov <yury at magic.io>
Status:Draft
Type:Standards Track
Created:12-Dec-2017
Python-Version:3.7
Post-History:12-Dec-2017, 28-Dec-2017

Abstract

This PEP proposes a new contextvars module and a set of new CPython C APIs to support context variables. This concept is similar to thread-local storage (TLS), but, unlike TLS, it also allows correctly keeping track of values per asynchronous task, e.g. asyncio.Task.

This proposal is a simplified version of PEP 550. The key difference is that this PEP is concerned only with solving the case for asynchronous tasks, not for generators. There are no proposed modifications to any built-in types or to the interpreter.

This proposal is not strictly related to Python Context Managers. Although it does provide a mechanism that can be used by Context Managers to store their state.

Rationale

Thread-local variables are insufficient for asynchronous tasks that execute concurrently in the same OS thread. Any context manager that saves and restores a context value using threading.local() will have its context values bleed to other code unexpectedly when used in async/await code.

A few examples where having a working context local storage for asynchronous code is desirable:

  • Context managers like decimal contexts and numpy.errstate.
  • Request-related data, such as security tokens and request data in web applications, language context for gettext, etc.
  • Profiling, tracing, and logging in large code bases.

Introduction

The PEP proposes a new mechanism for managing context variables. The key classes involved in this mechanism are contextvars.Context and contextvars.ContextVar. The PEP also proposes some policies for using the mechanism around asynchronous tasks.

The proposed mechanism for accessing context variables uses the ContextVar class. A module (such as decimal) that wishes to store a context variable should:

  • declare a module-global variable holding a ContextVar to serve as a key;
  • access the current value via the get() method on the key variable;
  • modify the current value via the set() method on the key variable.

The notion of "current value" deserves special consideration: different asynchronous tasks that exist and execute concurrently may have different values for the same key. This idea is well-known from thread-local storage but in this case the locality of the value is not necessarily bound to a thread. Instead, there is the notion of the "current Context" which is stored in thread-local storage, and is accessed via contextvars.copy_context() function. Manipulation of the current Context is the responsibility of the task framework, e.g. asyncio.

A Context is conceptually a read-only mapping, implemented using an immutable dictionary. The ContextVar.get() method does a lookup in the current Context with self as a key, raising a LookupError or returning a default value specified in the constructor.

The ContextVar.set(value) method clones the current Context, assigns the value to it with self as a key, and sets the new Context as the new current Context.

Specification

A new standard library module contextvars is added with the following APIs:

  1. copy_context() -> Context function is used to get a copy of the current Context object for the current OS thread.
  2. ContextVar class to declare and access context variables.
  3. Context class encapsulates context state. Every OS thread stores a reference to its current Context instance. It is not possible to control that reference manually. Instead, the Context.run(callable, *args, **kwargs) method is used to run Python code in another context.

contextvars.ContextVar

The ContextVar class has the following constructor signature: ContextVar(name, *, default=_NO_DEFAULT). The name parameter is used only for introspection and debug purposes, and is exposed as a read-only ContextVar.name attribute. The default parameter is optional. Example:

# Declare a context variable 'var' with the default value 42.
var = ContextVar('var', default=42)

(The _NO_DEFAULT is an internal sentinel object used to detect if the default value was provided.)

ContextVar.get() returns a value for context variable from the current Context:

# Get the value of `var`.
var.get()

ContextVar.set(value) -> Token is used to set a new value for the context variable in the current Context:

# Set the variable 'var' to 1 in the current context.
var.set(1)

ContextVar.reset(token) is used to reset the variable in the current context to the value it had before the set() operation that created the token:

assert var.get(None) is None

token = var.set(1)
try:
    ...
finally:
    var.reset(token)

assert var.get(None) is None

ContextVar.reset() method is idempotent and can be called multiple times on the same Token object: second and later calls will be no-ops.

contextvars.Token

contextvars.Token is an opaque object that should be used to restore the ContextVar to its previous value, or remove it from the context if the variable was not set before. It can be created only by calling ContextVar.set().

For debug and introspection purposes it has:

  • a read-only attribute Token.var pointing to the variable that created the token;
  • a read-only attribute Token.old_value set to the value the variable had before the set() call, or to Token.MISSING if the variable wasn't set before.

Having the ContextVar.set() method returning a Token object and the ContextVar.reset(token) method, allows context variables to be removed from the context if they were not in it before the set() call.

contextvars.Context

Context object is a mapping of context variables to values.

Context() creates an empty context. To get a copy of the current Context for the current OS thread, use the contextvars.copy_context() method:

ctx = contextvars.copy_context()

To run Python code in some Context, use Context.run() method:

ctx.run(function)

Any changes to any context variables that function causes will be contained in the ctx context:

var = ContextVar('var')
var.set('spam')

def function():
    assert var.get() == 'spam'

    var.set('ham')
    assert var.get() == 'ham'

ctx = copy_context()

# Any changes that 'function' makes to 'var' will stay
# isolated in the 'ctx'.
ctx.run(function)

assert var.get() == 'spam'

Any changes to the context will be contained in the Context object on which run() is called on.

Context.run() is used to control in which context asyncio callbacks and Tasks are executed. It can also be used to run some code in a different thread in the context of the current thread:

executor = ThreadPoolExecutor()
current_context = contextvars.copy_context()

executor.submit(
    lambda: current_context.run(some_function))

Context objects implement the collections.abc.Mapping ABC. This can be used to introspect context objects:

ctx = contextvars.copy_context()

# Print all context variables and their values in 'ctx':
print(ctx.items())

# Print the value of 'some_variable' in context 'ctx':
print(ctx[some_variable])

asyncio

asyncio uses Loop.call_soon(), Loop.call_later(), and Loop.call_at() to schedule the asynchronous execution of a function. asyncio.Task uses call_soon() to run the wrapped coroutine.

We modify Loop.call_{at,later,soon} and Future.add_done_callback() to accept the new optional context keyword-only argument, which defaults to the current context:

def call_soon(self, callback, *args, context=None):
    if context is None:
        context = contextvars.copy_context()

    # ... some time later
    context.run(callback, *args)

Tasks in asyncio need to maintain their own context that they inherit from the point they were created at. asyncio.Task is modified as follows:

class Task:
    def __init__(self, coro):
        ...
        # Get the current context snapshot.
        self._context = contextvars.copy_context()
        self._loop.call_soon(self._step, context=self._context)

    def _step(self, exc=None):
        ...
        # Every advance of the wrapped coroutine is done in
        # the task's context.
        self._loop.call_soon(self._step, context=self._context)
        ...

C API

  1. PyContextVar * PyContextVar_New(char *name, PyObject *default): create a ContextVar object.

  2. int PyContextVar_Get(PyContextVar *, PyObject *default_value, PyObject **value): return -1 if an error occurs during the lookup, 0 otherwise. If a value for the context variable is found, it will be set to the value pointer. Otherwise, value will be set to default_value when it is not NULL. If default_value is NULL, value will be set to the default value of the variable, which can be NULL too. value is always a new reference.

  3. PyContextToken * PyContextVar_Set(PyContextVar *, PyObject *): set the value of the variable in the current context.

  4. PyContextVar_Reset(PyContextVar *, PyContextToken *): reset the value of the context variable.

  5. PyContext * PyContext_New(): create a new empty context.

  6. PyContext * PyContext_Copy(): get a copy of the current context.

  7. int PyContext_Enter(PyContext *) and int PyContext_Exit(PyContext *) allow to set and restore the context for the current OS thread. It is required to always restore the previous context:

    PyContext *old_ctx = PyContext_Copy();
    if (old_ctx == NULL) goto error;
    
    if (PyContext_Enter(new_ctx)) goto error;
    
    // run some code
    
    if (PyContext_Exit(old_ctx)) goto error;
    

Implementation

This section explains high-level implementation details in pseudo-code. Some optimizations are omitted to keep this section short and clear.

For the purposes of this section, we implement an immutable dictionary using dict.copy():

class _ContextData:

    def __init__(self):
        self._mapping = dict()

    def get(self, key):
        return self._mapping[key]

    def set(self, key, value):
        copy = _ContextData()
        copy._mapping = self._mapping.copy()
        copy._mapping[key] = value
        return copy

    def delete(self, key):
        copy = _ContextData()
        copy._mapping = self._mapping.copy()
        del copy._mapping[key]
        return copy

Every OS thread has a reference to the current _ContextData. PyThreadState is updated with a new context_data field that points to a _ContextData object:

class PyThreadState:
    context_data: _ContextData

contextvars.copy_context() is implemented as follows:

def copy_context():
    ts : PyThreadState = PyThreadState_Get()

    if ts.context_data is None:
        ts.context_data = _ContextData()

    ctx = Context()
    ctx._data = ts.context_data
    return ctx

contextvars.Context is a wrapper around _ContextData:

class Context(collections.abc.Mapping):

    def __init__(self):
        self._data = _ContextData()

    def run(self, callable, *args, **kwargs):
        ts : PyThreadState = PyThreadState_Get()
        saved_data : _ContextData = ts.context_data

        try:
            ts.context_data = self._data
            return callable(*args, **kwargs)
        finally:
            self._data = ts.context_data
            ts.context_data = saved_data

    # Mapping API methods are implemented by delegating
    # `get()` and other Mapping calls to `self._data`.

contextvars.ContextVar interacts with PyThreadState.context_data directly:

class ContextVar:

    def __init__(self, name, *, default=_NO_DEFAULT):
        self._name = name
        self._default = default

    @property
    def name(self):
        return self._name

    def get(self, default=_NO_DEFAULT):
        ts : PyThreadState = PyThreadState_Get()
        data : _ContextData = ts.context_data

        try:
            return data.get(self)
        except KeyError:
            pass

        if default is not _NO_DEFAULT:
            return default

        if self._default is not _NO_DEFAULT:
            return self._default

        raise LookupError

    def set(self, value):
        ts : PyThreadState = PyThreadState_Get()
        data : _ContextData = ts.context_data

        try:
            old_value = data.get(self)
        except KeyError:
            old_value = Token.MISSING

        ts.context_data = data.set(self, value)
        return Token(self, old_value)

    def reset(self, token):
        if token._used:
            return

        if token._old_value is Token.MISSING:
            ts.context_data = data.delete(token._var)
        else:
            ts.context_data = data.set(token._var,
                                       token._old_value)

        token._used = True


class Token:

    MISSING = object()

    def __init__(self, var, old_value):
        self._var = var
        self._old_value = old_value
        self._used = False

    @property
    def var(self):
        return self._var

    @property
    def old_value(self):
        return self._old_value

Implementation Notes

  • The internal immutable dictionary for Context is implemented using Hash Array Mapped Tries (HAMT). They allow for O(log N) set operation, and for O(1) copy_context() function, where N is the number of items in the dictionary. For a detailed analysis of HAMT performance please refer to PEP 550 [1].
  • ContextVar.get() has an internal cache for the most recent value, which allows to bypass a hash lookup. This is similar to the optimization the decimal module implements to retrieve its context from PyThreadState_GetDict(). See PEP 550 which explains the implementation of the cache in a great detail.

Summary of the New APIs

  • A new contextvars module with ContextVar, Context, and Token classes, and a copy_context() function.
  • asyncio.Loop.call_at(), asyncio.Loop.call_later(), asyncio.Loop.call_soon(), and asyncio.Future.add_done_callback() run callback functions in the context they were called in. A new context keyword-only parameter can be used to specify a custom context.
  • asyncio.Task is modified internally to maintain its own context.

Design Considerations

Why contextvars.Token and not ContextVar.unset()?

The Token API allows to get around having a ContextVar.unset() method, which is incompatible with chained contexts design of PEP 550. Future compatibility with PEP 550 is desired (at least for Python 3.7) in case there is demand to support context variables in generators and asynchronous generators.

The Token API also offers better usability: the user does not have to special-case absence of a value. Compare:

token = cv.get()
try:
    cv.set(blah)
    # code
finally:
    cv.reset(token)

with:

_deleted = object()
old = cv.get(default=_deleted)
try:
    cv.set(blah)
    # code
finally:
    if old is _deleted:
        cv.unset()
    else:
        cv.set(old)

Rejected Ideas

Replication of threading.local() interface

Please refer to PEP 550 where this topic is covered in detail: [2].

Backwards Compatibility

This proposal preserves 100% backwards compatibility.

Libraries that use threading.local() to store context-related values, currently work correctly only for synchronous code. Switching them to use the proposed API will keep their behavior for synchronous code unmodified, but will automatically enable support for asynchronous code.

Reference Implementation

The reference implementation can be found here: [3].

Source: https://github.com/python/peps/blob/master/pep-0567.rst