[Web-SIG] Draft 2: WSGI Response Upgrade Bridging

PJ Eby pje at telecommunity.com
Tue Oct 7 01:15:49 CEST 2014


Based on last week's public and private feedback on my "native server
APIs" pre-PEP, I've done an almost complete rewrite of the previous
draft, in order to provide *concrete examples* of the proposal in use,
along with code samples for Django and WebOb, as well as both HTTP/2
and Websockets.

The rationale has also been overhauled, and there is a new "Next
Steps" section, plus a new discussion of how .close() is affected.  In
addition, there is a new explanation and example of how to use this
proposal to build future standardized APIs atop currently-available
native APIs, using straightforward middleware.

As before, you can find a "living" HTML version of the draft in progress at:

   https://gist.github.com/pjeby/62e3892cd75257518eb0

(In addition to nice formatting, it also has a clickable table of contents.)

After the next round of feedback, I plan to convert this to reST and
get a PEP number assigned -- assuming nobody comes up with a killer
problem that sends me back to the drawing board, of course.  ;-)


# WSGI Response Upgrade Bridging

### Contents

* Overview
  * The Problems
  * The Proposed Solution
  * Example Usage Scenarios
    * Example 1: HTTP/2 Response Pushing from inside Django
    * Example 2: Websocket Chat from inside a WebOb-based Framework
  * Proposal Scope
* Specification
  * Providing an API
    * Response Key Details
    * Closing and Resource Management
  * Accessing an API
  * Intercepting, Disabling, or Upgrading API Bridges
* Next Steps
  * Open Questions and Issues
  * Notes on the Current Design Rationale
* Acknowledgements
* References
* Copyright


# Overview

## The Problems

Current Python web frameworks and applications are built mostly on
WSGI: a request/response API based on HTTP/1.0's simple
request-per-connection model.  Web libraries and frameworks offer a
wide variety of services for request routing, session management,
authentication and authorization, etc., based on this model and
working with WSGI.

Modern web protocols, however, including Websockets, HTTP/2, SPDY, and
so on, are based on a more sophisticated communication model that
*doesn't* fit very well within WSGI.  (For that matter, WSGI doesn't
play well with Twisted or `asyncio`-style asynchronous APIs, either.)

Other web API standards have been proposed or are in the process of
being developed, but they are between a rock and a hard place, in that
if they aim for compatibility with WSGI, then it is harder to provide
new features, but if they focus on providing new features, then
compatibility with existing frameworks, middleware, etc. is limited.

At the same time, server developers are stuck in something of a
holding pattern.  Their servers may have (or want to add) new
features, but what if they invest in a proposed API that doesn't pan
out?  Conversely, what if they get stuck needing to support multiple
APIs?

Meanwhile, application developers face their own dilemma as well:
since existing Websocket and HTTP/2 APIs cannot be easily (and
compatibly) accessed from within WSGI, they are unable to use their
application and frameworks' existing code for managing routing,
sessions, authentication, authorization, etc., when making use of
either Websockets or HTTP/2.  Instead, they must duplicate code, or
else use sideband communications (e.g. via redis) to link between a
server with the needed API and the main code of their application.

But what if we could cut through *all three* of these dilemmas, in a
way that would let us have our existing framework "cake", and get to
"eat" our advanced protocols, too?

## The Proposed Solution

Since the majority of existing WSGI framework and middleware tools
deal mainly with the WSGI *request*, what if we could keep using WSGI
to handle our requests, but use a *different* API for the responses?

In fact, what if we could use that different API, *only* for the
responses that actually needed it, on a request-by-request basis?
That way, for example, we could still use our existing middleware or
framework code to make sure that a session has been established,
authentication and authorization have been handled, and so on.

Then, our existing framework and app code could send their existing
login redirects and error pages.  But, once everything is logged in
and ready to go, we could finally switch over to that other API, to
send the *real* response -- and still have access to our user objects,
routing parameters, etc., within that other API.

And this "real" response wouldn't have to be a single HTTP response,
either.  It could be a handler of some kind, sending or receiving
packets of information via websockets, HTTP/2 push, the `asyncio` API,
or whatever other specialized response APIs are available in the WSGI
environment.

What's more, if we could ask for these "other APIs" by *name*, then we
could begin using these other APIs today, right now...  *and* still
define standardized Python APIs for these features later.  And,
developers of these other APIs wouldn't have to convince people to
switch away from WSGI, nor struggle to come up with clever ways to
"tunnel" their APIs through WSGI in a compatible way.

Therefore, this PEP proposes a mechanism akin to HTTP's `Upgrade:`
process, to allow an existing web framework and/or middleware to
handle the initial incoming HTTP request and select an
application/controller/view/etc., invoking it with information
obtained from the request.

Then, when it's time to respond to the request, the running
application can choose to upgrade or "bridge" to using a more advanced
API to handle the response (and possibly continue to manage an ongoing
connection, depending on the nature of the protocols involved).

(But, if the request doesn't *need* any special handling, the
application can simply issue a standard WSGI response, however it
currently does that.  So only the parts of an application that *need*
this special handling ever have to use it.)


## Example Usage Scenarios

Below are two code samples, showing different use cases, different
frameworks, and different "upgraded" APIs.  In each case, there is an
outer piece of framework-specific code (the **request handler**), and
an inner piece of non-framework, API-specific code (the **response
handler**).

To link the two handlers, a small bit of bridging code (shown in these
examples as `request.upgrade_to()`) is used to request a desired API
by name, register the response handler, and return a **bridging
response**: a special WSGI response that tells the server to invoke
the response handler in its place.

Please note, however, that these are *use case illustrations* only.
This proposal does not specify *any* of the APIs shown in these
examples, including the `request.upgrade_to()` method itself!

Also, depending on the framework and API involved, the request and
response handlers could be functions, methods, instances, classes, or
something else altogether.  A framework might not provide an
`upgrade_to()` API of its own (or spell it differently) and an
application developer always has the option of creating their own
version of it as a utility function.  (An example implementation will
also be shown later in this spec.)


### Example 1: HTTP/2 Response Pushing from inside Django

    def main_view(request):
        def http2_handler(server):
            server.push(path='/css/myApp.css', ...)
            server.push(path='/js/myApp.js', ...)
            server.send_response(status=200,
                           headers = [('content-type', 'text/plain')],
                           body='Hello world!'.encode('ascii'))
        return request.upgrade_to('http2', http2_handler)

This example shows a relatively simple use case: adding pushed files
to an HTTP response.  The assumption here is that any routing,
authentication, etc. have been handled by Django by the time the above
code runs, and so it just needs to send a response using some
non-WSGI/non-Django API: a hypothetical API named `http2`.

The hypothetical `request.upgrade_to(api_name, *args, **kw)` method
takes a desired API name, looks it up in the WSGI environment, and
invokes it to create a **bridging response**: a special response that
tells the WSGI server to use the registered response handler to
perform the response, bypassing any middleware that doesn't alter or
replace this response.

(Again, please note that the actual `http2` API shown is a purely
hypothetical illustration, loosely based on the [nghttp2] API; this
proposal only covers the *behavior* of `request.upgrade_to()`, and not
its existence or spelling, let alone the behavior of Django or
`nghttp2`.)


### Example 2: Websocket Chat from inside a WebOb-based Framework

This next example is more complex, demonstrating how response upgrade
bridging can be used to switch to a "conversational" or
packet-oriented protocol such as Websockets:

    @someframework.route('/chat/:room_id')   # route to the request handler
    def chat(self, request, room_id):
        # code here looks up room, user, etc.
        # can redirect to login/registration
        # validate room existence, etc.
        # using the web framework's request and other tools
        ...
        # Ready to chat? Define a handler for the websocket API:
        def websocket_handler(sock):
            # code here has access to request/room
            # *plus* whatever it gets passed by the websocket API

            sock.send("Welcome to the %s room, %s" % (room.name, user.name))
            room.sockets[user.name] = sock

            def sendall(msg):
                data = msg.encode('utf8')
                for s in room.sockets.values():
                    s.send(data)

            sendall("%s has entered the chat room" % user.name)

            @sock.on_receive
            def receive_handler(data):
                sendall("%s: %s" % (user.name, data.decode('utf8')))

            @sock.on_close
            def close_handler():
                if room.sockets.get(user.name) is sock:
                    del room.sockets[user.name]

            # etc...

        return request.upgrade_to('websockets', websocket_handler)

Again, note that this `websockets` API is purely hypothetical; the
point of this illustration is merely to show that response-upgrade
bridging isn't limited to synchronous control flow or a single
request-response pair.  Upgraded response APIs can be event driven,
callback-based, generator-oriented, or almost anything at all.

So, both of these examples show:

1. An outer function, used as a **request handler**
2. An inner function, used as a **response handler**, and
3. A `request.upgrade_to()` function, used to register the response
handler and generate a **bridging response**

Please note again that *none* of these three parts have to be
implemented in the ways shown above.  The request handler could have
been a class, instance, or method, depending on the web framework in
use, and the same is true for the response handler, depending on the
API being bridged to.  (And, as previously mentioned,
`request.upgrade_to()` is a short bit of glue code that can be written
by hand.)


## Proposal Scope

Goals of this proposal include:

1. Defining a way for WSGI applications, at runtime (i.e., during the
execution of a request), to detect the existence of, and access,
upgraded non-WSGI server APIs which can be used in place of WSGI for
either effecting a response to the current request, or initiating a
more advanced communications protocol (such as websocket connections,
associated content pushing, etc.) as an upgrade to the current
request.

2. Defining ways for WSGI middleware to:

  1. Continue to be used for request routing and other pre-response
activities for all requests, as well as post-response activities for
requests that do not require bridged API access

  2. Intercept and assume control of any bridged APIs to be used by
wrapped applications or subrequests (assuming the middleware knows how
to do this for a specific bridged API, and desires to do so)

  3. Disable any or even *all* bridged API access by its wrapped apps
-- even without prior knowledge of *which* APIs might be used -- in
the event that the middleware can only perform its intended function
by denying such access

3. Defining a way for WSGI servers to negotiate a smooth transition of
response handling between standard WSGI and their native API, while
safely detecting whether intervening middleware has taken over or
altered the response in a way that conflicts with elevating the
current request to native API processing

Non-goals include:

* Actually defining any specification for the bridged APIs themselves  ;-)


# Specification

The basic idea of this specification is to add a dictionary to the
WSGI environment, under the key `wsgi.upgrades`.  Within this
dictionary, a single ASCII string key is allocated for each non-WSGI
API offered by the server (or implemented via middleware).

So, for example, if Twisted were to offer an upgrade bridge, it might
register a `twisted` key within the `wsgi.upgrades` dictionary.  And
if uWSGI were to offer a websocket API bridge, it might register a
`uwsgi.websocket` key (perhaps conditionally on whether the current
request included a websocket upgrade header), and so on.

The registered key in the `wsgi.upgrades` dictionary MUST be an ASCII
string containing a dot-separated sequence of one or more valid Python
identifiers.  (So, `http2` and `http.v2` are valid API keys, but
`http.2` and `http/2` are NOT.)
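As a rough illustration of the key rule above, a check along the following lines could be used.  The function name `is_valid_api_key` is hypothetical and not part of this proposal, and the sketch assumes that reserved words need no special treatment (a point the text above does not address):

```python
import re

# One ASCII-only Python identifier.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

# One or more identifiers, separated by single dots, and nothing else.
_API_KEY = re.compile(r"%s(?:\.%s)*\Z" % (_IDENT, _IDENT))


def is_valid_api_key(name):
    """Return True if `name` is a dot-separated sequence of identifiers."""
    return isinstance(name, str) and _API_KEY.match(name) is not None
```

With this check, `http2`, `http.v2`, and `uwsgi.websocket` are accepted, while `http.2` and `http/2` are rejected.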

The registered value, on the other hand, is a callable used to create
a bridge between a web application's request handler, and a handler
for the upgraded (non-WSGI, non-web framework) API.


## Providing an API

The implementation of an upgrade bridge consists of a callable object,
looking something like this pseudocode:

    def some_api_bridge(environ, start_response, XXX...):
        response_key = new_unique_header_compatible_string()
        current_request.response_registry[response_key] = XXX...
        start_response('399 WSGI-Bridge: '+response_key, [
            ('Content-Type', 'application/x-wsgi-bridge; id='+response_key),
            ('Content-Length', str(len(response_key)))
        ])
        return [response_key]

    environ.setdefault('wsgi.upgrades',{})['some_api'] = some_api_bridge

As you can see, this is a little bit like a WSGI application -- and in
fact it *is* a valid WSGI application, except that one or more
positional or keyword arguments (shown here as `XXX...`) are included
after the standard WSGI ones, to specify details of the desired
response handler.  Depending on the needs of the API, these arguments
could be a single "handler" callback, or they could be multiple
objects, callbacks, or configuration values.

The upgrade bridge's job is simply to generate a unique ASCII "native
string" key to be used in the bridging response as a substitute for
these additional arguments, and to register these arguments under that
key for future use by the server.  Finally, the bridge sends a WSGI
response as shown above, with the status, headers, and body all
containing the generated response key.
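For readers who prefer something they can run, here is a minimal sketch of the same idea.  The `make_bridge` factory, the plain-dict `registry`, and the `name-counter` key format are all assumptions made for illustration; a real server would keep its registry per request.  The body is encoded to bytes, as PEP 3333 requires on Python 3:

```python
import itertools

# Process-wide counter used to keep generated keys unique (an assumption).
_key_counter = itertools.count(1)


def make_bridge(api_name, registry):
    """Build an upgrade bridge that records handler arguments in `registry`."""

    def bridge(environ, start_response, *args, **kw):
        # Generate a key, and remember the handler arguments under it.
        key = "%s-%d" % (api_name, next(_key_counter))
        registry[key] = (args, kw)
        # Put the key in the status, the headers, and the body.
        start_response("399 WSGI-Bridge: " + key, [
            ("Content-Type", "application/x-wsgi-bridge; id=" + key),
            ("Content-Length", str(len(key))),
        ])
        return [key.encode("ascii")]

    return bridge
```

A server (or test harness) might then offer it with something like `environ.setdefault('wsgi.upgrades', {})['some_api'] = make_bridge('some_api', registry)`.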

The server MUST NOT actually invoke or begin using the provided
handler until *after* the standard WSGI response process has been
completed, and it has verified that the response key is *still
present* in all three parts of the WSGI response: the status, headers,
and body.

The continued presence of the response key is used to verify three things:

1. That the registered response handler is indeed a response to the
original incoming request, and not merely a response to a subrequest
created by middleware

2. That intervening middleware hasn't replaced the bridging response
with a response of its own (for example, an error response created
because of an error occurring after the bridged handler was
registered, but before it was used)

3. *Which* response handler should be invoked, if more than one was registered

So, a server providing an upgrade bridge MUST wait until it receives a
WSGI response whose status, content-type, and body all unequivocally
identify which of the response handlers registered for the current
request should actually be used.

In the event that the status, type, and body all match each other, the
server MUST then activate the registered response handler for that
key, allowing the current request (and possibly subsequent requests,
depending on the API involved) to be handled via the associated API.
(It also MUST discard any other registered response handlers for the
current request.)

In the event that neither the status nor headers designate a
registered response handler, the server MUST treat the response as a
standard WSGI response, and discard all registered response handlers
for the current request.

In the event that the status and headers disagree on *which* handler
is to be used (or *whether* one is to be used at all), or in the event
that they *do* agree, but the body disagrees with them, or if all
three agree but the supplied ID was not registered for this request or
API, then the server MUST generate an error response, and discard both
the WSGI response and any registered handlers.  (In the face of
ambiguity, refuse the temptation to guess; errors should not pass
silently.)
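The three outcomes described above (activate a handler, fall back to plain WSGI, or fail) can be sketched as a single server-side check.  Everything named here (`classify_response`, `BridgeConflict`, and the convention of returning `None` for a plain WSGI response) is hypothetical; the sketch also assumes the server has already collected the response body into one bytes object, and that it turns the exception into an error response:

```python
STATUS_PREFIX = "399 WSGI-Bridge: "
BRIDGE_TYPE = "application/x-wsgi-bridge"


class BridgeConflict(Exception):
    """The status, headers, and body do not agree on one registered key."""


def _key_from_status(status):
    if status.startswith(STATUS_PREFIX):
        return status[len(STATUS_PREFIX):]
    return None


def _key_from_headers(headers):
    for name, value in headers:
        if name.lower() == "content-type":
            mime, _, params = value.partition(";")
            if mime.strip().lower() != BRIDGE_TYPE:
                return None
            for param in params.split(";"):
                attr, _, val = param.strip().partition("=")
                if attr == "id":
                    return val
            return ""  # bridge content type, but no usable id
    return None


def classify_response(status, headers, body, registry):
    """Return the response key to activate, or None for plain WSGI."""
    status_key = _key_from_status(status)
    header_key = _key_from_headers(headers)
    if status_key is None and header_key is None:
        return None  # ordinary WSGI response
    if status_key != header_key:
        raise BridgeConflict("status and headers disagree")
    if body.decode("latin-1") != status_key:
        raise BridgeConflict("body disagrees with status and headers")
    if status_key not in registry:
        raise BridgeConflict("key was not registered for this request")
    return status_key
```

In either non-error case, the server would then discard the registered handlers it is not going to use.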


### Response Key Details

The key used to distinguish responses MUST be an ASCII "native string"
(as defined by PEP 3333).  It SHOULD also be relatively short, and
MUST contain only those characters that are valid in a MIME "token".
(That is, it may contain any non-space, non-control ASCII character,
except the special characters `(`, `)`, `<`, `>`, `@`, `,`, `;`, `:`,
`\`, `"`, `/`, `[`, `]`, `?`, and `=`.)

Response keys generated for a given API MUST be unique for the
duration of a given request, and MUST be generated in such a way so as
not to collide with keys issued for any *other* API during the same
request.  (e.g., by including the API's name in them.)

Response keys SHOULD also be unique within the lifetime of the process
that generates them, e.g. by including a global counter value.

(So, the simplest way of generating a response key that conforms to
this spec is to just append a global counter to a string uniquely
identifying the chosen API.  However, there is nothing stopping a
server from adding other information, such as a request ID or channel
designator, as an aid to debugging.  Just make sure there's no
whitespace or special characters involved, as mentioned above.)
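A minimal sketch of the "API name plus global counter" approach, together with a check for the MIME "token" character rule, might look like the following.  The helper names and the `name-counter` layout are assumptions; only the character rules come from the text above:

```python
import itertools

# The special characters excluded from a MIME "token", as listed above.
_SPECIALS = set('()<>@,;:\\"/[]?=')

# Global counter, so keys stay unique for the life of the process.
_counter = itertools.count(1)


def is_token(text):
    """True if `text` is non-empty and made only of MIME token characters."""
    return bool(text) and all(
        33 <= ord(ch) <= 126 and ch not in _SPECIALS for ch in text
    )


def new_response_key(api_name):
    """Return a fresh key that embeds the API name and a global counter."""
    key = "%s-%d" % (api_name, next(_counter))
    if not is_token(key):
        raise ValueError("not a valid response key: %r" % (key,))
    return key
```

Embedding the API name keeps keys for different APIs from colliding within a request, and the counter covers uniqueness within the process.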


### Closing and Resource Management

Because the bridging response may have been wrapped by middleware --
e.g. session middleware that saves updated session data on `.close()`,
database connection-pooling middleware that releases connections on
`.close()`, etc. -- the server MUST NOT invoke the WSGI response's
`.close()` method (if any) before the new response handler is
finished, in order to prevent premature resource release.

If the response protocol implements something like websockets, or an
extended HTTP/2 conversation, then the provided API SHOULD provide
some way for the response handler to explicitly ensure that the
response `.close()` method is called, at some point *before* the
conversation is completed and the connection is closed.

These two requirements exist because even if the response *content* is
not altered by middleware, it is still possible for middleware to
attach resource-release handlers to the WSGI response *object*.  If
these are not closed at all, or closed prematurely, it may cause
problems with the underlying web framework.

For example, some web frameworks offer a facility to tie database
transaction scope to request scope, so that when a request is
completely finished, the current transaction is automatically
committed, and a database connection may be returned to a pool.  If
`.close()` were called before the response handler had finished, the
handler might then be in the position of trying to use a connection
that no longer "belonged" to it.

In the simpler, more common case of a single response to a single
request, deferring the `.close()` operation until the entire response
is completed will help to preserve existing framework behavior and
user expectations, so long as the framework is using a
`.close()`-based mechanism to control these other features.

Conversely, in the case where an extended conversation takes place,
the user may wish to signal completion earlier, in order to avoid
hanging on to unnecessary resources.
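One way a server might satisfy both requirements is to wrap the WSGI response's `.close()` in a call-once callback, hand that callback to the response handler, and call it again itself as a fallback once the handler is done.  This is only a sketch: `CloseOnce`, `run_bridged_handler`, and the convention of passing the callback as the handler's only argument are all hypothetical, and a real API would pass its own arguments:

```python
class CloseOnce:
    """Call the WSGI response's close() at most once, if it has one."""

    def __init__(self, wsgi_response):
        self._close = getattr(wsgi_response, "close", None)
        self.closed = False

    def __call__(self):
        if not self.closed:
            self.closed = True
            if self._close is not None:
                self._close()


def run_bridged_handler(wsgi_response, handler):
    """Run `handler`, closing the WSGI response no later than its end."""
    release = CloseOnce(wsgi_response)
    try:
        # The handler may call release() early, e.g. at the start of a
        # long-lived conversation, to free request-scoped resources.
        handler(release)
    finally:
        # Fallback: never close before the handler has finished.
        release()
```

A handler for a short response can simply ignore the callback, while a long-running one can call it as soon as it no longer needs anything tied to the request.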

Of course, if a framework uses some other mechanism to allocate its
connections, scope its transactions, or do other resource management,
then that may impose certain limitations on the user with respect to
what framework features are still usable within a given response
handler.

Web frameworks supporting this spec MUST document what framework
features will be unavailable from within a bridged API response
handler (i.e. after the framework request handler returns a response),
and SHOULD provide alternate ways to access those features from a
response handler.

Further, a framework MAY intercept and wrap registered response
handlers (for APIs whose control flow they understand) in order to
transparently provide these features.  (However, since this has to be
done on an API-by-API basis, it's likely that most framework providers
will only offer this interception feature for a few,
community-standardized APIs.  But they may -- and perhaps already do
-- expose APIs that would let others do the necessary wrapping or
interception themselves.)


## Accessing an API

Now that we have seen both the application and server sides of the
bridging process, we can look at the bridge itself.  Essentially, the
bridging is done by:

1. Retrieving the appropriate upgrade bridge from the environ

2. Invoking that bridge as if it were a WSGI application, passing any
extra arguments required by the specific bridged API (such as a
handler)

3. Returning the bridge's WSGI response, as the WSGI response of the
current app or framework.

Here's an example, using a pure WSGI app and no web framework:

    def my_wsgi_app(environ, start_response):

        foobar_api = environ.get('wsgi.upgrades', {}).get('foobar')

        if foobar_api is None:
            # appropriate error action here
            # i.e. raise something, or return an error response
            raise RuntimeError("foobar API unavailable")

        def my_foobar_handler(foobar_specific_arg, another_foobar_arg, *etc):
            # code here that uses the foobar API to do something cool
            pass

        # Delegate the WSGI response to the foobar API
        return foobar_api(environ, start_response, my_foobar_handler)

However, since most application code *isn't* pure WSGI and *does* use
a framework, here's an example of how Django's `WSGIRequest` class
might implement our previously-illustrated `request.upgrade_to()`
method:

    def upgrade_to(self, api_name, *args, **kw):
        # Requires: from django.http import StreamingHttpResponse

        api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
        if api_bridge is None:
            raise RuntimeError("API unavailable")

        # Capture the bridging response as a Django response:
        response = StreamingHttpResponse()

        def start_response(status, headers, exc_info=None):
            code, reason = status.split(' ', 1)
            response.status_code = int(code)
            response.reason_phrase = reason
            for h, v in headers:
                response[h] = v

        # Pass the handler arguments through to the bridge:
        response.streaming_content = api_bridge(
            self.environ.copy(), start_response, *args, **kw)
        return response

And here's the `webob.Request` version of the same functionality
(which is a lot simpler, since WebOb already provides a way to capture
a WSGI app as a response):

    def upgrade_to(self, api_name, *args, **kw):
        api_bridge = self.environ.get('wsgi.upgrades', {}).get(api_name)
        if api_bridge is None:
            raise RuntimeError("API unavailable")
        return self.send(
            lambda env, s_r: api_bridge(env.copy(), s_r, *args, **kw))

Individual web frameworks can of course decide how best to expose this
functionality to their users, whether via a request or response
method, controller method, special object to return, exception to
raise, or whatever other approach best suits their framework's API
paradigm.

(And of course, as long as the framework provides access to the WSGI
environ, and allows setting every aspect of the WSGI response, an
application developer can implement their own variation of the above,
without any extra assistance from the framework itself.)


## Intercepting, Disabling, or Upgrading API Bridges

Because all API upgrade bridges are contained in a single WSGI
environment key, it is easy for WSGI middleware to disable access to
them when creating subrequests, by simply deleting the entire
`wsgi.upgrades` key before invoking an application.

Likewise, in the event that WSGI middleware wishes to disable one
*specific* API, or intercept it, it can do so by removing or replacing
the appropriate bridge in the upgrades dictionary.
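Both kinds of disabling can be sketched as a small piece of middleware.  The name `without_upgrades` and its `blocked` parameter are hypothetical; the sketch also chooses to work on copies of the environ and the upgrades dictionary, so that the caller's own view of the request is left alone, which is a design choice rather than something this spec requires:

```python
def without_upgrades(app, blocked=None):
    """Wrap `app` so it sees no upgrade bridges, or none of those `blocked`."""

    def wrapped(environ, start_response):
        environ = dict(environ)  # shallow copy for the wrapped app
        if blocked is None:
            # Disable every bridged API, known or unknown.
            environ.pop("wsgi.upgrades", None)
        elif "wsgi.upgrades" in environ:
            # Disable only the named APIs.
            upgrades = dict(environ["wsgi.upgrades"])
            for name in blocked:
                upgrades.pop(name, None)
            environ["wsgi.upgrades"] = upgrades
        return app(environ, start_response)

    return wrapped
```

Intercepting an API works the same way, except that the middleware stores its own bridge under the key instead of removing it.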

Last, but far from least, WSGI middleware can add *new* bridges to the
environment, though it should usually only do so if it implements the
new bridge in terms of a bridge that already exists.  (For example, to
provide a standardized wrapper over a server's native API, or to
emulate one server's API in terms of another server's API.)

These "middleware bridges" should work by delegating the actual
bridging process to the base API, e.g.:

    def api_standardizing_middleware(app):
        def standard_api_bridge(environ, start_response, std_handler):
            def native_handler(*native_args, **native_kw):
                # translate/wrap native args to std args, then pass them on
                # (passed through unchanged here, as a placeholder)
                std_handler(*native_args, **native_kw)
            native_api = environ['wsgi.upgrades']['native_api']
            return native_api(environ, start_response, native_handler)

        def wrapped_app(environ, start_response):
            upgrades = environ.setdefault('wsgi.upgrades', {})
            if 'native_api' in upgrades:
                upgrades['standard_api'] = standard_api_bridge
            return app(environ, start_response)

        return wrapped_app

In this example, we show a piece of middleware that converts some
server's native API (`native_api`) to some Python standard API
(`standard_api`), if the required native API is available at request
time.  It doesn't have to implement any other part of the bridging
specification, since the server's native API bridge will register and
invoke the native response handler (`native_handler`), which in turn
will invoke the "standardized" handler (`std_handler`).

So, all the middleware needs to do is accept handler arguments for the
API it wants to provide, and then register a linked handler with the
native API.  (Apart from the code shown above, everything else is just
whatever is needed to implement the actual API translation.)

This means that if a server exposes whatever its native API is, then
any number of translated, standardized, or simplified versions of that
API can be offered via middleware, without needing to alter the server
itself, or the server's core WSGI implementation.  Instead, those
other APIs can just be implemented via the existing native API bridge.

(Note: The `wsgi.upgrades` dictionary is to be considered volatile in
the same way as the WSGI environment is.  That is, apps or middleware
are allowed to modify or delete its contents freely, so a copy MUST be
saved by middleware if it wishes to access the original values after
it has been passed to another application or middleware.)
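To illustrate the note above: middleware that wants to know afterwards which bridges were on offer needs to take its copy *before* calling the wrapped application.  The `record_offered_upgrades` name and the `log` list are assumptions made for this sketch:

```python
def record_offered_upgrades(app, log):
    """Middleware that logs which upgrade keys were offered to `app`."""

    def wrapped(environ, start_response):
        # Take a private copy first; the app may modify the original.
        offered = dict(environ.get("wsgi.upgrades", {}))
        result = app(environ, start_response)
        # Safe to use the copy here, whatever the app did to the environ.
        log.append(sorted(offered))
        return result

    return wrapped
```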


# Next Steps

Once this specification is stable, the next step is to implement
native server API bridges for existing web servers.  These do not
necessarily need to be provided by the server implementers themselves,
but they do need to be implemented in the server's native API, and
extend its WSGI implementation.

Because it is possible for API bridges to be layered or upgraded by
standard WSGI middleware, it is **not** necessary for servers to
directly support multiple APIs.  Servers can simply expose their
existing API as an API bridge, and let third parties implement
middleware to translate that API to any future standardized APIs.

As soon as even one such native API exists, it is immediately
beneficial for web frameworks to provide support for the bridging API,
and possible for framework users to supply their own.  (WebOb support
would be especially useful, since a significant number of web
frameworks base their request and response objects on WebOb.)

It may also be helpful to publish a reference library for response key
generation and response verification, along with perhaps a wsgiref
update or at least some sample code showing how to modify the wsgiref
request handler flow to initiate a bridge operation.


## Open Questions and Issues

* Transaction and object lifetimes -- is the current spec correct/sufficient?
* What if middleware adds headers but leaves the status and
content-type unchanged?  Should that be an error?  What happens if
middleware requests setting cookies?
* Do the chosen status/headers/body signatures actually make sense?
Do they need to be more tightly specified, or less?
* Are there any major obstacles to sending a special status from major
web frameworks?
* Should a different status be used?
* Are there any other ways to corrupt, confuse, or break this?
* What else am I missing, overlooking, or getting wrong?

## Notes on the Current Design Rationale

* A dictionary is used for all bridged APIs, so they can be easily
disabled for subrequests

* Multiple registrations are allowed, so that middleware invoking
multiple subrequests is unaffected, so long as exactly one
subrequest's response is returned to the top-level WSGI server

* A `Content-Type` header is part of the spec, because most
response-altering middleware should avoid altering content types it
does not understand, thereby increasing the likelihood that the
response will be passed through unchanged


# Acknowledgements

(TBD, but should definitely include Robert Collins for research,
inspiration, and use cases)


# References

TBD

[nghttp2]: http://nghttp2.org/documentation/package_README.html#python-bindings


# Copyright

This document has been placed in the public domain.

