[Python-checkins] r84842 - peps/trunk/pep-0444.txt

Brett Cannon brett at python.org
Thu Sep 16 01:15:33 CEST 2010


Can I just ask why 444 since 392 was the last assigned Python 2 number?

On Wed, Sep 15, 2010 at 15:40, georg.brandl <python-checkins at python.org> wrote:
> Author: georg.brandl
> Date: Thu Sep 16 00:40:38 2010
> New Revision: 84842
>
> Log:
> Add PEP 444, Python Web3 Interface.
>
> Added:
>   peps/trunk/pep-0444.txt   (contents, props changed)
>
> Added: peps/trunk/pep-0444.txt
> ==============================================================================
> --- (empty file)
> +++ peps/trunk/pep-0444.txt     Thu Sep 16 00:40:38 2010
> @@ -0,0 +1,1570 @@
> +PEP: 444
> +Title: Python Web3 Interface
> +Version: $Revision$
> +Last-Modified: $Date$
> +Author: Chris McDonough <chrism at plope.com>,
> +        Armin Ronacher <armin.ronacher at active-4.com>
> +Discussions-To: Python Web-SIG <web-sig at python.org>
> +Status: Draft
> +Type: Informational
> +Content-Type: text/x-rst
> +Created: 19-Jul-2010
> +
> +
> +Abstract
> +========
> +
> +This document specifies a proposed second-generation standard
> +interface between web servers and Python web applications or
> +frameworks.
> +
> +
> +Rationale and Goals
> +===================
> +
> +This protocol and specification is influenced heavily by the Web
> +Server Gateway Interface (WSGI) 1.0 standard described in PEP 333
> +[1]_.  The high-level rationale for having any standard that allows
> +Python-based web servers and applications to interoperate is outlined
> +in PEP 333.  This document essentially uses PEP 333 as a template, and
> +changes its wording in various places for the purpose of forming a
> +different standard.
> +
> +Python currently boasts a wide variety of web application frameworks
> +which use the WSGI 1.0 protocol.  However, due to changes in the
> +language, the WSGI 1.0 protocol is not compatible with Python 3.  This
> +specification describes a standardized WSGI-like protocol that lets
> +Python 2.6, 2.7 and 3.1+ applications communicate with web servers.
> +Web3 is clearly a WSGI derivative; it only uses a different name than
> +"WSGI" in order to indicate that it is not in any way backwards
> +compatible.
> +
> +Applications and servers which are written to this specification are
> +meant to work properly under Python 2.6.X, Python 2.7.X and Python
> +3.1+.  An application or server that implements the Web3
> +specification cannot easily be written to also work under Python 2
> +versions earlier than 2.6 or Python 3 versions earlier than 3.1.
> +
> +.. note::
> +
> +   The *true* minimum Python 3 version is whichever release fixed
> +   http://bugs.python.org/issue4006, so that ``os.environ['foo']``
> +   returns surrogates (as in PEP 383) instead of failing with a
> +   ``KeyError`` when the value of ``'foo'`` cannot be decoded using
> +   the current locale.  In particular, Python 3.0 is not supported.
> +
> +.. note::
> +
> +   Python 2.6 is the first Python version that supported an alias for
> +   ``bytes`` and the ``b"foo"`` literal syntax.  This is why it is the
> +   minimum version supported by Web3.
> +
> +Explicability and documentability are the main technical drivers for
> +the decisions made within the standard.
> +
> +
> +Differences from WSGI
> +=====================
> +
> +- All protocol-specific environment names are prefixed with ``web3.``
> +  rather than ``wsgi.``, e.g. ``web3.input`` rather than
> +  ``wsgi.input``.
> +
> +- All values present as environment dictionary *values* are explicitly
> +  *bytes* instances instead of native strings.  (Environment *keys*
> +  however are native strings, always ``str`` regardless of
> +  platform).
> +
> +- All values returned by an application must be bytes instances,
> +  including status code, header names and values, and the body.
> +
> +- Wherever WSGI 1.0 referred to an ``app_iter``, this specification
> +  refers to a ``body``.
> +
> +- No ``start_response()`` callback (and therefore no ``write()``
> +  callable nor ``exc_info`` data).
> +
> +- The ``readline()`` function of ``web3.input`` must support a size
> +  hint parameter.
> +
> +- The ``read()`` function of ``web3.input`` must be length delimited.
> +  A call without a size argument must not read more than the content
> +  length header specifies.  In case a content length header is absent
> +  the stream must not return anything on read.  It must never request
> +  more data than specified from the client.
> +
> +- No requirement for middleware to yield an empty string if it needs
> +  more information from an application to produce output (e.g. no
> +  "Middleware Handling of Block Boundaries").
> +
> +- Filelike objects passed to a "file_wrapper" must have an
> +  ``__iter__`` which returns bytes (never text).
> +
> +- ``wsgi.file_wrapper`` is not supported.
> +
> +- ``QUERY_STRING``, ``SCRIPT_NAME``, ``PATH_INFO`` values required to
> +  be placed in environ by server (each as the empty bytes instance if
> +  no associated value is received in the HTTP request).
> +
> +- ``web3.path_info`` and ``web3.script_name`` should be put into the
> +  Web3 environment, if possible, by the origin Web3 server.  When
> +  available, each is the original, plain 7-bit ASCII, URL-encoded
> +  variant of its CGI equivalent derived directly from the request URI
> +  (with %2F segment markers and other meta-characters intact).  If the
> +  server cannot provide one (or both) of these values, it must omit
> +  the value(s) it cannot provide from the environment.
> +
> +- This requirement was removed: "middleware components **must not**
> +  block iteration waiting for multiple values from an application
> +  iterable.  If the middleware needs to accumulate more data from the
> +  application before it can produce any output, it **must** yield an
> +  empty string."
> +
> +- ``SERVER_PORT`` must be a bytes instance (not an integer).
> +
> +- The server must not inject an additional ``Content-Length`` header
> +  by guessing the length from the response iterable.  This must be set
> +  by the application itself in all situations.
> +
> +- If the origin server advertises that it has the ``web3.async``
> +  capability, a Web3 application callable used by the server is
> +  permitted to return a callable that accepts no arguments.  When it
> +  does so, this callable is to be called periodically by the origin
> +  server until it returns a non-``None`` response, which must be a
> +  normal Web3 response tuple.
> +
> +  .. XXX (chrism) Needs a section of its own for explanation.
> +
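
A minimal sketch of how a server might drive the ``web3.async`` polling described in the last bullet, written for Python 3. The helper name ``resolve_response`` and the ``poll_interval`` and ``timeout`` parameters are assumptions made for illustration; the draft does not specify how often or how long a server should poll.

```python
import time


def resolve_response(rv, poll_interval=0.01, timeout=5.0):
    """Turn an application result into a (body, status, headers) tuple.

    ``rv`` is either a response tuple or an argumentless callable that
    returns None until the response is ready.  ``poll_interval`` and
    ``timeout`` are hypothetical knobs, not part of the draft.
    """
    if not hasattr(rv, '__call__'):
        # Synchronous response: nothing to poll.
        return rv
    deadline = time.time() + timeout
    while True:
        result = rv()
        if result is not None:
            return result
        if time.time() >= deadline:
            raise RuntimeError('asynchronous response timed out')
        time.sleep(poll_interval)
```

A real server would more likely reschedule the callable on its event loop than sleep in a loop; the blocking version only shows the contract.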
> +
> +Specification Overview
> +======================
> +
> +The Web3 interface has two sides: the "server" or "gateway" side, and
> +the "application" or "framework" side.  The server side invokes a
> +callable object that is provided by the application side.  The
> +specifics of how that object is provided are up to the server or
> +gateway.  It is assumed that some servers or gateways will require an
> +application's deployer to write a short script to create an instance
> +of the server or gateway, and supply it with the application object.
> +Other servers and gateways may use configuration files or other
> +mechanisms to specify where an application object should be imported
> +from, or otherwise obtained.
> +
> +In addition to "pure" servers/gateways and applications/frameworks, it
> +is also possible to create "middleware" components that implement both
> +sides of this specification.  Such components act as an application to
> +their containing server, and as a server to a contained application,
> +and can be used to provide extended APIs, content transformation,
> +navigation, and other useful functions.
> +
> +Throughout this specification, we will use the term "application
> +callable" to mean "a function, a method, or an instance with a
> +``__call__`` method".  It is up to the server, gateway, or application
> +implementing the application callable to choose the appropriate
> +implementation technique for their needs.  Conversely, a server,
> +gateway, or application that is invoking a callable **must not** have
> +any dependency on what kind of callable was provided to it.
> +Application callables are only to be called, not introspected upon.
> +
> +
> +The Application/Framework Side
> +------------------------------
> +
> +The application object is simply a callable object that accepts one
> +argument.  The term "object" should not be misconstrued as requiring
> +an actual object instance: a function, method, or instance with a
> +``__call__`` method are all acceptable for use as an application
> +object.  Application objects must be able to be invoked more than
> +once, as virtually all servers/gateways (other than CGI) will make
> +such repeated requests.  If this cannot be guaranteed by the
> +implementation of the actual application, it has to be wrapped in a
> +function that creates a new instance on each call.
> +
> +.. note::
> +
> +   Although we refer to it as an "application" object, this should not
> +   be construed to mean that application developers will use Web3 as a
> +   web programming API.  It is assumed that application developers
> +   will continue to use existing, high-level framework services to
> +   develop their applications.  Web3 is a tool for framework and
> +   server developers, and is not intended to directly support
> +   application developers.
> +
> +An example of an application which is a function (``simple_app``)::
> +
> +    def simple_app(environ):
> +        """Simplest possible application object"""
> +        status = b'200 OK'
> +        headers = [(b'Content-type', b'text/plain')]
> +        body = [b'Hello world!\n']
> +        return body, status, headers
> +
> +An example of an application which is an instance (``simple_app``)::
> +
> +    class AppClass(object):
> +
> +        """Produce the same output, but using an instance.  An
> +        instance of this class must be instantiated before it is
> +        passed to the server.  """
> +
> +        def __call__(self, environ):
> +            status = b'200 OK'
> +            headers = [(b'Content-type', b'text/plain')]
> +            body = [b'Hello world!\n']
> +            return body, status, headers
> +
> +    simple_app = AppClass()
> +
> +Alternatively, an application callable may return a callable instead
> +of the tuple if the server supports asynchronous execution.  See the
> +description of ``web3.async`` for details.
> +
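
For the asynchronous case, here is a rough application-side sketch. It assumes the result becomes available on the third poll, which is purely illustrative; ``deferred_app`` and the ``state`` counter are hypothetical names, and the draft itself leaves the exact async semantics unspecified.

```python
def deferred_app(environ):
    """Return a polling callable when the server advertises web3.async."""
    status = b'200 OK'
    headers = [(b'Content-type', b'text/plain')]
    body = [b'Hello world!\n']

    if not environ.get('web3.async'):
        # Synchronous server: return the response tuple directly.
        return body, status, headers

    state = {'polls': 0}

    def poll():
        # Pretend the response is only ready on the third poll; until
        # then return None so the server keeps polling.
        state['polls'] += 1
        if state['polls'] < 3:
            return None
        return body, status, headers

    return poll
```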
> +
> +The Server/Gateway Side
> +-----------------------
> +
> +The server or gateway invokes the application callable once for each
> +request it receives from an HTTP client, that is directed at the
> +application.  To illustrate, here is a simple CGI gateway, implemented
> +as a function taking an application object.  Note that this simple
> +example has limited error handling, because by default an uncaught
> +exception will be dumped to ``sys.stderr`` and logged by the web
> +server.
> +
> +::
> +
> +    import locale
> +    import os
> +    import sys
> +
> +    encoding = locale.getpreferredencoding()
> +
> +    stdout = sys.stdout
> +
> +    if hasattr(sys.stdout, 'buffer'):
> +        # Python 3 compatibility; we need to be able to push bytes out
> +        stdout = sys.stdout.buffer
> +
> +    def get_environ():
> +        d = {}
> +        for k, v in os.environ.items():
> +            # Python 3 compatibility
> +            if not isinstance(v, bytes):
> +                # We must explicitly encode the string to bytes under
> +                # Python 3.1+
> +                v = v.encode(encoding, 'surrogateescape')
> +            d[k] = v
> +        return d
> +
> +    def run_with_cgi(application):
> +
> +        environ = get_environ()
> +        # web3.input must yield bytes; use the binary buffer on Python 3
> +        environ['web3.input']        = getattr(sys.stdin, 'buffer', sys.stdin)
> +        environ['web3.errors']       = sys.stderr
> +        environ['web3.version']      = (1, 0)
> +        environ['web3.multithread']  = False
> +        environ['web3.multiprocess'] = True
> +        environ['web3.run_once']     = True
> +        environ['web3.async']        = False
> +
> +        if environ.get('HTTPS', b'off') in (b'on', b'1'):
> +            environ['web3.url_scheme'] = b'https'
> +        else:
> +            environ['web3.url_scheme'] = b'http'
> +
> +        rv = application(environ)
> +        if hasattr(rv, '__call__'):
> +            raise TypeError('This webserver does not support asynchronous '
> +                            'responses.')
> +        body, status, headers = rv
> +
> +        CRLF = b'\r\n'
> +
> +        try:
> +            stdout.write(b'Status: ' + status + CRLF)
> +            for header_name, header_val in headers:
> +                stdout.write(header_name + b': ' + header_val + CRLF)
> +            stdout.write(CRLF)
> +            for chunk in body:
> +                stdout.write(chunk)
> +                stdout.flush()
> +        finally:
> +            if hasattr(body, 'close'):
> +                body.close()
> +
> +
> +Middleware: Components that Play Both Sides
> +-------------------------------------------
> +
> +A single object may play the role of a server with respect to some
> +application(s), while also acting as an application with respect to
> +some server(s).  Such "middleware" components can perform such
> +functions as:
> +
> +* Routing a request to different application objects based on the
> +  target URL, after rewriting the ``environ`` accordingly.
> +
> +* Allowing multiple applications or frameworks to run side-by-side in
> +  the same process.
> +
> +* Load balancing and remote processing, by forwarding requests and
> +  responses over a network.
> +
> +* Performing content postprocessing, such as applying XSL stylesheets.
> +
> +The presence of middleware in general is transparent to both the
> +"server/gateway" and the "application/framework" sides of the
> +interface, and should require no special support.  A user who desires
> +to incorporate middleware into an application simply provides the
> +middleware component to the server, as if it were an application, and
> +configures the middleware component to invoke the application, as if
> +the middleware component were a server.  Of course, the "application"
> +that the middleware wraps may in fact be another middleware component
> +wrapping another application, and so on, creating what is referred to
> +as a "middleware stack".
> +
> +A middleware must support asynchronous execution if possible or fall
> +back to disabling itself.
> +
> +Here is a middleware that changes the ``HTTP_HOST`` key if an ``X-Host``
> +header exists and adds a comment to all HTML responses::
> +
> +    import time
> +
> +    def apply_filter(app, environ, filter_func):
> +        """Helper function that passes the return value from an
> +        application to a filter function when the results are
> +        ready.
> +        """
> +        app_response = app(environ)
> +
> +        # synchronous response, filter now
> +        if not hasattr(app_response, '__call__'):
> +            return filter_func(*app_response)
> +
> +        # asynchronous response.  filter when results are ready
> +        def polling_function():
> +            rv = app_response()
> +            if rv is not None:
> +                return filter_func(*rv)
> +        return polling_function
> +
> +    def proxy_and_timing_support(app):
> +        def new_application(environ):
> +            def filter_func(body, status, headers):
> +                now = time.time()
> +                for key, value in headers:
> +                    if key.lower() == b'content-type' and \
> +                       value.split(b';')[0] == b'text/html':
> +                        # assumes ascii compatible encoding in body,
> +                        # but the middleware should actually parse the
> +                        # content type header and figure out the
> +                        # encoding when doing that.
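> +                        comment = ('<!-- Execution time: %.2fsec -->' %
> +                                   (now - then)).encode('ascii')
> +                        # body is an iterable of bytes chunks, so add
> +                        # the comment as one more chunk
> +                        body = list(body) + [comment]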
> +                        break
> +                return body, status, headers
> +            then = time.time()
> +            host = environ.get('HTTP_X_HOST')
> +            if host is not None:
> +                environ['HTTP_HOST'] = host
> +
> +            # use the apply_filter function that applies a given filter
> +            # function for both async and sync responses.
> +            return apply_filter(app, environ, filter_func)
> +        return new_application
> +
> +    app = proxy_and_timing_support(app)
> +
> +
> +Specification Details
> +=====================
> +
> +The application callable must accept one positional argument.  For the
> +sake of illustration, we have named it ``environ``, but it is not
> +required to have this name.  A server or gateway **must** invoke the
> +application object using a positional (not keyword) argument.
> +(E.g. by calling ``body, status, headers = application(environ)`` as
> +shown above.)
> +
> +The ``environ`` parameter is a dictionary object, containing CGI-style
> +environment variables.  This object **must** be a builtin Python
> +dictionary (*not* a subclass, ``UserDict`` or other dictionary
> +emulation), and the application is allowed to modify the dictionary in
> +any way it desires.  The dictionary must also include certain
> +Web3-required variables (described in a later section), and may also
> +include server-specific extension variables, named according to a
> +convention that will be described below.
> +
> +When called by the server, the application object must return a tuple
> +of three elements: ``body``, ``status`` and ``headers`` (in the order
> +used by the examples above), or, if supported by an async server, an
> +argumentless callable which returns either ``None`` or such a tuple.
> +
> +The ``status`` element is a status in bytes of the form ``b'999
> +Message here'``.
> +
> +``headers`` is a Python list of ``(header_name, header_value)`` pairs
> +describing the HTTP response header.  The ``headers`` structure must
> +be a literal Python list; it must yield two-tuples.  Both
> +``header_name`` and ``header_value`` must be bytes values.
> +
> +The ``body`` is an iterable yielding zero or more bytes instances.
> +This can be accomplished in a variety of ways, such as by returning a
> +list containing bytes instances as ``body``, or by returning a
> +generator as ``body`` that yields bytes instances, or by the
> +``body`` being an instance of a class which is iterable.  Regardless
> +of how it is accomplished, the application object must always return a
> +``body`` iterable yielding zero or more bytes instances.
> +
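
To make the ``body`` options above concrete, a small sketch with two of them: a generator and an instance of an iterable class. The names ``generator_app``, ``Countdown`` and ``countdown_app`` are made up for illustration, and the tuple order follows the examples earlier in the draft.

```python
def generator_app(environ):
    """Application whose body is a generator yielding bytes."""
    def generate():
        yield b'Hello '
        yield b'world!\n'
    return generate(), b'200 OK', [(b'Content-type', b'text/plain')]


class Countdown(object):
    """Iterable body: yields one bytes line per number, counting down."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        for n in range(self.start, 0, -1):
            yield str(n).encode('ascii') + b'\n'


def countdown_app(environ):
    """Application whose body is an instance of an iterable class."""
    return Countdown(3), b'200 OK', [(b'Content-type', b'text/plain')]
```

Either way the server only iterates over ``body`` and never needs to know which kind of object it received.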
> +The server or gateway must transmit the yielded bytes to the client in
> +an unbuffered fashion, completing the transmission of each set of
> +bytes before requesting another one.  (In other words, applications
> +**should** perform their own buffering.  See the `Buffering and
> +Streaming`_ section below for more on how application output must be
> +handled.)
> +
> +The server or gateway should treat the yielded bytes as binary byte
> +sequences: in particular, it should ensure that line endings are not
> +altered.  The application is responsible for ensuring that the
> +string(s) to be written are in a format suitable for the client.  (The
> +server or gateway **may** apply HTTP transfer encodings, or perform
> +other transformations for the purpose of implementing HTTP features
> +such as byte-range transmission.  See `Other HTTP Features`_, below,
> +for more details.)
> +
> +If the ``body`` iterable returned by the application has a ``close()``
> +method, the server or gateway **must** call that method upon
> +completion of the current request, whether the request was completed
> +normally, or terminated early due to an error.  This is to support
> +resource release by the application and is intended to complement PEP
> +325's generator support, and other common iterables with ``close()``
> +methods.
> +
> +Finally, servers and gateways **must not** directly use any other
> +attributes of the ``body`` iterable returned by the application.
> +
> +
> +``environ`` Variables
> +---------------------
> +
> +The ``environ`` dictionary is required to contain various CGI
> +environment variables, as defined by the Common Gateway Interface
> +specification [2]_.
> +
> +The following CGI variables **must** be present.  Each key is a native
> +string.  Each value is a bytes instance.
> +
> +.. note::
> +
> +   In Python 3.1+, a "native string" is a ``str`` type decoded using
> +   the ``surrogateescape`` error handler, as done by
> +   ``os.environ.__getitem__``.  In Python 2.6 and 2.7, a "native
> +   string" is a ``str`` instance representing a set of bytes.
> +
> +``REQUEST_METHOD``
> +  The HTTP request method, such as ``"GET"`` or ``"POST"``.
> +
> +``SCRIPT_NAME``
> +  The initial portion of the request URL's "path" that corresponds to
> +  the application object, so that the application knows its virtual
> +  "location".  This may be the empty bytes instance if the application
> +  corresponds to the "root" of the server.  SCRIPT_NAME will be a
> +  bytes instance representing a sequence of URL-encoded segments
> +  separated by the slash character (``/``).  It is assumed that
> +  ``%2F`` characters will be decoded into literal slash characters
> +  within ``PATH_INFO``, as per CGI.
> +
> +``PATH_INFO``
> +  The remainder of the request URL's "path", designating the virtual
> +  "location" of the request's target within the application.  This
> +  **may** be an empty bytes instance if the request URL targets the
> +  application root and does not have a trailing slash.  PATH_INFO will
> +  be a bytes instance representing a sequence of URL-encoded segments
> +  separated by the slash character (``/``).  It is assumed that
> +  ``%2F`` characters will be decoded into literal slash characters
> +  within ``PATH_INFO``, as per CGI.
> +
> +``QUERY_STRING``
> +  The portion of the request URL (in bytes) that follows the ``"?"``,
> +  if any, or the empty bytes instance.
> +
> +``SERVER_NAME``, ``SERVER_PORT``
> +  When combined with ``SCRIPT_NAME`` and ``PATH_INFO`` (or their raw
> +  equivalents), these variables can be used to complete the URL.
> +  Note, however, that ``HTTP_HOST``, if present, should be used in
> +  preference to ``SERVER_NAME`` for reconstructing the request URL.
> +  See the `URL Reconstruction`_ section below for more detail.
> +  ``SERVER_PORT`` should be a bytes instance, not an integer.
> +
> +``SERVER_PROTOCOL``
> +  The version of the protocol the client used to send the request.
> +  Typically this will be something like ``"HTTP/1.0"`` or
> +  ``"HTTP/1.1"`` and may be used by the application to determine how
> +  to treat any HTTP request headers.  (This variable should probably
> +  be called ``REQUEST_PROTOCOL``, since it denotes the protocol used
> +  in the request, and is not necessarily the protocol that will be
> +  used in the server's response.  However, for compatibility with CGI
> +  we have to keep the existing name.)
> +
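
As a rough illustration of how the variables above combine into a URL, here is a Python 3 sketch that works on bytes throughout. ``reconstruct_url`` and ``_encoded_path`` are hypothetical helpers; preferring ``web3.script_name`` and ``web3.path_info`` when present, and re-quoting the decoded CGI values otherwise, is my reading of the draft rather than something it spells out.

```python
from urllib.parse import quote_from_bytes


def _encoded_path(environ, raw_key, cgi_key):
    """Prefer the raw URL-encoded value; else re-quote the CGI value."""
    raw = environ.get(raw_key)
    if raw is not None:
        return raw
    return quote_from_bytes(environ[cgi_key]).encode('ascii')


def reconstruct_url(environ):
    """Rebuild the request URL as bytes from a Web3 environ."""
    scheme = environ['web3.url_scheme']
    url = scheme + b'://'
    host = environ.get('HTTP_HOST')
    if host:
        # HTTP_HOST takes precedence over SERVER_NAME when present.
        url += host
    else:
        url += environ['SERVER_NAME']
        default_port = b'443' if scheme == b'https' else b'80'
        if environ['SERVER_PORT'] != default_port:
            url += b':' + environ['SERVER_PORT']
    url += _encoded_path(environ, 'web3.script_name', 'SCRIPT_NAME')
    url += _encoded_path(environ, 'web3.path_info', 'PATH_INFO')
    if environ.get('QUERY_STRING'):
        url += b'?' + environ['QUERY_STRING']
    return url
```

Note that re-quoting a decoded ``PATH_INFO`` cannot recover an original ``%2F``, which is exactly why the raw variants are useful.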
> +The following CGI values **may** be present in the Web3 environment.
> +Each key is a native string.  Each value is a bytes instance.
> +
> +``CONTENT_TYPE``
> +  The contents of any ``Content-Type`` fields in the HTTP request.
> +
> +``CONTENT_LENGTH``
> +  The contents of any ``Content-Length`` fields in the HTTP request.
> +
> +``HTTP_`` Variables
> +  Variables corresponding to the client-supplied HTTP request headers
> +  (i.e., variables whose names begin with ``"HTTP_"``).  The presence
> +  or absence of these variables should correspond with the presence or
> +  absence of the appropriate HTTP header in the request.
> +
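
A short sketch of how a server could map request headers onto these keys, assuming the usual CGI naming convention (upper-case, dashes to underscores, ``HTTP_`` prefix except for ``Content-Type`` and ``Content-Length``). The function names are hypothetical; keys come out as native strings and values stay bytes.

```python
def header_to_environ_key(header_name):
    """Map a header name given as bytes to its CGI-style environ key."""
    key = header_name.decode('ascii').upper().replace('-', '_')
    if key in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
        # These two are CGI variables in their own right: no prefix.
        return key
    return 'HTTP_' + key


def add_request_headers(environ, headers):
    """Store (name, value) bytes pairs in ``environ`` and return it."""
    for name, value in headers:
        environ[header_to_environ_key(name)] = value
    return environ
```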
> +A server or gateway **should** attempt to provide as many other CGI
> +variables as are applicable, each with a string for its key and a
> +bytes instance for its value.  In addition, if SSL is in use, the
> +server or gateway **should** also provide as many of the Apache SSL
> +environment variables [5]_ as are applicable, such as ``HTTPS=on`` and
> +``SSL_PROTOCOL``.  Note, however, that an application that uses any
> +CGI variables other than the ones listed above is necessarily
> +non-portable to web servers that do not support the relevant
> +extensions.  (For example, web servers that do not publish files will
> +not be able to provide a meaningful ``DOCUMENT_ROOT`` or
> +``PATH_TRANSLATED``.)
> +
> +A Web3-compliant server or gateway **should** document what variables
> +it provides, along with their definitions as appropriate.
> +Applications **should** check for the presence of any variables they
> +require, and have a fallback plan in the event such a variable is
> +absent.
> +
> +Note that CGI variable *values* must be bytes instances, if they are
> +present at all.  It is a violation of this specification for a CGI
> +variable's value to be of any type other than ``bytes``.  On Python 2,
> +this means they will be of type ``str``.  On Python 3, this means they
> +will be of type ``bytes``.
> +
> +The *keys* of all CGI and non-CGI variables in the environ, however,
> +must be "native strings" (on both Python 2 and Python 3, they will be
> +of type ``str``).
> +
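
The key and value typing rules could be checked with something like the following Python 3 sketch. ``check_environ`` is a hypothetical helper, and it makes one loud assumption: any key containing a dot (``web3.*`` or a server-defined name) is treated as an extension variable whose value need not be bytes, while every other key is treated as a CGI variable.

```python
def check_environ(environ):
    """Return a list of typing problems in a Web3 environ (empty if none)."""
    problems = []
    if type(environ) is not dict:
        # The draft requires a builtin dict, not a subclass or emulation.
        problems.append('environ must be a builtin dict')
        return problems
    for key, value in environ.items():
        if not isinstance(key, str):
            problems.append('key %r is not a native string' % (key,))
        elif '.' not in key and not isinstance(value, bytes):
            # Assumption: dot-free keys are CGI variables.
            problems.append('value of %r is not bytes' % (key,))
    return problems
```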
> +In addition to the CGI-defined variables, the ``environ`` dictionary
> +**may** also contain arbitrary operating-system "environment
> +variables", and **must** contain the following Web3-defined variables.
> +
> +=====================  ===============================================
> +Variable               Value
> +=====================  ===============================================
> +``web3.version``       The tuple ``(1, 0)``, representing Web3
> +                       version 1.0.
> +
> +``web3.url_scheme``    A bytes value representing the "scheme" portion of
> +                       the URL at which the application is being
> +                       invoked.  Normally, this will have the value
> +                       ``b"http"`` or ``b"https"``, as appropriate.
> +
> +``web3.input``         An input stream (file-like object) from which bytes
> +                       constituting the HTTP request body can be read.
> +                       (The server or gateway may perform reads
> +                       on-demand as requested by the application, or
> +                       it may pre-read the client's request body and
> +                       buffer it in-memory or on disk, or use any
> +                       other technique for providing such an input
> +                       stream, according to its preference.)
> +
> +``web3.errors``        An output stream (file-like object) to which error
> +                       output text can be written, for the purpose of
> +                       recording program or other errors in a
> +                       standardized and possibly centralized location.
> +                       This should be a "text mode" stream; i.e.,
> +                       applications should use ``"\n"`` as a line
> +                       ending, and assume that it will be converted to
> +                       the correct line ending by the server/gateway.
> +                       Applications may *not* send bytes to the
> +                       'write' method of this stream; they may only
> +                       send text.
> +
> +                       For many servers, ``web3.errors`` will be the
> +                       server's main error log. Alternatively, this
> +                       may be ``sys.stderr``, or a log file of some
> +                       sort.  The server's documentation should
> +                       include an explanation of how to configure this
> +                       or where to find the recorded output.  A server
> +                       or gateway may supply different error streams
> +                       to different applications, if this is desired.
> +
> +``web3.multithread``   This value should evaluate true if the
> +                       application object may be simultaneously
> +                       invoked by another thread in the same process,
> +                       and should evaluate false otherwise.
> +
> +``web3.multiprocess``  This value should evaluate true if an
> +                       equivalent application object may be
> +                       simultaneously invoked by another process, and
> +                       should evaluate false otherwise.
> +
> +``web3.run_once``      This value should evaluate true if the server
> +                       or gateway expects (but does not guarantee!)
> +                       that the application will only be invoked this
> +                       one time during the life of its containing
> +                       process.  Normally, this will only be true for
> +                       a gateway based on CGI (or something similar).
> +
> +``web3.script_name``   The non-URL-decoded ``SCRIPT_NAME`` value.
> +                       Through a historical inequity, by virtue of the
> +                       CGI specification, ``SCRIPT_NAME`` is present
> +                       within the environment as an already
> +                       URL-decoded string.  This is the original
> +                       URL-encoded value derived from the request URI.
> +                       If the server cannot provide this value, it
> +                       must omit it from the environ.
> +
> +``web3.path_info``     The non-URL-decoded ``PATH_INFO`` value.
> +                       Through a historical inequity, by virtue of the
> +                       CGI specification, ``PATH_INFO`` is present
> +                       within the environment as an already
> +                       URL-decoded string.  This is the original
> +                       URL-encoded value derived from the request URI.
> +                       If the server cannot provide this value, it
> +                       must omit it from the environ.
> +
> +``web3.async``         This is ``True`` if the webserver supports
> +                       async invocation.  In that case an application
> +                       is allowed to return a callable instead of a
> +                       tuple with the response.  The exact semantics
> +                       are not specified by this specification.
> +
> +=====================  ===============================================
> +
> +Finally, the ``environ`` dictionary may also contain server-defined
> +variables.  These variables should have names which are native
> +strings, composed of only lower-case letters, numbers, dots, and
> +underscores, and should be prefixed with a name that is unique to the
> +defining server or gateway.  For example, ``mod_web3`` might define
> +variables with names like ``mod_web3.some_variable``.
> +
> +
> +Input Stream
> +~~~~~~~~~~~~
> +
> +The input stream (``web3.input``) provided by the server must support
> +the following methods:
> +
> +=====================  ========
> +Method                 Notes
> +=====================  ========
> +``read(size)``         1,4
> +``readline([size])``   1,2,4
> +``readlines([size])``  1,3,4
> +``__iter__()``         4
> +=====================  ========
> +
> +The semantics of each method are as documented in the Python Library
> +Reference, except for these notes as listed in the table above:
> +
> +1. The server is not required to read past the client's specified
> +   ``Content-Length``, and is allowed to simulate an end-of-file
> +   condition if the application attempts to read past that point.  The
> +   application **should not** attempt to read more data than is
> +   specified by the ``CONTENT_LENGTH`` variable.
> +
> +2. The implementation must support the optional ``size`` argument to
> +   ``readline()``.
> +
> +3. The application is free to not supply a ``size`` argument to
> +   ``readlines()``, and the server or gateway is free to ignore the
> +   value of any supplied ``size`` argument.
> +
> +4. The ``read``, ``readline`` and ``__iter__`` methods must return a
> +   bytes instance.  The ``readlines`` method must return a sequence
> +   which contains instances of bytes.
> +
> +The methods listed in the table above **must** be supported by all
> +servers conforming to this specification.  Applications conforming to
> +this specification **must not** use any other methods or attributes of
> +the ``input`` object.  In particular, applications **must not**
> +attempt to close this stream, even if it possesses a ``close()``
> +method.
> +
> +The input stream should silently ignore attempts to read more than the
> +content length of the request.  If no content length is specified the
> +stream must be a dummy stream that does not return anything.
> +
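A minimal sketch of how an application might honor note 1 above, assuming ``CONTENT_LENGTH`` (when present) is a bytes value holding a decimal integer.  The helper name ``read_body`` is hypothetical and not part of the specification:

```python
import io


def read_body(environ):
    """Read at most CONTENT_LENGTH bytes from web3.input."""
    raw_length = environ.get('CONTENT_LENGTH', b'')
    try:
        length = int(raw_length)
    except ValueError:
        # Missing or malformed length: treat the body as empty.
        length = 0
    if length <= 0:
        return b''
    # Never ask for more than the client said it would send.
    return environ['web3.input'].read(length)


# Stand-in environ for illustration; a real server supplies these keys.
environ = {'CONTENT_LENGTH': b'5', 'web3.input': io.BytesIO(b'hello world')}
body = read_body(environ)
```

The helper only calls ``read(size)``, which is one of the methods every conforming input stream must support.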
> +
> +Error Stream
> +~~~~~~~~~~~~
> +
> +The error stream (``web3.errors``) provided by the server must support
> +the following methods:
> +
> +===================   ==========  ========
> +Method                Stream      Notes
> +===================   ==========  ========
> +``flush()``           ``errors``  1
> +``write(str)``        ``errors``  2
> +``writelines(seq)``   ``errors``  2
> +===================   ==========  ========
> +
> +The semantics of each method are as documented in the Python Library
> +Reference, except for these notes as listed in the table above:
> +
> +1. Since the ``errors`` stream may not be rewound, servers and
> +   gateways are free to forward write operations immediately, without
> +   buffering.  In this case, the ``flush()`` method may be a no-op.
> +   Portable applications, however, cannot assume that output is
> +   unbuffered or that ``flush()`` is a no-op.  They must call
> +   ``flush()`` if they need to ensure that output has in fact been
> +   written.  (For example, to minimize intermingling of data from
> +   multiple processes writing to the same error log.)
> +
> +2. The ``write()`` method must accept a string argument, but needn't
> +   necessarily accept a bytes argument.  The ``writelines()`` method
> +   must accept a sequence argument that consists entirely of strings,
> +   but needn't necessarily accept any bytes instance as a member of
> +   the sequence.
> +
> +The methods listed in the table above **must** be supported by all
> +servers conforming to this specification.  Applications conforming to
> +this specification **must not** use any other methods or attributes of
> +the ``errors`` object.  In particular, applications **must not**
> +attempt to close this stream, even if it possesses a ``close()``
> +method.
> +
> +
> +Values Returned by A Web3 Application
> +-------------------------------------
> +
> +Web3 applications return an iterable in the form (``status``,
> +``headers``, ``body``).  The return value can be any iterable type
> +that returns exactly three values.  If the server supports
> +asynchronous applications (``web3.async``), the response may be a
> +callable object (which accepts no arguments).
> +
> +The ``status`` value is assumed by a gateway or server to be an HTTP
> +"status" bytes instance like ``b'200 OK'`` or ``b'404 Not Found'``.
> +That is, it is a string consisting of a Status-Code and a
> +Reason-Phrase, in that order and separated by a single space, with no
> +surrounding whitespace or other characters.  (See RFC 2616, Section
> +6.1.1 for more information.)  The string **must not** contain control
> +characters, and must not be terminated with a carriage return,
> +linefeed, or combination thereof.
> +
> +The ``headers`` value is assumed by a gateway or server to be a
> +literal Python list of ``(header_name, header_value)`` tuples.  Each
> +``header_name`` must be a bytes instance representing a valid HTTP
> +header field-name (as defined by RFC 2616, Section 4.2), without a
> +trailing colon or other punctuation.  Each ``header_value`` must be a
> +bytes instance and **must not** include any control characters,
> +including carriage returns or linefeeds, either embedded or at the
> +end.  (These requirements are to minimize the complexity of any
> +parsing that must be performed by servers, gateways, and intermediate
> +response processors that need to inspect or modify response headers.)
> +
> +In general, the server or gateway is responsible for ensuring that
> +correct headers are sent to the client: if the application omits a
> +header required by HTTP (or other relevant specifications that are in
> +effect), the server or gateway **must** add it.  For example, the HTTP
> +``Date:`` and ``Server:`` headers would normally be supplied by the
> +server or gateway.  However, the gateway must not override a header
> +of the same name that the application itself has emitted.
> +
> +(A reminder for server/gateway authors: HTTP header names are
> +case-insensitive, so be sure to take that into consideration when
> +examining application-supplied headers!)
> +
> +Applications and middleware are forbidden from using HTTP/1.1
> +"hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
> +or any headers that would affect the persistence of the client's
> +connection to the web server.  These features are the exclusive
> +province of the actual web server, and a server or gateway **should**
> +consider it a fatal error for an application to attempt sending them,
> +and raise an error if they are supplied as return values from an
> +application in the ``headers`` structure.  (For more specifics on
> +"hop-by-hop" features and headers, please see the `Other HTTP
> +Features`_ section below.)
> +
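To illustrate the return value described above, here is a minimal sketch of a Web3 application.  The name ``application`` and the body text are arbitrary; only the shape of the returned triple (bytes status, list of bytes header pairs, iterable body) is taken from the text:

```python
def application(environ):
    """Return (status, headers, body), with every value as bytes."""
    body = b'Hello, Web3!\n'
    status = b'200 OK'
    headers = [
        (b'Content-Type', b'text/plain'),
        # Header values must be bytes, so encode the length explicitly.
        (b'Content-Length', str(len(body)).encode('ascii')),
    ]
    return status, headers, [body]


status, headers, body = application({})
```

Note that no hop-by-hop header such as ``Connection`` appears in the list; those are left to the server.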
> +
> +Dealing with Compatibility Across Python Versions
> +-------------------------------------------------
> +
> +Creating Web3 code that runs under both Python 2.6/2.7 and Python 3.1+
> +requires some care on the part of the developer.  In general, the Web3
> +specification assumes a certain level of equivalence between the
> +Python 2 ``str`` type and the Python 3 ``bytes`` type.  For example,
> +under Python 2, the values present in the Web3 ``environ`` will be
> +instances of the ``str`` type; in Python 3, these will be instances of
> +the ``bytes`` type.  The Python 3 ``bytes`` type does not possess all
> +the methods of the Python 2 ``str`` type, and some methods which it
> +does possess behave differently than the Python 2 ``str`` type.
> +Effectively, to ensure that Web3 middleware and applications work
> +across Python versions, developers must do these things:
> +
> +#) Do not assume comparison equivalence between text values and bytes
> +   values.  If you do so, your code may work under Python 2, but it
> +   will not work properly under Python 3.  For example, don't write
> +   ``somebytes == 'abc'``.  This will sometimes be true on Python 2
> +   but it will never be true on Python 3, because a sequence of bytes
> +   never compares equal to a string under Python 3.  Instead, always
> +   compare a bytes value with a bytes value, e.g.
> +   ``somebytes == b'abc'``.  Code which does this is compatible with
> +   and works the same in Python 2.6, 2.7, and 3.1.  The ``b`` in front
> +   of ``'abc'`` signals to Python 3 that the value is a literal bytes
> +   instance; under Python 2 it's a forward compatibility placebo.
> +
> +#) Don't use the ``__contains__`` method (directly or indirectly) of
> +   items that are meant to be byteslike without ensuring that its
> +   argument is also a bytes instance.  If you do so, your code may
> +   work under Python 2, but it will not work properly under Python 3.
> +   For example, ``'abc' in somebytes`` will raise a ``TypeError``
> +   under Python 3, but it will evaluate to ``True`` or ``False``
> +   under Python 2.6 and 2.7.  However, ``b'abc' in somebytes`` will
> +   work the same on both versions.
> +
> +#) ``__getitem__`` should not be used.
> +
> +   .. XXX
> +
> +#) Don't try to use the ``format`` method or the ``__mod__`` method of
> +   instances of bytes (directly or indirectly).  In Python 2, the
> +   ``str`` type, which we treat as equivalent to Python 3's ``bytes``,
> +   supports these methods, but actual Python 3 ``bytes`` instances
> +   do not.  If you use these methods, your code will work under
> +   Python 2, but not under Python 3.  In Python 3.2, this restriction
> +   may be partially removed, as it's rumored that bytes types may
> +   obtain a ``__mod__`` implementation.
> +
> +#) Do not try to concatenate a bytes value with a string value.  This
> +   may work under Python 2, but it will not work under Python 3.  For
> +   example, doing ``'abc' + somebytes`` will work under Python 2, but
> +   it will result in a ``TypeError`` under Python 3.  Instead, always
> +   make sure you're concatenating two items of the same type,
> +   e.g. ``b'abc' + somebytes``.
> +
> +Web3 expects byte values in other places, such as in all the values
> +returned by an application.
> +
> +In short, to ensure compatibility of Web3 application code between
> +Python 2 and Python 3, in Python 2, treat CGI and server variable
> +values in the environment as if they had the Python 3 ``bytes`` API
> +even though they actually have a more capable API.  Likewise for all
> +stringlike values returned by a Web3 application.
> +
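The rules above can be sketched in code that behaves the same on Python 2.6/2.7 and Python 3.  The function names ``is_json_request`` and ``describe`` are hypothetical, and the environ keys used are ordinary CGI-style variables assumed to hold bytes values:

```python
def is_json_request(environ):
    """Compare and test containment using bytes literals only."""
    content_type = environ.get('CONTENT_TYPE', b'')
    return content_type == b'application/json' or b'json' in content_type


def describe(environ):
    """Build a bytes value without format() or the % operator."""
    method = environ.get('REQUEST_METHOD', b'GET')
    path = environ.get('PATH_INFO', b'/')
    # Joining and concatenating bytes with bytes works on both versions.
    return b' '.join([method, path])
```

Every literal that touches an environ value carries the ``b`` prefix, which is what keeps the comparisons, the containment test, and the join valid under Python 3.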
> +
> +Buffering and Streaming
> +-----------------------
> +
> +Generally speaking, applications will achieve the best throughput by
> +buffering their (modestly-sized) output and sending it all at once.
> +This is a common approach in existing frameworks: the output is
> +buffered in a StringIO or similar object, then transmitted all at
> +once, along with the response headers.
> +
> +The corresponding approach in Web3 is for the application to simply
> +return a single-element ``body`` iterable (such as a list) containing
> +the response body as a single bytes instance.  This is the recommended
> +approach for the vast majority of application functions that render
> +HTML pages whose text easily fits in memory.
> +
> +For large files, however, or for specialized uses of HTTP streaming
> +(such as multipart "server push"), an application may need to provide
> +output in smaller blocks (e.g. to avoid loading a large file into
> +memory).  It's also sometimes the case that part of a response may be
> +time-consuming to produce, but it would be useful to send ahead the
> +portion of the response that precedes it.
> +
> +In these cases, applications will usually return a ``body`` iterator
> +(often a generator-iterator) that produces the output in a
> +block-by-block fashion.  These blocks may be broken to coincide with
> +multipart boundaries (for "server push"), or just before
> +time-consuming tasks (such as reading another block of an on-disk
> +file).
> +
> +Web3 servers, gateways, and middleware **must not** delay the
> +transmission of any block; they **must** either fully transmit the
> +block to the client, or guarantee that they will continue transmission
> +even while the application is producing its next block.  A
> +server/gateway or middleware may provide this guarantee in one of
> +three ways:
> +
> +1. Send the entire block to the operating system (and request that any
> +   O/S buffers be flushed) before returning control to the
> +   application, OR
> +
> +2. Use a different thread to ensure that the block continues to be
> +   transmitted while the application produces the next block, OR
> +
> +3. (Middleware only) Send the entire block to its parent
> +   gateway/server.
> +
> +By providing this guarantee, Web3 allows applications to ensure that
> +transmission will not become stalled at an arbitrary point in their
> +output data.  This is critical for proper functioning of
> +e.g. multipart "server push" streaming, where data between multipart
> +boundaries should be transmitted in full to the client.
> +
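A minimal sketch of the block-by-block approach described above, using a generator as the ``body`` iterator.  The helper name ``iter_blocks``, the 8192-byte block size, and the in-memory ``BytesIO`` stand-in for an on-disk file are all assumptions made for illustration:

```python
import io


def iter_blocks(fileobj, block_size=8192):
    """Yield the contents of fileobj as a series of bytes blocks."""
    while True:
        block = fileobj.read(block_size)
        if not block:
            break
        # Each yielded block is handed to the server before the next read.
        yield block


def application(environ):
    # An in-memory stand-in for a large on-disk file.
    source = io.BytesIO(b'x' * 20000)
    headers = [(b'Content-Type', b'application/octet-stream')]
    return b'200 OK', headers, iter_blocks(source)


status, headers, body = application({})
block_sizes = [len(block) for block in body]
```

Because the server must not delay a block, each ``yield`` marks a point at which everything produced so far is on its way to the client.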
> +
> +Unicode Issues
> +--------------
> +
> +HTTP does not directly support Unicode, and neither does this
> +interface.  All encoding/decoding must be handled by the
> +**application**; all values passed to or from the server must be of
> +the Python 3 type ``bytes`` or instances of the Python 2 type ``str``,
> +not Python 2 ``unicode`` or Python 3 ``str`` objects.
> +
> +All "bytes instances" referred to in this specification **must**:
> +
> +- On Python 2, be of type ``str``.
> +
> +- On Python 3, be of type ``bytes``.
> +
> +All "bytes instances" **must not**:
> +
> +- On Python 2, be of type ``unicode``.
> +
> +- On Python 3, be of type ``str``.
> +
> +The result of using a textlike object where a byteslike object is
> +required is undefined.
> +
> +Values returned from a Web3 app as a status or as response headers
> +**must** follow RFC 2616 with respect to encoding.  That is, the bytes
> +returned must contain a character stream of ISO-8859-1 characters, or
> +the character stream should use RFC 2047 MIME encoding.
> +
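As a sketch of the encoding requirement above, an application holding header text might convert it to bytes explicitly before returning it.  The helper name ``encode_header`` is hypothetical; it covers only the plain ISO-8859-1 case and does not attempt RFC 2047 encoding:

```python
def encode_header(name, value):
    """Encode a text header pair to bytes.

    Raises UnicodeEncodeError if either part contains a character
    that ISO-8859-1 cannot represent.
    """
    return name.encode('iso-8859-1'), value.encode('iso-8859-1')


header = encode_header(u'X-Author', u'Ren\xe9')
```

Letting the encode step raise, rather than substituting replacement characters, keeps the decision about unrepresentable text with the application.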
> +On Python platforms which do not have a native bytes-like type
> +(e.g. IronPython, etc.), but instead which generally use textlike
> +strings to represent bytes data, the definition of "bytes instance"
> +can be changed: their "bytes instances" must be native strings that
> +contain only code points representable in ISO-8859-1 encoding
> +(``\u0000`` through ``\u00FF``, inclusive).  It is a fatal error for
> +an application on such a platform to supply strings containing any
> +other Unicode character or code point.  Similarly, servers and
> +gateways on those platforms **must not** supply strings to an
> +application containing any other Unicode characters.
> +
> +.. XXX (armin: Jython now has a bytes type, we might remove this
> +   section after seeing about IronPython)
> +
> +
> +HTTP 1.1 Expect/Continue
> +------------------------
> +
> +Servers and gateways that implement HTTP 1.1 **must** provide
> +transparent support for HTTP 1.1's "expect/continue" mechanism.  This
> +may be done in any of several ways:
> +
> +1. Respond to requests containing an ``Expect: 100-continue`` request
> +   with an immediate "100 Continue" response, and proceed normally.
> +
> +2. Proceed with the request normally, but provide the application with
> +   a ``web3.input`` stream that will send the "100 Continue" response
> +   if/when the application first attempts to read from the input
> +   stream.  The read request must then remain blocked until the client
> +   responds.
> +
> +3. Wait until the client decides that the server does not support
> +   expect/continue, and sends the request body on its own.  (This is
> +   suboptimal, and is not recommended.)
> +
> +Note that these behavior restrictions do not apply for HTTP 1.0
> +requests, or for requests that are not directed to an application
> +object.  For more information on HTTP 1.1 Expect/Continue, see RFC
> +2616, sections 8.2.3 and 10.1.1.
> +
> +
> +Other HTTP Features
> +-------------------
> +
> +In general, servers and gateways should "play dumb" and allow the
> +application complete control over its output.  They should only make
> +changes that do not alter the effective semantics of the application's
> +response.  It is always possible for the application developer to add
> +middleware components to supply additional features, so server/gateway
> +developers should be conservative in their implementation.  In a
> +sense, a server should consider itself to be like an HTTP "gateway
> +server", with the application being an HTTP "origin server".  (See RFC
> +2616, section 1.3, for the definition of these terms.)
> +
> +However, because Web3 servers and applications do not communicate via
> +HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to Web3
> +internal communications.  Web3 applications **must not** generate any
> +"hop-by-hop" headers [4]_, attempt to use HTTP features that would
> +require them to generate such headers, or rely on the content of any
> +incoming "hop-by-hop" headers in the ``environ`` dictionary.  Web3
> +servers **must** handle any supported inbound "hop-by-hop" headers on
> +their own, such as by decoding any inbound ``Transfer-Encoding``,
> +including chunked encoding if applicable.
> +
> +Applying these principles to a variety of HTTP features, it should be
> +clear that a server **may** handle cache validation via the
> +``If-None-Match`` and ``If-Modified-Since`` request headers and the
> +``Last-Modified`` and ``ETag`` response headers.  However, it is not
> +required to do this, and the application **should** perform its own
> +cache validation if it wants to support that feature, since the
> +server/gateway is not required to do such validation.
> +
> +Similarly, a server **may** re-encode or transport-encode an
> +application's response, but the application **should** use a suitable
> +content encoding on its own, and **must not** apply a transport
> +encoding.  A server **may** transmit byte ranges of the application's
> +response if requested by the client, and the application doesn't
> +natively support byte ranges.  Again, however, the application
> +**should** perform this function on its own if desired.
> +
> +Note that these restrictions on applications do not necessarily mean
> +that every application must reimplement every HTTP feature; many HTTP
> +features can be partially or fully implemented by middleware
> +components, thus freeing both server and application authors from
> +implementing the same features over and over again.
> +
> +
> +Thread Support
> +--------------
> +
> +Thread support, or lack thereof, is also server-dependent.  Servers
> +that can run multiple requests in parallel **should** also provide
> +the option of running an application in a single-threaded fashion, so
> +that applications or frameworks that are not thread-safe may still be
> +used with that server.
> +
> +
> +Implementation/Application Notes
> +================================
> +
> +Server Extension APIs
> +---------------------
> +
> +Some server authors may wish to expose more advanced APIs that
> +application or framework authors can use for specialized purposes.
> +For example, a gateway based on ``mod_python`` might wish to expose
> +part of the Apache API as a Web3 extension.
> +
> +In the simplest case, this requires nothing more than defining an
> +``environ`` variable, such as ``mod_python.some_api``.  But, in many
> +cases, the possible presence of middleware can make this difficult.
> +For example, an API that offers access to the same HTTP headers that
> +are found in ``environ`` variables might return different data if
> +``environ`` has been modified by middleware.
> +
> +In general, any extension API that duplicates, supplants, or bypasses
> +some portion of Web3 functionality runs the risk of being incompatible
> +with middleware components.  Server/gateway developers should *not*
> +assume that nobody will use middleware, because some framework
> +developers specifically organize their frameworks to function almost
> +entirely as middleware of various kinds.
> +
> +So, to provide maximum compatibility, servers and gateways that
> +provide extension APIs that replace some Web3 functionality **must**
> +design those APIs so that they are invoked using the portion of the
> +API that they replace.  For example, an extension API to access HTTP
> +request headers must require the application to pass in its current
> +``environ``, so that the server/gateway may verify that HTTP headers
> +accessible via the API have not been altered by middleware.  If the
> +extension API cannot guarantee that it will always agree with
> +``environ`` about the contents of HTTP headers, it must refuse service
> +to the application, e.g. by raising an error, returning ``None``
> +instead of a header collection, or whatever is appropriate to the API.
> +
> +These guidelines also apply to middleware that adds information such
> +as parsed cookies, form variables, sessions, and the like to
> +``environ``.  Specifically, such middleware should provide these
> +features as functions which operate on ``environ``, rather than simply
> +stuffing values into ``environ``.  This helps ensure that information
> +is calculated from ``environ`` *after* any middleware has done any URL
> +rewrites or other ``environ`` modifications.
> +
> +It is very important that these "safe extension" rules be followed by
> +both server/gateway and middleware developers, in order to avoid a
> +future in which middleware developers are forced to delete any and all
> +extension APIs from ``environ`` to ensure that their mediation isn't
> +being bypassed by applications using those extensions!
> +
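A minimal sketch of the "function which operates on ``environ``" pattern recommended above.  The name ``get_cookies`` is hypothetical, and the parsing is deliberately simplistic; the point is only that the result is computed from the current ``environ`` on every call instead of being stored in it once:

```python
def get_cookies(environ):
    """Parse the Cookie header from the environ as it is right now."""
    cookies = {}
    for pair in environ.get('HTTP_COOKIE', b'').split(b';'):
        name, sep, value = pair.strip().partition(b'=')
        if sep:
            cookies[name] = value
    return cookies


environ = {'HTTP_COOKIE': b'a=1; b=2'}
before = get_cookies(environ)

# Simulate another middleware rewriting the header afterwards.
environ['HTTP_COOKIE'] = b'a=3'
after = get_cookies(environ)
```

Because nothing is cached in ``environ``, a later rewrite of the header by other middleware is reflected the next time the function is called.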
> +
> +Application Configuration
> +-------------------------
> +
> +This specification does not define how a server selects or obtains an
> +application to invoke.  These and other configuration options are
> +highly server-specific matters.  It is expected that server/gateway
> +authors will document how to configure the server to execute a
> +particular application object, and with what options (such as
> +threading options).
> +
> +Framework authors, on the other hand, should document how to create an
> +application object that wraps their framework's functionality.  The
> +user, who has chosen both the server and the application framework,
> +must connect the two together.  However, since both the framework and
> +the server have a common interface, this should be merely a mechanical
> +matter, rather than a significant engineering effort for each new
> +server/framework pair.
> +
> +Finally, some applications, frameworks, and middleware may wish to use
> +the ``environ`` dictionary to receive simple string configuration
> +options.  Servers and gateways **should** support this by allowing an
> +application's deployer to specify name-value pairs to be placed in
> +``environ``.  In the simplest case, this support can consist merely of
> +copying all operating system-supplied environment variables from
> +``os.environ`` into the ``environ`` dictionary, since the deployer in
> +principle can configure these externally to the server, or in the CGI
> +case they may be able to be set via the server's configuration files.
> +
> +Applications **should** try to keep such required variables to a
> +minimum, since not all servers will support easy configuration of
> +them.  Of course, even in the worst case, persons deploying an
> +application can create a script to supply the necessary configuration
> +values::
> +
> +   from the_app import application
> +
> +   def new_app(environ):
> +       environ['the_app.configval1'] = b'something'
> +       return application(environ)
> +
> +But, most existing applications and frameworks will probably only need
> +a single configuration value from ``environ``, to indicate the
> +location of their application or framework-specific configuration
> +file(s).  (Of course, applications should cache such configuration, to
> +avoid having to re-read it upon each invocation.)
> +
> +
> +URL Reconstruction
> +------------------
> +
> +If an application wishes to reconstruct a request's complete URL (as a
> +bytes object), it may do so using the following algorithm::
> +
> +    # url_quote is not defined by this specification; it is assumed
> +    # here to be a helper that percent-encodes bytes and returns bytes.
> +    host = environ.get('HTTP_HOST')
> +
> +    scheme = environ['web3.url_scheme']
> +    port = environ['SERVER_PORT']
> +    query = environ['QUERY_STRING']
> +
> +    url = scheme + b'://'
> +
> +    if host:
> +        url += host
> +    else:
> +        url += environ['SERVER_NAME']
> +
> +        # Append the port only when it is not the scheme's default.
> +        if scheme == b'https':
> +            if port != b'443':
> +                url += b':' + port
> +        else:
> +            if port != b'80':
> +                url += b':' + port
> +
> +    if 'web3.script_name' in environ:
> +        url += url_quote(environ['web3.script_name'])
> +    else:
> +        url += environ['SCRIPT_NAME']
> +    if 'web3.path_info' in environ:
> +        url += url_quote(environ['web3.path_info'])
> +    else:
> +        url += environ['PATH_INFO']
> +    if query:
> +        url += b'?' + query
> +
> +Note that such a reconstructed URL may not be precisely the same URI
> +as requested by the client.  Server rewrite rules, for example, may
> +have modified the client's originally requested URL to place it in a
> +canonical form.
> +
> +
> +Open Questions
> +==============
> +
> +- ``file_wrapper`` replacement.  Currently nothing is specified here,
> +  but it's clear that the old system of in-band signalling is broken
> +  if it gives middleware in the processing chain no way to determine
> +  whether the response is a file wrapper.
> +
> +
> +Points of Contention
> +====================
> +
> +Outlined below are potential points of contention regarding this
> +specification.
> +
> +
> +WSGI 1.0 Compatibility
> +----------------------
> +
> +Components written using the WSGI 1.0 specification will not
> +transparently interoperate with components written using this
> +specification.  That's because the goals of this proposal and the
> +goals of WSGI 1.0 are not directly aligned.
> +
> +WSGI 1.0 is obliged to provide specification-level backwards
> +compatibility with versions of Python between 2.2 and 2.7.  This
> +specification, however, ditches Python 2.5 and lower compatibility in
> +order to provide compatibility between relatively recent versions of
> +Python 2 (2.6 and 2.7) as well as relatively recent versions of Python
> +3 (3.1).
> +
> +It is currently impossible to write components which work reliably
> +under both Python 2 and Python 3 using the WSGI 1.0 specification,
> +because the specification implicitly posits that CGI and server
> +variable values in the environ and values returned via
> +``start_response`` represent a sequence of bytes that can be addressed
> +using the Python 2 string API.  It posits such a thing because that
> +sort of data type was the sensible way to represent bytes in all
> +Python 2 versions, and WSGI 1.0 was conceived before Python 3 existed.
> +
> +Python 3's ``str`` type supports the full API provided by the Python 2
> +``str`` type, but Python 3's ``str`` type does not represent a
> +sequence of bytes; it instead represents text.  Therefore, using it to
> +represent environ values also requires that the environ byte sequence
> +be decoded to text via some encoding.  We cannot decode these bytes to
> +text (at least in any way where the decoding has any meaning other
> +than as a tunnelling mechanism) without widening the scope of WSGI to
> +include server and gateway knowledge of decoding policies and
> +mechanics.  WSGI 1.0 never concerned itself with encoding and
> +decoding.  It made statements about allowable transport values, and
> +suggested that various values might be best decoded as one encoding or
> +another, but it never required a server to *perform* any decoding
> +before
> +
> +Python 3 does not have a stringlike type that can be used instead to
> +represent bytes: it has a ``bytes`` type.  In Python 3.1+, the
> +``bytes`` type operates quite a bit like a Python 2 ``str``, but it
> +lacks behavior equivalent to ``str.__mod__``, and its iteration
> +protocol, containment, sequence treatment, and equivalence
> +comparisons are different.
> +
> +In either case, there is no type in Python 3 that behaves just like
> +the Python 2 ``str`` type, and a way to create such a type doesn't
> +exist because there is no such thing as a "String ABC" which would
> +allow a suitable type to be built.  Due to this design
> +incompatibility, existing WSGI 1.0 servers, middleware, and
> +applications will not work under Python 3, even after they are run
> +through ``2to3``.
> +
> +Existing Web-SIG discussions about updating the WSGI specification so
> +that it is possible to write a WSGI application that runs in both
> +Python 2 and Python 3 tend to revolve around creating a
> +specification-level equivalence between the Python 2 ``str`` type
> +(which represents a sequence of bytes) and the Python 3 ``str`` type
> +(which represents text).  Such an equivalence becomes strained in
> +various areas, given the different roles of these types.  An arguably
> +more straightforward equivalence exists between the Python 3 ``bytes``
> +type API and a subset of the Python 2 ``str`` type API.  This
> +specification exploits this subset equivalence.
> +
> +In the meantime, aside from any Python 2 vs. Python 3 compatibility
> +issue, as various discussions on Web-SIG have pointed out, the WSGI
> +1.0 specification is too general, providing support (via ``.write``)
> +for asynchronous applications at the expense of implementation
> +complexity.  This specification uses the fundamental incompatibility
> +between WSGI 1.0 and Python 3 as a natural divergence point to create
> +a specification with reduced complexity by changing specialized
> +support for asynchronous applications.
> +
> +To provide backwards compatibility for older WSGI 1.0 applications, it
> +is presumed that Web3 middleware will be created which can be used "in
> +front" of existing WSGI 1.0 applications, allowing them to run under a
> +Web3 stack.  Under Python 3, this middleware will need to draw an
> +equivalence between Python 3 ``str`` types and the bytes values
> +represented by the HTTP request, with all the attendant
> +encoding-guessing (or configuration) that implies.
> +
> +.. note::
> +
> +   Such middleware *might* in the future, instead of drawing an
> +   equivalence between Python 3 ``str`` and HTTP byte values, make use
> +   of a yet-to-be-created "ebytes" type (aka "bytes-with-benefits"),
> +   particularly if a String ABC proposal is accepted into the Python
> +   core and implemented.
> +
> +Conversely, it is presumed that WSGI 1.0 middleware will be created
> +which will allow a Web3 application to run behind a WSGI 1.0 stack on
> +the Python 2 platform.
> +
> +
> +Environ and Response Values as Bytes
> +------------------------------------
> +
> +Casual middleware and application writers may consider the use of
> +bytes as environment values and response values inconvenient.  In
> +particular, they won't be able to use common string formatting
> +functions such as ``('%s' % bytes_val)`` or
> +``bytes_val.format('123')`` because bytes don't have the same API as
> +strings on platforms such as Python 3 where the two types differ.
> +Likewise, on such platforms, stdlib HTTP-related API support for using
> +bytes interchangeably with text can be spotty.  In places where bytes
> +are inconvenient or incompatible with library APIs, middleware and
> +application writers will have to decode such bytes to text explicitly.
> +This is particularly inconvenient for middleware writers: to work with
> +environment values as strings, they'll have to decode them from an
> +implied encoding and if they need to mutate an environ value, they'll
> +then need to encode the value into a byte stream before placing it
> +into the environ.  While the use of bytes by the specification as
> +environ values might be inconvenient for casual developers, it
> +provides several benefits.
> +
> +Using bytes types to represent HTTP and server values to an
> +application most closely matches reality because HTTP is fundamentally
> +a bytes-oriented protocol.  If the environ values are mandated to be
> +strings, each server will need to use heuristics to guess about the
> +encoding of various values provided by the HTTP environment.  Using
> +all strings might increase casual middleware writer convenience, but
> +will also lead to ambiguity and confusion when a value cannot be
> +decoded to a meaningful non-surrogate string.
> +
> +Using bytes as environ values means the specification never needs to
> +mandate that a participating server be informed of encoding
> +configuration parameters.  If environ values are treated
> +as strings, and so must be decoded from bytes, configuration
> +parameters may eventually become necessary as policy clues from the
> +application deployer.  Such a policy would be used to guess an
> +appropriate decoding strategy in various circumstances, effectively
> +placing the burden for enforcing a particular application encoding
> +policy upon the server.  If the server must serve more than one
> +application, such configuration would quickly become complex.  Many
> +policies would also be impossible to express declaratively.
> +
> +In reality, HTTP is a complicated and legacy-fraught protocol which
> +requires a complex set of heuristics to make sense of. It would be
> +nice if we could allow this protocol to protect us from this
> +complexity, but we cannot do so reliably while still providing to
> +application writers a level of control commensurate with reality.
> +Python applications must often deal with data embedded in the
> +environment which not only must be parsed by legacy heuristics, but
> +*does not conform even to any existing HTTP specification*.  While
> +these eventualities are unpleasant, they crop up with regularity,
> +making it impossible and undesirable to hide them from application
> +developers, as application developers are the only people who are able
> +to decide upon an appropriate action when an HTTP specification
> +violation is detected.
> +
> +Some have argued for mixed use of bytes and string values as environ
> +*values*.  This proposal avoids that strategy.  Sole use of bytes as
> +environ values makes it possible to fit this specification entirely in
> +one's head; you won't need to guess about which values are strings and
> +which are bytes.
> +
> +This protocol would also fit in a developer's head if all environ
> +values were strings, but this specification doesn't use that strategy.
> +This will likely be the point of greatest contention regarding the use
> +of bytes.  In defense of bytes: developers often prefer protocols with
> +consistent contracts, even if the contracts themselves are suboptimal.
> +If we hide encoding issues from a developer until a value that
> +contains surrogates causes problems after it has already reached
> +beyond the I/O boundary of their application, they will need to do a
> +lot more work to fix assumptions made by their application than if we
> +were to just present the problem much earlier in terms of "here's some
> +bytes, you decode them".  This is also a counter-argument to the
> +"bytes are inconvenient" assumption: while presenting bytes to an
> +application developer may be inconvenient for a casual application
> +developer who doesn't care about edge cases, they are extremely
> +convenient for the application developer who needs to deal with
> +complex, dirty eventualities, because use of bytes allows him the
> +appropriate level of control with a clear separation of
> +responsibility.
> +
> +If the protocol uses bytes, it is presumed that libraries will be
> +created to make working with bytes-only in the environ and within
> +return values more pleasant; for example, analogues of the WSGI 1.0
> +libraries named "WebOb" and "Werkzeug".  Such libraries will fill the
> +gap between convenience and control, allowing the spec to remain
> +simple and regular while still allowing casual authors a convenient
> +way to create Web3 middleware and application components.  This seems
> +to be a reasonable alternative to baking encoding policy into the
> +protocol, because many such libraries can be created independently
> +from the protocol, and application developers can choose the one that
> +provides them the appropriate levels of control and convenience for a
> +particular job.
> +
> +Here are some alternatives to using all bytes:
> +
> +- Have the server decode all values representing CGI and server
> +  environ values into strings using the ``latin-1`` encoding, which is
> +  lossless.  Smuggle any undecodable bytes within the resulting
> +  string.
> +
> +- Decode all CGI and server environ values to strings using the
> +  ``utf-8`` encoding with the ``surrogateescape`` error handler.  This
> +  does not work under any existing Python 2.
> +
> +- Represent some values as bytes and other values as strings, as
> +  decided by their typical usages.
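> +
> +The first two alternatives above can be illustrated with a short
> +Python 3 sketch.  The sample byte strings are made up; the point is
> +only that both decodings round-trip arbitrary bytes, while leaving the
> +application to undo or work around the server's choice.
> +

```python
raw = b"/caf\xe9/\xff"  # deliberately not valid UTF-8

# Alternative 1: latin-1 maps each byte to exactly one code point,
# so decoding never fails and encoding restores the original bytes.
as_latin1 = raw.decode("latin-1")
latin1_round_trip = as_latin1.encode("latin-1")

# The cost: values that really were UTF-8 come out as mojibake and the
# application has to re-encode and decode them itself.
mojibake = b"caf\xc3\xa9".decode("latin-1")
repaired = mojibake.encode("latin-1").decode("utf-8")

# Alternative 2: utf-8 with surrogateescape (Python 3 only) smuggles
# each undecodable byte as a lone surrogate code point.
as_utf8 = raw.decode("utf-8", "surrogateescape")
utf8_round_trip = as_utf8.encode("utf-8", "surrogateescape")

# The smuggled surrogates fail as soon as the string is encoded strictly,
# which may happen far from the place where the value entered the program.
try:
    as_utf8.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```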
> +
> +
> +Applications Should be Allowed to Read ``web3.input`` Past ``CONTENT_LENGTH``
> +-----------------------------------------------------------------------------
> +
> +At [6]_, Graham Dumpleton makes the assertion that ``wsgi.input``
> +should be required to return the empty string as a signifier of
> +out-of-data, and that applications should be allowed to read past the
> +number of bytes specified in ``CONTENT_LENGTH``, depending only upon
> +the empty string as an EOF marker.  WSGI relies on an application
> +"being well behaved and once all data specified by ``CONTENT_LENGTH``
> +is read, that it processes the data and returns any response. That
> +same socket connection could then be used for a subsequent request."
> +Graham would like WSGI adapters to be required to wrap raw socket
> +connections: "this wrapper object will need to count how much data has
> +been read, and when the amount of data reaches that as defined by
> +``CONTENT_LENGTH``, any subsequent reads should return an empty string
> +instead."  This may be useful to support chunked encoding and input
> +filters.
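> +
> +A rough sketch of the kind of counting wrapper described in the
> +quotation.  The class name ``LimitedInput`` is hypothetical, an
> +in-memory stream stands in for the socket, and end-of-data is signalled
> +with empty bytes rather than an empty string on the assumption that
> +the input stream yields bytes.
> +

```python
import io


class LimitedInput:
    """Hypothetical wrapper that stops reads at CONTENT_LENGTH bytes."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def read(self, size=-1):
        # Once the declared body has been consumed, always report EOF.
        if self._remaining <= 0:
            return b""
        # A missing or oversized request is clamped to what is left.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


# The bytes after the 11-byte body belong to the next request on the
# same connection and must not be consumed by the application.
connection = io.BytesIO(b"hello worldGET /next HTTP/1.1\r\n")
body = LimitedInput(connection, 11)
first = body.read(5)
rest = body.read()
eof = body.read(100)
leftover = connection.read()
```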
> +
> +
> +``web3.input`` Unknown Length
> +-----------------------------
> +
> +There's no documented way to indicate that there is content in
> +``environ['web3.input']``, but the content length is unknown.
> +
> +
> +``read()`` of ``web3.input`` Should Support No-Size Calling Convention
> +----------------------------------------------------------------------
> +
> +At [6]_, Graham Dumpleton makes the assertion that the ``read()``
> +method of ``wsgi.input`` should be callable without arguments, and
> +that the result should be "all available request content".  Needs
> +discussion.
> +
> +Comment Armin: I changed the spec to require that from an
> +implementation.  I had too much pain with that in the past already.
> +Open for discussions though.
> +
> +
> +Input Filters should set environ ``CONTENT_LENGTH`` to -1
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +At [6]_, Graham Dumpleton suggests that an input filter might set
> +``environ['CONTENT_LENGTH']`` to -1 to indicate that it mutated the
> +input.
> +
> +
> +``headers`` as Literal List of Two-Tuples
> +-----------------------------------------
> +
> +Why do we make applications return a ``headers`` structure that is a
> +literal list of two-tuples?  I think the iterability of ``headers``
> +needs to be maintained while it moves up the stack, but I don't think
> +we need to be able to mutate it in place at all times.  Could we
> +loosen that requirement?
> +
> +Comment Armin: Strong yes
> +
> +
> +Removed Requirement that Middleware Not Block
> +---------------------------------------------
> +
> +This requirement was removed: "middleware components **must not**
> +block iteration waiting for multiple values from an application
> +iterable.  If the middleware needs to accumulate more data from the
> +application before it can produce any output, it **must** yield an
> +empty string."  This requirement existed to support asynchronous
> +applications and servers (see PEP 333's "Middleware Handling of Block
> +Boundaries").  Asynchronous applications are now serviced explicitly
> +by the ``web3.async``-capable protocol (a Web3 application callable
> +may itself return a callable).
> +
> +
> +``web3.script_name`` and ``web3.path_info``
> +-------------------------------------------
> +
> +These values are required to be placed into the environment by an
> +origin server under this specification.  Unlike ``SCRIPT_NAME`` and
> +``PATH_INFO``, these must be the original *URL-encoded* variants
> +derived from the request URI.  We probably need to figure out how
> +these should be computed originally, and what their values should be
> +if the server performs URL rewriting.
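> +
> +One reason the URL-encoded variants matter can be shown with a small
> +sketch.  The sample path is invented, and this is not a statement of
> +how a server should compute the values; it only shows what is lost
> +once a path has been decoded.
> +

```python
from urllib.parse import unquote_to_bytes

# Hypothetical raw request path: "%2F" is an encoded slash that is part
# of a segment name, not a segment separator.
raw_path = b"/files/a%2Fb/report%20final.txt"

# Decoding the whole path first makes the encoded slash indistinguishable
# from a real separator.
decoded_path = unquote_to_bytes(raw_path)
decoded_segments = decoded_path.split(b"/")

# Splitting the raw path first and decoding each segment keeps it intact.
raw_segments = [unquote_to_bytes(part) for part in raw_path.split(b"/")]
```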
> +
> +
> +Long Response Headers
> +---------------------
> +
> +Bob Brewer notes on Web-SIG [7]_:
> +
> +    Each header_value must not include any control characters,
> +    including carriage returns or linefeeds, either embedded or at the
> +    end.  (These requirements are to minimize the complexity of any
> +    parsing that must be performed by servers, gateways, and
> +    intermediate response processors that need to inspect or modify
> +    response headers.) [1]_
> +
> +That's understandable, but HTTP headers are defined as (mostly)
> +\*TEXT, and "words of \*TEXT MAY contain characters from character
> +sets other than ISO-8859-1 only when encoded according to the rules of
> +RFC 2047."  [2]_ And RFC 2047 specifies that "an 'encoded-word' may
> +not be more than 75 characters long...  If it is desirable to encode
> +more text than will fit in an 'encoded-word' of 75 characters,
> +multiple 'encoded-word's (separated by CRLF SPACE) may be used." [3]_
> +This satisfies HTTP header folding rules, as well: "Header fields can
> +be extended over multiple lines by preceding each extra line with at
> +least one SP or HT." [1]_
> +
> +So in my reading of HTTP, some code somewhere should introduce
> +newlines in longish, encoded response header values.  I see three
> +options:
> +
> +1. Keep things as they are and disallow response header values if they
> +   contain words over 75 chars that are outside the ISO-8859-1
> +   character set.
> +
> +2. Allow newline characters in WSGI response headers.
> +
> +3. Require/strongly suggest WSGI servers to do the encoding and
> +   folding before sending the value over HTTP.
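> +
> +As a hedged illustration of what option 3 might involve, the standard
> +library's ``email.header`` module can produce RFC 2047 encoded-words
> +and fold them across lines.  The helper name ``fold_header_value``
> +and the sample value are made up, and nothing here implies that a
> +server is required to use this module.
> +

```python
from email.header import Header, decode_header, make_header


def fold_header_value(text, charset="utf-8"):
    """Hypothetical helper: RFC 2047-encode and fold a long header value."""
    # maxlinelen asks the library to keep each output line within 75
    # characters; continuation lines begin with a single space.
    header = Header(text, charset=charset, maxlinelen=75)
    return header.encode(linesep="\r\n")


long_value = " ".join(["R\u00e9sum\u00e9"] * 20)  # non-ASCII, well over 75 chars
folded = fold_header_value(long_value)
lines = folded.split("\r\n")

# A receiver can reassemble the original text from the folded form.
restored = str(make_header(decode_header(folded)))
```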
> +
> +
> +Request Trailers and Chunked Transfer Encoding
> +----------------------------------------------
> +
> +When using chunked transfer encoding on request content, the RFCs
> +allow there to be request trailers.  These are like request headers
> +but come after the final null data chunk.  These trailers are only
> +available when the chunked data stream is finite length and when it
> +has all been read in.  Neither WSGI nor Web3 currently supports them.
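> +
> +To make the wire format concrete, here is a simplified sketch of
> +reading a chunked body followed by trailers.  The function name
> +``read_chunked`` and the ``X-Checksum`` trailer are invented for the
> +example; chunk extensions are ignored and there is no error handling.
> +

```python
import io


def read_chunked(stream):
    """Hypothetical reader: return (body, trailers) from a chunked stream."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        # The chunk size is hexadecimal; anything after ';' is an extension.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # the zero-length chunk ends the data
        body += stream.read(size)
        stream.readline()  # consume the CRLF that follows the chunk data
    trailers = []
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # a blank line (or EOF) ends the trailer section
        name, _, value = line.partition(b":")
        trailers.append((name.strip(), value.strip()))
    return bytes(body), trailers


wire = io.BytesIO(
    b"5\r\nhello\r\n"
    b"6\r\n world\r\n"
    b"0\r\n"
    b"X-Checksum: abc123\r\n"
    b"\r\n"
)
body, trailers = read_chunked(wire)
```

> +
> +The trailers only become available after the whole body has been
> +consumed, which is why they cannot be delivered in the environ at the
> +time the application is first called.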
> +
> +.. XXX (armin) yield from application iterator should be specified as
> +   write plus flush by server.
> +
> +.. XXX (armin) websocket API.
> +
> +
> +References
> +==========
> +
> +.. [1] PEP 333: Python Web Services Gateway Interface
> +   (http://www.python.org/dev/peps/pep-0333/)
> +
> +.. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft
> +   (http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt)
> +
> +.. [3] "Chunked Transfer Coding" -- HTTP/1.1, section 3.6.1
> +   (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1)
> +
> +.. [4] "End-to-end and Hop-by-hop Headers" -- HTTP/1.1, Section 13.5.1
> +   (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1)
> +
> +.. [5] mod_ssl Reference, "Environment Variables"
> +   (http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25)
> +
> +.. [6] Details on WSGI 1.0 amendments/clarifications.
> +   (http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html)
> +
> +.. [7] [Web-SIG] WSGI and long response header values
> +   (http://mail.python.org/pipermail/web-sig/2006-September/002244.html)
> +
> +Copyright
> +=========
> +
> +This document has been placed in the public domain.
> +
> +
> +
> +..
> +   Local Variables:
> +   mode: indented-text
> +   indent-tabs-mode: nil
> +   sentence-end-double-space: t
> +   fill-column: 70
> +   coding: utf-8
> +   End: