[Web-SIG] Request for Comments on upcoming WSGI Changes

Tue Sep 22 01:09:52 CEST 2009

On Mon, Sep 21, 2009 at 03:26:35PM -0700, Robert Brewer wrote:
> It looks simpler until you have a site that is not primarily utf-8. In
> that case, you multiply your (1 line * number of middlewares in the WSGI
> stack * each request).
> With wsgi.uri_encoding you get either (1 line * 1
> middleware designed to transcode * each request), or even 0 if your
> whole site uses just one charset.

I am not sure I understand your point.

The 0 lines hold true if the whole site is using latin-1 or utf-8 and
you write your applications/middlewares only for this site. But if it's
using any other encoding you still have to transcode.

def middleware(start_response, environ):
    value = environ['some_key'].\
        encode('utf8', 'surrogateescape').\
        decode(SITE_ENCODING)
    ...

With wsgi.uri_encoding you would still have to do the following:

def middleware(start_response, environ):
    value = environ['some_key'].\
        encode(environ['some_key.encoding']).\
        decode(SITE_ENCODING)
    ...

Of course you can directly use `environ['some_key']` if you know you'll
get the 'right' encoding all the time. But when the encoding changes,
you'll have to fix all your middlewares.

I am missing something?

-- 
  Henry Prêcheur