[Web-SIG] Request for Comments on upcoming WSGI Changes

James Bennett ubernostrum at gmail.com
Mon Sep 21 08:03:48 CEST 2009


On Sun, Sep 20, 2009 at 11:25 PM, Chris McDonough <chrism at plope.com> wrote:
> WSGI is a fairly low-level protocol aimed at folks who need to interface a
> server to the outside world.  The outside world (by its nature) talks bytes.
>  I fear that any implied conversion of environment values and iterable
> return values to Unicode will actually eventually make things harder than
> they are now.  I realize that it would make middleware implementors' lives
> harder to need to deal in bytes.  However, at this point, I also believe
> that middleware kinda should be hard.  We have way too much middleware that
> shouldn't be middleware these days (some written by myself).

Well, ordinarily I'd be inclined to agree: HTTP deals in bytes, so an
interface to HTTP should deal in bytes as well.

The problem, really, is that despite being a very low-level interface,
WSGI has a tendency to leak up into much higher-level code, and (IMO)
authors of that high-level code really shouldn't have to waste their
time dealing with details of the underlying low-level gateway.

You've said you don't want to hear "Python 3" as the reason, but it
provides some useful examples: in high-level code you'll commonly want
to be doing things like, say, comparing parts of the requested URL
path to known strings or patterns. And that high-level code will
almost certainly use strings, while WSGI, in theory, will be using
bytes. That's just a recipe for disaster; if WSGI mandates bytes, then
bytes will have to start "infecting" much higher-level code (since
Python 3 -- rightly -- doesn't let you be nearly as promiscuous about
mixing bytes and strings).
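
To make that concrete, here is a rough sketch of what I mean. The path
and prefix values are made up for illustration, and the latin-1 decode
at the end is only an assumption for the example, not a claim about
what WSGI should pick:

```python
# Hypothetical values; nothing here comes from a WSGI spec draft.
path_bytes = b"/blog/2009/"   # what a bytes-only gateway would hand over
prefix = "/blog/"             # what high-level code would naturally write

# In Python 3, bytes and str never compare equal, even with
# identical ASCII content.
same = (path_bytes == "/blog/2009/")

# Mixing the two types in an operation raises TypeError.
try:
    path_bytes.startswith(prefix)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding once at the boundary lets the high-level code keep
# working with ordinary strings.
path_text = path_bytes.decode("latin-1")  # encoding is an assumption
matches = path_text.startswith(prefix)
```

The comparison quietly comes out unequal rather than raising, which is
the sort of thing that is easy to miss until a URL fails to route.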

Once I'm at a point where I can use Python 3, I know I'll personally
be looking for some library which will normalize everything for me
before I interact with it, precisely to avoid this sort of leakage; if
WSGI itself would at least *allow* that normalization to happen at the
low level (mandating it is another discussion entirely) I'd feel much
happier about it going forward.
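
As a sketch of the kind of normalization I have in mind (the function
name normalize_environ is hypothetical, and the default encoding is an
assumption for illustration only; which encoding is right for which
key is exactly what would need to be discussed):

```python
def normalize_environ(environ, encoding="latin-1"):
    """Return a copy of environ with bytes values decoded to str.

    Hypothetical helper: non-bytes values (tuples, file-like objects,
    and so on) are passed through unchanged.
    """
    return {
        key: value.decode(encoding) if isinstance(value, bytes) else value
        for key, value in environ.items()
    }


# Made-up environ as a bytes-only gateway might supply it.
raw = {
    "REQUEST_METHOD": b"GET",
    "PATH_INFO": b"/blog/2009/",
    "wsgi.version": (1, 0),
}
clean = normalize_environ(raw)
```

Code above this layer would then only ever see strings, e.g.
clean["PATH_INFO"].startswith("/blog/"), and the original dictionary
is left untouched.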


-- 
"Bureaucrat Conrad, you are technically correct -- the best kind of correct."

