[Web-SIG] Re: Bill's comments on WSGI draft 1.4

Bill Janssen janssen at parc.com
Thu Sep 2 21:47:03 CEST 2004


I think we need some terminology that I don't remember seeing.  There
are two sides to WSGI, the server side, which I'll call the "socket",
and the framework side, which I'll call the "plug".  If there are
other terms already in use, please let me know.

Let me ask first, has anyone written a "socket" layer for Medusa?

> >1.  The "environ" parameter must be a Python dict: I think subclasses
> >should be allowed.
> [...various reasons why this might be a bad idea are introduced...]
> These are "practicality beats purity" arguments, so I need to see some 
> *practical* applications of dictionary subclasses that would be useful 
> enough to outweigh both of the above issues.

Phillip, these are good engineering reasons for socket developers not
to use subclasses, but that restriction doesn't belong in WSGI.  They
may have other reasons for using subclasses that we haven't thought of
(perhaps because they're using these dicts for additional purposes
besides WSGI), and they should be allowed to use them.  You don't want
to try to fix things out of scope of this work.
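
To make the distinction concrete, here is a minimal sketch of the two ways a
"plug" could test the environ it receives. The class TracingEnviron and the
two check functions are hypothetical names invented for illustration; nothing
here comes from the WSGI draft itself.

```python
class TracingEnviron(dict):
    """Hypothetical dict subclass a server might use to record key lookups."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reads = []

    def __getitem__(self, key):
        # Remember which keys the application asked for, then behave like dict.
        self.reads.append(key)
        return super().__getitem__(key)


def strict_check(environ):
    # Accepts only the exact built-in dict type; subclasses are rejected.
    return type(environ) is dict


def lenient_check(environ):
    # Accepts dict and any subclass of it.
    return isinstance(environ, dict)


env = TracingEnviron(REQUEST_METHOD="GET", PATH_INFO="/")
method = env["REQUEST_METHOD"]
```

Under the strict rule the tracing environ is rejected even though it behaves
like a dict for every lookup the application performs.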

> Because 'file' has a 'fileno' attribute, 'isinstance(f,file)' implies 
> 'hasattr(f,"fileno")'.  Therefore, the latter is the preferred behavior 
> here, because it doesn't unnecessarily exclude other valid wrappers of file 
> descriptors.

I'm not familiar with all the ins and outs of files on Python and
Jython and IronPython, so I'll just say, reasonable enough.  Though
I'd prefer to say, a file-like object (whatever that means).
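
For what it's worth, a duck-typed test along these lines might look like the
sketch below. It is written against current Python, where in-memory streams
such as io.BytesIO do have a fileno attribute but raise when it is called, so
the sketch calls fileno() rather than relying on hasattr() alone. The names
usable_fileno and FdWrapper are hypothetical.

```python
import io
import tempfile


def usable_fileno(obj):
    """Return obj's file descriptor, or None if it does not really have one."""
    fileno = getattr(obj, "fileno", None)
    if fileno is None:
        return None
    try:
        return fileno()
    except (OSError, ValueError):
        # io.UnsupportedOperation derives from both OSError and ValueError.
        return None


class FdWrapper:
    """Hypothetical wrapper: not a file type, but it exposes fileno()."""

    def __init__(self, f):
        self._f = f

    def fileno(self):
        return self._f.fileno()


with tempfile.TemporaryFile() as real_file:
    wrapped_fd = usable_fileno(FdWrapper(real_file))
    real_fd = real_file.fileno()

memory_fd = usable_fileno(io.BytesIO(b"data"))
```

An isinstance test against the file type would reject FdWrapper, while the
duck-typed test accepts it and still turns away the in-memory stream.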

> These restrictions are intended to simplify servers and middleware; nobody 
> has yet presented an example of a scenario where this imposed any practical 
> limitation.

Here's a scenario for you: I want to return a valid HTTP header that
your WSGI layer doesn't allow!  For example, accented Latin-1
characters, which are valid in the Reason-Phrase.  Or for another
example, a multi-line header value, which I actually use quite a bit,
and which is perfectly valid in HTTP, and which your prohibition on
control characters in header values breaks.

> The fallback position would be that the status string and headers must not 
> be CR or CRLF terminated.

The fallback position would be fine.
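
As a rough sketch of what that fallback rule would check (the function name
is_terminated is made up for this example): only a trailing CR or CRLF is
rejected, and line breaks inside the value are left alone.

```python
def is_terminated(value):
    """True if the string ends with a CR or a CRLF."""
    return value.endswith(("\r", "\r\n"))


status_ok = is_terminated("200 OK")
status_bad = is_terminated("200 OK\r\n")
folded_ok = is_terminated("first line\r\n second line")
```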

> Are you aware of any 
> applications that currently fold their headers, or transmit ISO-8859-1 
> characters without using the encoding prescribed by RFC 2047?  Is there a 
> practical use case for either one?

Whether or not our limited group currently knows of such a case is
immaterial.  This is an overly restrictive limitation with nothing,
I'm afraid, but religion for its justification.  Aside from clueless
implementors (against which the gods themselves strive in vain), why
would allowing any valid header value be a problem?

> * In order to ensure safe interpretation, smart middleware and server 
> developers will have to write routines to *unfold* potentially-folded 
> headers; why not just disallow folding to begin with?

Because it's allowed in the HTTP spec, and this is a general-purpose
HTTP framework layer.
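
For scale, an unfolding routine of the kind under discussion can be quite
small. The sketch below assumes the RFC 2616 rule that a header value may
continue on a following line that starts with a space or tab, and that such
a fold may be replaced by a single space. The name unfold is hypothetical,
and the pattern also tolerates a bare LF, which is a leniency of this sketch
rather than something the RFC requires.

```python
import re

# A line break followed by at least one space or tab marks a folded line.
_FOLD = re.compile(r"\r?\n[ \t]+")


def unfold(value):
    """Replace each fold (line break plus leading whitespace) with one space."""
    return _FOLD.sub(" ", value)


folded = "text/html;\r\n\tcharset=iso-8859-1"
unfolded = unfold(folded)
```

After the call, unfolded holds "text/html; charset=iso-8859-1".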

> How about "must provide the *option*" and "must be enabled by default"? Or, 
> leave it as is, but add something like, "may provide the user with the 
> option of suppressing this output, so that users who cannot fix a broken 
> application are not forced to bear the pain of its error."

That's fine with me.

> >6.  The "write()" callable is important; it should not be deprecated
> >or in some other way made a poor stepchild of the iterable.
> 
> But it *is* one.  The presence of the 'write()' facility significantly 
> increases the implementation complexity for middleware and server 
> authors.  If it weren't necessary to support existing streaming APIs, it 
> wouldn't exist.

But supporting streaming APIs is an important consideration, from the
point of view of authors actually writing code against a framework.
It should be a peer methodology (or completely removed).

Again, WSGI is a very general mechanism, which should provide
mechanism, not enforce policy.  That's the only way to get it widely
accepted in all the server and framework projects.  If you don't like
the streaming model, write editorials about it, but don't try to
cripple other people's software.
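
To show the two styles side by side, here is a minimal sketch. It follows the
interface as it was later published in PEP 333, where start_response()
returns the write() callable; draft 1.4 may differ in details. It uses
current Python byte strings, and run() is a toy driver invented for this
example, not a real server.

```python
def app_with_write(environ, start_response):
    # Streaming style: push each chunk through the write() callable.
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"hello ")
    write(b"world")
    return []


def app_with_iterable(environ, start_response):
    # Iterable style: hand the chunks back for the server to send.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]


def run(app):
    """Toy driver: collect everything the application emits, in order."""
    chunks = []

    def start_response(status, headers, exc_info=None):
        # The returned callable plays the role of write().
        return chunks.append

    for chunk in app({}, start_response):
        chunks.append(chunk)
    return b"".join(chunks)


body_from_write = run(app_with_write)
body_from_iterable = run(app_with_iterable)
```

Both applications produce the same body; the difference is only in who
drives the output loop.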

> However, the language should perhaps be clarified to be explicit about this 
> point

Yes.

> and to address what happens if code *within* the iterator calls 
> 'write()'.  (I don't think it should be allowed to, but I'm open to 
> arguments either way.)

Good point.  I tend to agree with you here.

> This seems at odds with your previous desire to use RFC 2616, which is 
> pretty clear that it's ISO-8859-1 or RFC 2047.  PEP 333 goes further and 
> says, it's ASCII, dammit, and use MIME header encodings (RFC 2047) if you 
> need to do something special, because God help you if you're trying to mess 
> with non-ASCII in HTTP headers and you don't know how to deal with that stuff.

My problem here is not with PEP 333, but with Python strings in
general.  The only string type which carries an associated charset tag
is Unicode.  The byte strings are *some* string encoded in *some*
character set encoding, but no one knows which encoding, for any given
byte string.  I meant to say that the characters used should be
restricted to those specified in RFC 2616, but those characters should
be passed in Unicode strings, so that we can safely apply the
.encode() method to them.  But simply specifying that the byte strings
conform to RFC 2616 would be OK with me.  As I say, with the current
Python, our options are limited.
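
A small sketch of the ambiguity, written in current Python where text and
byte strings are separate types. The function name to_header_bytes is
hypothetical, and ISO-8859-1 is used only because it is the default charset
RFC 2616 names.

```python
def to_header_bytes(text):
    """Encode a text string for the wire; fails if it leaves ISO-8859-1."""
    return text.encode("iso-8859-1")


# A text string carries its characters unambiguously, so encoding is safe.
reason = to_header_bytes("Non trouv\u00e9")

# A byte string carries no charset tag: the same bytes read differently
# depending on which encoding the receiver guesses.
raw = b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")
as_latin1 = raw.decode("iso-8859-1")

# Characters outside ISO-8859-1 are caught at encoding time.
try:
    to_header_bytes("\u20ac")
    rejected = False
except UnicodeEncodeError:
    rejected = True
```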

> Glad there was something you liked.  ;)  (j/k)

Hey, there was lots I liked!  Most of my suggestions were about
removing restrictions on areas outside of WSGI, I think.

> I rather like this, although I don't at all see how FTP gets into 
> this.  What the heck would CGI variables for FTP look like, I 
> wonder?  Anyway, it's handy for "http" and "https" at the very least.  I'd 
> prefer "wsgi.url_scheme" for the name, though, as it's otherwise a somewhat 
> ambiguous name.

Sure, that's fine with me.  As for "ftp", I was thinking of Medusa,
which supports serving a number of protocols with the same framework.
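
As a sketch of why the variable is handy, a framework could rebuild its base
URL from it roughly as follows. The function name base_url is hypothetical,
the example host is a placeholder, and port handling is left out to keep the
sketch short.

```python
def base_url(environ):
    """Rebuild scheme://host/script from the environ (ports not handled)."""
    scheme = environ["wsgi.url_scheme"]
    # Prefer the Host header if the client sent one.
    host = environ.get("HTTP_HOST") or environ["SERVER_NAME"]
    return scheme + "://" + host + environ.get("SCRIPT_NAME", "")


url = base_url({
    "wsgi.url_scheme": "https",
    "HTTP_HOST": "example.org",
    "SCRIPT_NAME": "/app",
})
```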

Bill




