[Web-SIG] PEP 444 != WSGI 2.0

Mon Jan 3 00:48:07 CET 2011

At 02:21 PM 1/2/2011 -0800, Alice BevanMcGregor wrote:
>On 2011-01-02 11:57:19 -0800, P.J. Eby said:
>>* -1 on the key-specific encoding schemes for the various CGI 
>>variables' values -- for continuity of coding (not to mention 
>>simplicity) PEP 3333's approach to environ encodings should be used.
>>(That is, the environ consists of bytes-in-unicode-form, rather 
>>than true unicode strings.)
>
>Does ISO-8859-1 not accommodate this for all but a small number of 
>the environment variables in PEP 444?

PEP 3333 requires that environment variables contain the bytes of the 
HTTP headers, decoded using ISO-8859-1.  The unicode strings, in 
other words, are restricted to code points in the 0-255 range, and are 
really just a representation of bytes, rather than being a unicode 
decoding of the contents of the bytes.

What I saw in your draft of PEP 444 (which I admittedly may be 
confused about) is language that simply loosely refers to unicode 
environment variables, which could easily be misconstrued as meaning 
that the values could actually contain other code points.

To be precise, in PEP 3333, the "true" unicode value of an environment 
variable is:

     environ[key].encode('iso-8859-1').decode(appropriate_encoding_for_key)

Whereas, my reading of your current draft implies that this has to 
already be done by the server.
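
A minimal sketch of that round trip, assuming Python 3 and a request 
path that happens to be UTF-8 on the wire.  The helper name 
true_unicode and the sample path are made up for illustration; PEP 
3333 defines no such function.

```python
# Hypothetical helper; not part of PEP 3333 or of any server.
def true_unicode(environ, key, encoding="utf-8"):
    """Recover the real text of a bytes-in-unicode environ value.

    Encoding as ISO-8859-1 gives back the original bytes, which the
    application then decodes with the encoding it knows to be right.
    """
    return environ[key].encode("iso-8859-1").decode(encoding)


# What a server does under this scheme: decode the raw bytes as ISO-8859-1.
raw_path = "/caf\u00e9".encode("utf-8")       # bytes on the wire (assumed UTF-8)
environ = {"PATH_INFO": raw_path.decode("iso-8859-1")}

# Every code point is in the 0-255 range: the str is bytes in disguise.
assert all(ord(ch) < 256 for ch in environ["PATH_INFO"])

# The application, not the server, chooses the real encoding.
path = true_unicode(environ, "PATH_INFO")
```

The server never has to guess: ISO-8859-1 maps every byte to exactly 
one code point, so the step is lossless and the choice of the real 
encoding stays with the application.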

As I understand it, the problem with this is that the server 
developer can't always provide such a decoding correctly; the server 
would have to guess, in the absence of any information that it could 
use to do the guessing.  An application developer is in a 
better position to deal with this ambiguity than the server 
developer, and adding configuration to the server just makes 
deployment more complicated, and breaks application composability if 
two sub-applications within a larger application need different decodings.

That's the rationale for the PEP 3333 approach -- it essentially 
acknowledges that HTTP is bytes, and we're only using strings for the 
API conveniences they afford.

>>* Where is the PARAMETERS variable defined in the CGI spec?
>>What servers actually support it?
>
>It's defined in the HTTP spec by way of 
>http://www.ietf.org/rfc/rfc2396.txt URI Syntax.  These values are 
>there, they should be available.  (Specifically semi-colon separated 
>parameters and hash-separated fragment.)

I mean, what web servers currently provide PARAMETERS as a CGI 
variable?  If it's not a CGI variable, it doesn't go in all caps.

What's more, the spec you reference points out that parameters can be 
placed in *each* path-segment, which means that they would:

1) already be in PATH_INFO, and
2) have multiple values

So, -1 on the notion of PARAMETERS, since AFAICT it is redundant, not 
CGI, and would only hold one parameter segment.
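
To illustrate both points, here is a rough sketch that pulls the 
semicolon-separated parameters out of each segment of a PATH_INFO 
value.  The function name and the sample path are hypothetical, and 
the splitting is deliberately naive (no percent-decoding).

```python
def segment_params(path_info):
    """Return a (segment_name, [params]) pair for each path segment."""
    result = []
    # Drop the empty string that precedes the leading "/".
    for segment in path_info.split("/")[1:]:
        # Anything after the first ";" in a segment is a parameter.
        name, *params = segment.split(";")
        result.append((name, params))
    return result


# Parameters can appear on any segment, and more than once per segment.
segments = segment_params("/a;x=1/b;y=2;z=3/c")
```

Everything needed is already present in PATH_INFO, and the result is 
naturally a list per segment rather than a single value.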

>>* The language about empty vs. missing environment variables 
>>appears to have disappeared; without it, the spec is ambiguous.
>
>I will re-examine the currently published PEP 444.

I don't know if it's in there or not; I've read your spec more 
thoroughly than that one.  I'm referring to the language from PEP 333 
and its successor, with which I'm much more intimately familiar.

>Indeed.  I do try to understand the issues covered in a broader 
>scope before writing; for example, I do consider the ability for new 
>developers to get up and running without worrying about whether the 
>example applications they are trying to use work in their version of 
>Python; thus /allowing/ native strings to be used as response values 
>on Python 3.

I don't understand.  If all the examples in your PEP use b'' strings 
(per the 2.6+ requirement), where is the problem?

They can't use WSGI 1(.0.1) code examples at all (as your draft isn't 
backward-compatible), so I don't see any connection there, either.

>Byte strings are still preferred, and may be more performant,

Performance was not the primary consideration; the considerations were:

* One Obvious Way
* Refuse The Temptation To Guess
* Errors Should Not Pass Silently

The first two would've been fine with unicode; the third was the 
effective tie-breaker.  (Since if you use Unicode, at some point you 
will send garbled data and end up with an error message far away from 
the point where the error occurred.)
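
One way to picture the third point is the sketch below.  It assumes 
Python 3, and the two toy applications and the server_write function 
are invented for illustration; they are not taken from either PEP.

```python
def app_returning_text():
    # Hypothetical application body with a character outside Latin-1.
    return ["price: 10 \u20ac"]


def app_returning_bytes():
    # The application encodes at the point where it knows the encoding.
    return ["price: 10 \u20ac".encode("utf-8")]


def server_write(chunks):
    # Hypothetical server-side writer: the socket needs bytes, so any
    # str chunk must be encoded here, far away from the application.
    out = b""
    for chunk in chunks:
        if isinstance(chunk, str):
            chunk = chunk.encode("iso-8859-1")  # may raise UnicodeEncodeError
        out += chunk
    return out
```

With a bytes body, any encoding mistake surfaces inside the 
application.  With a text body, the failure shows up in the server's 
write path, or worse, the server picks an encoding that silently 
produces garbled output.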

>I certainly will; I just need to see concrete points against the 
>technical merits of the rewritten PEP

Well, I've certainly given you some, but it's hard to comment other 
than abstractly on an async spec you haven't proposed yet.  ;-)

Nonetheless, it's really important to understand that the PEP process 
(especially for Informational-track standards) is not so much about 
technical merits in an absolute sense, as it is about *community consensus*.

And that means it's actually a political and marketing process at 
least as much as it is a technical one.  If you miss that, you may 
well end up with a technically-perfect spec (in the sense that nobody 
gives you any additional "concrete points against the technical 
merits"), that nobody cares to actually *implement*.

And from a marketing perspective, the people who must "buy" a WSGI 2 
spec are the *server implementers*.

Sure, if you have a well-defined mapping from WSGI 1.0.x to your 
spec, then you could write a wrapper that provides your spec in any 
WSGI 1.0.x server -- and you can then promote that API for people to 
use.  However, you would then be in the web development API business, 
competing against dozens of existing web frameworks for 
mindshare.  Why use it over any other such API?

So, in order to make the spec a meaningful point of coordination, 
both sides of the spec need some reasonable expectation that the 
people on the other side are really going to use/implement it.

But the *costs* of implementing are asymmetric: writing code against 
(any given spec for) WSGI 2 is a lower commitment than actually 
implementing the corresponding WSGI 2 server.  Which means it's 
harder to convince a server implementer to just do something 
different because it's nicer for the end-user.  (Nice for the 
end-user, after all, is the application framework's job, from the 
server developer's POV!)

This isn't to say you shouldn't make things nicer for users; but your 
spec will likely be successful to the degree it *also* makes things 
nicer for server developers...  like Graham and Robert, among others.

So, the good news is, your real target audience is already here, 
listening, and occasionally commenting on these matters.  The bad 
news is, making them happy has less to do with technical merit per 
se, and much more to do with how much work they see your spec making 
for them.  Welcome to the PEP process.  ;-)

Hm.  That didn't sound quite right: it's not that server developers 
don't care about technical merit.  It's that the technical merits 
they're most interested in are going to fall on the side of things 
like needing to handle ambiguities at the interface between HTTP and 
the WSGI spec.  So a new spec that doesn't address *known* issues 
with HTTP-WSGI relations may be perceived by some as pointless noise 
that isn't helping anything...  and from *their* perspective, they're right.

While it'd be nice if the burden of proof were on them to tell you 
where your spec is wrong, this simply isn't practical.  Reviewing 
every spec that comes along is time-consuming and *hard*.  (To be 
quite honest, I was previously hoping that somebody else besides me 
would do the initial reviewing of your spec -- and I don't plan to 
review every draft or do a super-detailed assessment unless it 
clearly becomes necessary!)

And, whether it's true or fair, a quick glance at your draft can 
easily give the impression to a long-time Web-SIGger that you haven't 
studied the pitfalls that Graham has so extensively documented in the 
past, or that if you did, you didn't give much thought to resolving them.

Again, I don't know if you've done so, but if you have, it's not 
immediately apparent to me from your current draft. And it really is 
up to you to promote and defend your PEP, not up to others to shoot it down.

All that being said, I've pitched in today to give some feedback and 
support because I *want you to succeed*.  The disadvantage to being 
an old hand at this is that it's easy to be discouraged by one's own 
extensive knowledge of the problems, and it's great to have somebody 
with some fresh enthusiasm stepping up to the plate.  (Lord knows I 
don't want to have to write the damn thing myself!)

But that won't change the part where to get *other* people on board, 
you're going to have to convince them that you have a credible 
solution to the problems *they* care about, not just the ones you 
care about.  Writing a rationale that will convince Graham and the 
others that you've got a solution to the problems they've already 
posted about so many times is really Job 1 here if you want to get 
their buy-in.

If you get stuck on some element of what they're asking 
for/complaining about, feel free to post a question here, and I will 
certainly chime in and try to help out.  However, if you keep 
thinking about the part where the overall process is inherently 
unfair to you -- which I totally agree it *is* -- it's not going to 
do anything but stress you out.

And I don't want *that*, because I don't want you to get burned out 
before you do the job of writing the next spec for me.  ;-)