[Web-SIG] URL quoting in WSGI (or the lack thereof)

Thu Jan 24 07:22:06 CET 2008

Phillip J. Eby wrote:
> At 09:15 AM 1/23/2008 -0800, Robert Brewer wrote:
>> I consider it a bug in both, and the difficulty level of changing the
>> CGI behavior really has no bearing on our decision to do better with
>> WSGI. I think it's important that we allow the full range of URI's to be
>> accepted. If you go and stick Apache in front of your WSGI app, it will
>> still 404, sure; but that's your choice to use Apache or not. There's no
>> sense making WSGI a least common denominator, inheriting all the
>> limitations of all the existing web servers.
> 
> Uh, actually, that's sort of the whole point of WSGI - to allow 
> portable applications.  If the spec allows you to do something in 
> theory that's almost never allowed in practice, that's not very helpful.

It would probably work in a good number of implementations, but because 
some gateways could decode or reject the encoded slash, any deployment 
that relies on it becomes kind of fragile.

Of course you could argue the same thing about SCRIPT_NAME -- it's 
constantly getting lost and makes deployments seem fragile at times. 
But unlike this issue, SCRIPT_NAME is actually quite useful; 
distinguishing %2f from / is more of a corner case.
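
To make the corner case concrete, here is a rough sketch of what an 
application sees once a gateway has percent-decoded the path. The 
function name path_info_as_seen_by_app and the /wiki/ paths are made up 
for illustration, and it assumes the gateway decodes the whole path 
before setting PATH_INFO, which is common but not universal:

```python
from urllib.parse import unquote


def path_info_as_seen_by_app(raw_path):
    # Hypothetical stand-in for a gateway that percent-decodes the
    # request path before handing it to the application as PATH_INFO.
    return unquote(raw_path)


encoded = path_info_as_seen_by_app("/wiki/a%2fb")
literal = path_info_as_seen_by_app("/wiki/a/b")

# After decoding, the application cannot tell the two requests apart.
print(encoded == literal)
```

So an application that wants a slash inside a single path segment 
can't count on %2f surviving the trip through the server.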

> MoinMoin, for example, has its own encoding scheme for handling 
> pseudo-slashes in paths, and IMO it's a better way to handle it than 
> trying to rely on finding a server that supports *not* decoding URLs.

We encountered it with GData too, as it uses URLs like 
/{http:%2f%2fexample.com}term/.  But if you balance the {}'s you can 
parse it out.
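
Something like the following is what I mean by balancing the braces. 
It is only a sketch, not what any GData library actually does: 
split_path is a made-up helper, and it assumes the path has already 
been decoded so the %2f's have turned into real slashes:

```python
def split_path(path):
    """Split a decoded path on '/', ignoring slashes inside {...}."""
    segments = []
    current = []
    depth = 0
    for ch in path:
        if ch == "{":
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        if ch == "/" and depth == 0:
            # A slash outside any braces ends the current segment.
            segments.append("".join(current))
            current = []
        else:
            current.append(ch)
    segments.append("".join(current))
    return segments


# Leading and trailing slashes produce empty segments; drop them.
parts = [s for s in split_path("/{http://example.com}term/") if s]
print(parts)
```

The slashes inside the braces stay with the term, so the whole 
{http://example.com}term comes back as one segment.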

   Ian