[Web-SIG] URL quoting in WSGI (or the lack thereof)

Luis Bruno lbruno at 100blossoms.com
Tue Jan 22 23:33:57 CET 2008


Ian Bicking pointed at CGI 1.1 saying: "See? The WSGI spec tells me to
do this!" And he's right. This sub-thread is about *me* thinking the
*WSGI spec* should be *fixed*.


James Y Knight wrote:
> Where does the CGI spec forbid multiple segments in PATH_INFO?
> It doesn't. It actually says that PATH_INFO is made by joining each
> decoded path-segment with a /.

My fault. I misread this:

   The server MAY reject the request with an error if it encounters
   any values considered objectionable.  That MAY include any requests
   that would result in an encoded "/" being decoded into PATH_INFO, as
   this might represent a loss of information to the script.

Still, my problem is exactly that "loss of information": once
PATH_INFO has been decoded, I no longer know which '/' characters
were %-encoded.
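
To make that loss concrete, here is a minimal sketch. It assumes
Python 3's urllib.parse.unquote as a stand-in for whatever decoding
the server applies; path_info is a hypothetical helper of mine, not
anything taken from WSGI, CGI or Paste.

```python
from urllib.parse import unquote


def path_info(raw_path):
    # Hypothetical helper: decode the whole path in one pass, which is
    # roughly how a CGI-style PATH_INFO ends up being built.
    return unquote(raw_path)


encoded = path_info("/catalog/some%2Fthing/shallow/")
literal = path_info("/catalog/some/thing/shallow/")

# Two different request paths collapse into the same decoded string,
# so the script can no longer tell them apart.
print(encoded == literal)
```

After decoding, nothing in the string records that one of the
slashes arrived as %2F.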


> And as far as I know /every/ extant implementation does this.

As does Paste#http. My fault for not reading correctly.


> Besides, the workaround is quite simple: don't use %2F characters in your urls.

Should I use $2F? I already *have* an escaping mechanism... which I'm
using for spaces, BTW. Why can't I use it for slashes? I came to
web-sig@ to fix the spec, not to find a workaround. I already *have* a
workaround: it starts with me monkeying around Paste#http and rolling
my own dispatcher. Not too bright though, as I could have slapped a
$2F in there for a quick workaround (thank you Brian).



A quick sanity check here: I think
http://host/catalog/some%2Fthing/shallow/ is *meant* to have two
nested levels: "some/thing" and "shallow". Would you instead read the
URL as having three nested levels: "some", "thing" and "shallow"? I
ask because the first reading is very obvious to me; I'm treating the
second one (current behaviour) as a bug to be fixed.
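
The two readings can be sketched as follows. This is only an
illustration, assuming Python 3's urllib.parse.unquote; both function
names are hypothetical, and neither is claimed to be what any
particular server or dispatcher actually does.

```python
from urllib.parse import unquote


def segments_decode_first(raw_path):
    # Decode the whole path, then split: an encoded slash becomes a
    # segment separator (the behaviour I am calling a bug).
    return [s for s in unquote(raw_path).split("/") if s]


def segments_split_first(raw_path):
    # Split on literal slashes, then decode each segment: an encoded
    # slash stays inside its segment (the reading I expect).
    return [unquote(s) for s in raw_path.split("/") if s]


raw = "/catalog/some%2Fthing/shallow/"
print(segments_decode_first(raw))
print(segments_split_first(raw))
```

Below "catalog", the first function yields "some", "thing" and
"shallow", while the second yields "some/thing" and "shallow". The
second is only possible while the still-encoded path is available.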


Does anyone else think it's a bug in WSGI too?
-- 
Luis Bruno

