[Web-SIG] URL quoting in WSGI (or the lack thereof)
Luis Bruno
lbruno at 100blossoms.com
Tue Jan 22 23:33:57 CET 2008
Ian Bicking pointed at CGI 1.1 saying: "See? The WSGI spec tells me to
do this!" And he's right. This sub-thread is about *me* thinking the
*WSGI spec* should be *fixed*.
James Y Knight wrote:
> Where does the CGI spec forbid multiple segments in PATH_INFO?
> It doesn't. It actually says that PATH_INFO is made by joining each
> decoded path-segment with a /.
My fault. I misread this:
    The server MAY reject the request with an error if it encounters
    any values considered objectionable. That MAY include any requests
    that would result in an encoded "/" being decoded into PATH_INFO, as
    this might represent a loss of information to the script.
Still, my problem is that "loss of information"; I no longer know
which '/' were %-encoded.
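To make that loss concrete, here is a minimal sketch. It assumes that Python's standard urllib.parse.unquote is a fair stand-in for the decoding a server performs before filling in PATH_INFO; the helper name path_info is hypothetical and not part of WSGI, CGI or Paste.

```python
from urllib.parse import unquote

def path_info(raw_path):
    # Hypothetical helper: decode the whole path in one go, roughly
    # what a CGI-style server does before handing PATH_INFO to the app.
    return unquote(raw_path)

encoded = path_info("/catalog/some%2Fthing/shallow/")
literal = path_info("/catalog/some/thing/shallow/")

# Both requests arrive at the application as the same string, so the
# application cannot tell which '/' was originally written as %2F.
print(encoded == literal)
```

Once both URLs have collapsed into one PATH_INFO value, no amount of parsing on the application side can recover the difference.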
> And as far as I know /every/ extant implementation does this.
As does Paste#http. My fault for not reading correctly.
> Besides, the workaround is quite simple: don't use %2F characters in your urls.
Should I use $2F? I already *have* an escaping mechanism... which I'm
using for spaces, BTW. Why can't I use it for slashes? I came to
web-sig@ to fix the spec, not to find a workaround. I already *have* a
workaround: it starts with me monkeying around with Paste#http and rolling
my own dispatcher. Not too bright though, as I could have slapped a
$2F in there for a quick workaround (thank you Brian).
A quick sanity check here: I think
http://host/catalog/some%2Fthing/shallow/ is *meant* to have two
nested levels: "some/thing" and "shallow". Is it obvious to you to
interpret the URL as having three nested levels "some", "thing" and
"shallow"? I ask because the first choice is very obvious to me; I'm
treating the second one (current behaviour) as a bug to be fixed.
Does anyone else think it's a bug in WSGI too?
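The two readings can be sketched side by side. This is only an illustration, assuming the raw, still-encoded request path is available to the code; the function names are hypothetical, and urllib.parse.unquote stands in for whatever decoding the server applies.

```python
from urllib.parse import unquote

def segments_split_first(raw_path):
    # First reading: split on literal '/' while the path is still
    # encoded, then decode each segment on its own.
    return [unquote(seg) for seg in raw_path.strip("/").split("/")]

def segments_decode_first(raw_path):
    # Second reading (current behaviour): decode the whole path,
    # then split, so a decoded %2F becomes a segment boundary.
    return unquote(raw_path).strip("/").split("/")

raw = "/catalog/some%2Fthing/shallow/"
print(segments_split_first(raw))   # two levels below "catalog"
print(segments_decode_first(raw))  # three levels below "catalog"
```

Splitting first keeps "some/thing" as a single segment; decoding first yields "some" and "thing" as separate segments.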
--
Luis Bruno