[Web-SIG] URL quoting in WSGI (or the lack thereof)

James Y Knight foom at fuhm.net
Tue Jan 22 20:22:07 CET 2008


On Jan 22, 2008, at 1:02 PM, Luis Bruno wrote:

>
> Fortunately, the URI spec doesn't repeat the mistake of forbidding
> %-encoding characters. It does mention that each path-segment should
> be separately %-decoded, going against the CGI spec which actually
> forbids multiple segments *in PATH_INFO*. That smells of mistake.
> Faced with the choice between those specs, I'd prefer not to lose
> information for mindless compliance with CGI.
>

Where does the CGI spec forbid multiple segments in PATH_INFO? It
doesn't. It actually says that PATH_INFO is made by joining each
decoded path-segment with a /. And as far as I know /every/ extant
implementation does this. And the high-quality ones forbid a / from
appearing in a decoded segment (that is, one produced by a %2F in the
original URL), in order to avoid security issues.
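
To make the behaviour concrete, here is a minimal sketch in Python of
the decode-then-join step described above. The function name and the
reject_encoded_slash flag are hypothetical, made up for illustration;
they are not taken from the CGI spec or from any particular server.

```python
from urllib.parse import unquote


def path_info_from_raw(raw_path, reject_encoded_slash=False):
    """Build a CGI-style PATH_INFO from a raw (still %-encoded) URL path.

    Hypothetical helper for illustration only: each path segment is
    %-decoded separately and the results are joined with "/".
    """
    decoded_segments = []
    for segment in raw_path.split("/"):
        decoded = unquote(segment)
        # A "/" can only show up here if the raw segment contained %2F.
        if reject_encoded_slash and "/" in decoded:
            raise ValueError("encoded slash (%2F) in path segment")
        decoded_segments.append(decoded)
    return "/".join(decoded_segments)


# Two different raw paths collapse to the same PATH_INFO, which is the
# information loss under discussion.
print(path_info_from_raw("/a%2Fb/c"))  # /a/b/c
print(path_info_from_raw("/a/b/c"))    # /a/b/c
```

With reject_encoded_slash=True the first call raises ValueError
instead, which is roughly what the stricter implementations do: they
refuse the request rather than pass along an ambiguous path.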

So I'm not sure what this thread is about. You can argue that the CGI
spec has a bug in it, but it's not like this is a new issue or
something, and it's shared by every system based on CGI. (PHP, for
example, has the same issue.)

Besides, the workaround is quite simple: don't use %2F characters in
your URLs.

James

