[Web-SIG] WSGI Utils & SCGI/Quixote.

Fri Dec 3 01:57:38 CET 2004

Ian Bicking wrote:

> Phillip J. Eby wrote:
>
>> I think I'm going to have to call that point out in the PEP 
>> somewhere.  Technically, the PEP requires that SCRIPT_NAME and 
>> PATH_INFO be set, but I think perhaps some folks have missed the 
>> implications of that for the URL path space.
>>
>> Perhaps something like this would do the trick:
>>
>> """
>> Application Placement in Server URL Space
>> -----------------------------------------
>>
>> In order to generate correct SCRIPT_NAME and PATH_INFO variables, 
>> servers and gateways MUST treat an application's location as a URL 
>> path prefix.  That is, servers and gateways:
>>
>> * MUST determine the target application using a matching prefix of 
>> the request path (which then determines the value of SCRIPT_NAME).
>>
>> * MUST take the remaining portion of the request path, and use it to 
>> determine PATH_INFO. (Note that the remainder must be empty or begin 
>> with a '/', otherwise the prefix match was invalid!)
>>
>> * MUST assume that there are an infinite number of possible URL paths 
>> that may appear as a PATH_INFO suffix "beneath" the application's 
>> base URL
>
>
> I think this is too restrictive.  It's the natural way to do things in 
> most cases, 

It is the natural way, and it is not very restrictive.

> but there's no reason to enforce it.  

Reason #1: "You really only need to do it one way", which is the entire 
point of the Web-SIG.
Reason #2: If you don't specify one well-documented, easily-implemented 
way, you will get a dozen poorly-implemented, poorly-documented ways.

> E.g., a mod_rewrite-like middleware might do any number of things; 
> it's a use-at-your-own-risk proposition (with considerable risk, at 
> least from my own mod_rewrite experiences), but it shouldn't be 
> disallowed, and this appears to disallow that kind of code.

Reason #3: mod_rewrite is the problem. An understandable mapping 
convention is the solution.

[snip]

> The login middleware catches it, sees that it's configured for 
> cookie-based (form) login, and turns it into a 200 with a login form.  

That should be a "303 See Other" pointing to the login form.

> Or later, if they try to login but fail, their URL may still be 
> pointing at the original application (useful if they were submitting a 
> POST form,

Not really, because you have still lost the original POST data.

... Unless the login middleware also saved that to a "conditional post" 
queue like Fastmail.FM does if your session times out while you are 
composing a message. 

(IMO, every successful POST SHOULD respond with 303 -- avoiding 90% of 
all double posts. Unsuccessful POSTs should send 200, with the original 
form already filled out with the info that was correct, and error 
messages where it was not.)
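
A minimal sketch of that POST convention as a WSGI application, assuming 
a hypothetical two-field signup form and a hypothetical "/thanks" page 
(neither comes from this thread). A successful POST answers with 303 so 
that a reload re-issues a GET; a failed POST answers with 200 and the 
form refilled with what the user typed.

```python
import html
from urllib.parse import parse_qs

# Hypothetical form template; the field names are illustrative only.
FORM = (
    '<form method="post">'
    '<input name="name" value="{name}">'
    '<input name="email" value="{email}"> {error}'
    '</form>'
)


def signup_app(environ, start_response):
    """Answer 303 after a good POST, 200 with a refilled form otherwise."""
    name = email = error = ""
    if environ["REQUEST_METHOD"] == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = environ["wsgi.input"].read(length).decode("utf-8")
        fields = parse_qs(raw)
        name = fields.get("name", [""])[0]
        email = fields.get("email", [""])[0]
        if name and "@" in email:
            # Success: redirect, so reloading the result page cannot
            # repeat the POST. "/thanks" is an assumed target.
            start_response("303 See Other", [("Location", "/thanks")])
            return [b""]
        error = "Please enter a name and a valid email address."
    # GET, or a failed POST: show the form, keeping any submitted values.
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    page = FORM.format(
        name=html.escape(name), email=html.escape(email), error=error
    )
    return [page.encode("utf-8")]
```

The validation here is deliberately trivial; the point is only which 
status code goes with which outcome.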

> which you want to pass through to the original URL, and it's difficult 
> to do that with a redirect-after-submit).

If you use cookie-based authentication, the user can usually just hit 
the back button twice and POST again. (Not fun if they were uploading 
big files, but otherwise harmless, because the original post "failed" and 
no unsafe action was taken.)

> There's a bunch of other ways this could be factored, but a number of 
> them involve dispatching to an application based on query string, or 
> in some way where SCRIPT_NAME and PATH_INFO don't have any relation to 
> the application at all.

And those other ways create UGLY URLs, which enticed someone to create 
mod_rewrite to make them pretty. Search engines are getting better at 
making sense of that ugliness, but the URL space is still not very RESTful.

Keep in mind we are talking about the *container* doing the 
dispatching.  Once the servlet is selected, it can do anything it wants 
with the PATH_INFO and query string, including forwards and includes and 
redirects.  If the application wants to do crazy dispatching within its 
own URL space, that's fine, but the container shouldn't need to deal 
with that.
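
A rough sketch of what container-side prefix dispatch could look like 
in WSGI terms. The names split_path and PrefixDispatcher are made up 
for illustration and are not from the PEP or any library. The container 
picks the application by the longest matching path prefix, rejects a 
match whose remainder is neither empty nor starts with '/', and hands 
the rest to the application as PATH_INFO.

```python
def split_path(request_path, mounts):
    """Return (app, script_name, path_info), or None if nothing matches.

    `mounts` maps URL path prefixes (no trailing slash, '' for the root)
    to applications.
    """
    # Try longer prefixes first so '/blog/admin' wins over '/blog'.
    for prefix in sorted(mounts, key=len, reverse=True):
        if request_path.startswith(prefix):
            remainder = request_path[len(prefix):]
            # The remainder must be empty or begin with '/'; otherwise
            # '/blog' would wrongly claim '/blogroll'.
            if remainder == "" or remainder.startswith("/"):
                return mounts[prefix], prefix, remainder
    return None


class PrefixDispatcher:
    """Hypothetical container-side dispatcher by URL path prefix."""

    def __init__(self, mounts):
        self.mounts = mounts

    def __call__(self, environ, start_response):
        match = split_path(environ.get("PATH_INFO", ""), self.mounts)
        if match is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found\n"]
        app, prefix, remainder = match
        # The matched prefix moves onto SCRIPT_NAME; everything after it
        # is the application's own business.
        environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + prefix
        environ["PATH_INFO"] = remainder
        return app(environ, start_response)
```

With a mount table of {"/blog": blog_app}, a request for /blog/2004/12 
reaches blog_app with SCRIPT_NAME "/blog" and PATH_INFO "/2004/12", 
while /blogroll does not match at all.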

>
> So I'd say these should all be SHOULDs, not MUSTs.  Or they should 
> simply be put in as implementation recommendations.  

That's what the Java people said sometime before Servlet Version 2.2.  
But they tightened it up based on experience:

--------------SRV.10 (v.2.2)--------------
Previous versions of this specification have allowed servlet containers 
a great deal of flexibility in mapping client requests to servlets only 
defining a set a suggested mapping techniques. This specification *now 
requires* a set of mapping techniques to be used for web applications 
which are deployed via the Web Application Deployment mechanism. Just as 
it is highly recommended that servlet containers use the deployment 
representations as their runtime representation, it is highly 
recommended that they use these path mapping rules in their servers for 
all purposes and not just as part of deploying a web application.
--------------
--------------SRV.11 (v.2.4)--------------
The mapping techniques described in this chapter are *required* for Web 
containers mapping client requests to servlets.

(Previous versions of this specification made use of these mapping 
techniques as a suggestion rather than a requirement, allowing servlet 
containers to each have their different schemes for mapping client 
requests to servlets.)
--------------

http://jdiworks.net/projects/servlet/SRV.11.html
http://jdiworks.net/projects/servlet/SRV.4.4.html

Let's learn from their experience.

---
Terrel Shumway
"That Web Guy Who Knows Marketing"
http://jdiworks.net/