[Web-SIG] The rewritten WSGI pre-PEP

Wed Aug 11 18:44:18 CEST 2004

At 01:42 AM 8/11/04 -0500, Ian Bicking wrote:

>The callables are a little confusing to me.  The application is a 
>callable.  Start_response is a callable.  It returns a callable.  Of 
>course, if it wasn't a callable, it would be an object with only one 
>method, which is kind of boring.
>
>A contrary example to this would be iterators, which have basically one 
>method in their interface (next); yet they are not simply callables.

It's assumed that iterators may have other behaviors.  In any case, I 
certainly made use of iterators and methods where appropriate, i.e. in the 
return value of the application, which can support __iter__(), next(), and 
close() if they are needed.

>I'm not of strong opinion, but the callables definitely make it harder to 
>understand.

...but easier to implement, since everything can be done with functions and 
closures.
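
To illustrate the functions-and-closures point, here is a minimal sketch. It assumes the calling convention discussed in this thread, as later standardized: the application takes (environ, start_response), and start_response returns a write callable. The names make_app and run_once are hypothetical, and run_once is only a toy stand-in for a real server.

```python
def make_app(greeting):
    """Return an application callable that closes over `greeting`."""
    def application(environ, start_response):
        body = greeting.encode("utf-8")
        headers = [("Content-Type", "text/plain"),
                   ("Content-Length", str(len(body)))]
        start_response("200 OK", headers)
        return [body]  # any iterable of byte strings
    return application


def run_once(app, environ):
    """Toy stand-in for a server: call the app once and collect its output."""
    captured = {}
    chunks = []

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers
        return chunks.append  # the "write" callable is just a bound method

    result = app(environ, start_response)
    try:
        chunks.extend(result)
    finally:
        # Call close() on the return value only if it provides one.
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return captured["status"], captured["headers"], b"".join(chunks)
```

No classes are needed on either side; run_once(make_app("hello"), {}) returns the status, the header list, and the joined body.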

Do you think you would have difficulty creating a conforming 
implementation, or are you just saying it took you a while to grasp how you 
would do so?

>>====================   =============================================
>>Variable               Value
>>====================   =============================================
>>``wsgi.version``       The string ``"1.0"``
>
>Would it make sense for this to be a tuple, like (1, 0), like 
>sys.version_info?

Maybe.  I'm not sure it makes any difference.  I could just as soon drop 
versioning altogether and just use the presence or absence of feature keys 
as the means of determining the version.
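
A rough sketch of the two options, for comparison. The tuple form of 'wsgi.version' is the suggestion quoted above rather than what the draft specifies, and the helper names supports and has_feature are hypothetical.

```python
def supports(environ, minimum=(1, 0)):
    """Version check, assuming 'wsgi.version' is a tuple such as (1, 0)."""
    # Tuples compare element by element, so (1, 10) > (1, 2).
    # The strings "1.10" and "1.2" compare the other way around.
    return tuple(environ.get("wsgi.version", (0, 0))) >= minimum


def has_feature(environ, key):
    """Feature check: the presence of a key is the whole test."""
    return key in environ
```

With feature keys there is nothing to parse or compare, which is the appeal of dropping the version number altogether.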

>Another useful one I brought up last time would be some indication that 
>the application was definitely not going to be reused, i.e., it's being 
>invoked in a CGI context.  The performance issues there are completely 
>different than in other environments.

Okay...  how about 'wsgi.last_call', which is a true value if this 
invocation of the application will *probably* be the last?  IOW, the server 
need not guarantee that the app will *not* be called again; this is just a 
"suggestion".

>>.. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft
>>    (http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt)
>
>I think before we discussed being explicit about a couple variables. 
>Specifically that SCRIPT_NAME should refer to the application's root, and 
>PATH_INFO to everything that comes after.

Good point; I'll update this.

>Should there be any policy about path segments containing //, ./, or ../?

What do you have in mind?

>Hmm... what should the server do if it gets a Location header with no Status?

There's no such thing; there's always a status under this spec.  However, 
what happens to the HTTP headers passed to 'start_response()' could perhaps 
be made clearer.

>The CGI spec says servers should change the current working directory to 
>the resource being run.  I think this won't be that common for WSGI 
>servers, though.

Do you think this needs to be stated?  WSGI only references CGI with 
respect to environment variables.

>Will GATEWAY_INTERFACE be defined?  If so, what value?  "WSGI/1.0"?  I 
>assume SERVER_SOFTWARE will be up to the WSGI server.  Should they be sure 
>to rewrite this value if these servers are nested?  E.g., should your CGI 
>example rewrite that value?  It seems like each piece adds another name to 
>the end in the format "name/version_number", where the name has no 
>spaces.  And it might optionally have more information in parentheses 
>after the version, which may contain spaces.  Maybe this should be a 
>suggestion.

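The CGI variables should keep whatever values the server normally defines 
for them.  WSGI-specific information should be carried out-of-band, in 
separate keys, rather than overloaded onto the CGI variables.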

>Is there any non-parsed header form?

The entire thing is "non-parsed headers".  They're a list of tuples.  If 
you mean, can you stop a web server from adding/changing headers according 
to its whims, then no, you can't.
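
For illustration, a sketch of what a server might do with the list it is handed. The function finalize_headers and the server name "ExampleServer/0.1" are made up for the example; which headers a real server adds or changes is up to that server.

```python
from email.utils import formatdate


def finalize_headers(app_headers, server_name="ExampleServer/0.1"):
    """Return the header list a hypothetical server might actually send."""
    headers = list(app_headers)  # copy; leave the application's list alone
    present = {name.lower() for name, _ in headers}
    if "date" not in present:
        headers.append(("Date", formatdate(usegmt=True)))
    if "server" not in present:
        headers.append(("Server", server_name))
    return headers
```

The application's headers pass through untouched as (name, value) tuples, but the application has no way to veto what the server appends.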

>This is from the CGI spec:
>
>    Scripts MUST be prepared to handle URL-encoded values in
>    metavariables. In addition, they MUST recognise both "+" and
>    "%20" in URL-encoded quantities as representing the space
>    character. (See section 3.1.)
>
>That seems weird; I've never URL-decoded values besides QUERY_STRING.

That's probably an addition to the 1.1 spec.  However, ISTM I've seen code 
in Zope that expects to decode path segments.  I could be wrong.
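
A small sketch of the difference using Python's standard library. Whether a given server hands PATH_INFO to the application still encoded is exactly the open question here, so the second helper assumes that it does; both helper names are hypothetical.

```python
from urllib.parse import unquote, unquote_plus


def decode_query_value(raw):
    """Query-string style: both '+' and '%20' decode to a space."""
    return unquote_plus(raw)


def decode_path_segments(path_info):
    """Path style: split on '/' first, then decode; '+' stays literal."""
    return [unquote(segment) for segment in path_info.split("/") if segment]
```

Splitting before decoding matters: an encoded slash (%2F) inside a segment would otherwise be mistaken for a separator.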

>The CGI spec doesn't seem to mention REQUEST_URI.  That's surprising. 
>Here's the Apache CGI variables it doesn't mention:
>
>SERVER_SIGNATURE (pretty boring)
>SERVER_ADDR (seems very basic)
>DOCUMENT_ROOT (doesn't seem appropriate)
>SCRIPT_FILENAME (also often not appropriate)
>SERVER_ADMIN (boring)
>SCRIPT_URI
>REQUEST_URI (I don't understand the distinction)
>REMOTE_PORT (boring, though I guess if you wanted to add an ident check it 
>would be useful)
>UNIQUE_ID (not needed)
>
>
>I think SERVER_ADDR and REMOTE_PORT are easy to add, and potentially 
>useful.  SCRIPT_URI and REQUEST_URI might be good.

Sigh.  I guess maybe I'll have to go back and pick out variables one by 
one.  However, I don't think *any* of the variables you listed should be 
required to exist.  For one thing, it's much easier to write middleware if 
you only have to munge SCRIPT_NAME and PATH_INFO during traversals.
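
A minimal sketch of that kind of traversal step, assuming SCRIPT_NAME holds the part of the path already consumed and PATH_INFO the remainder. The name shift_segment is hypothetical. Note that this version silently collapses repeated leading slashes, which is one possible answer to the "//" policy question raised earlier, not a settled one.

```python
def shift_segment(environ):
    """Move the first PATH_INFO segment onto SCRIPT_NAME and return it.

    Returns None when there is no segment left to consume.
    """
    path = environ.get("PATH_INFO", "")
    segment, sep, rest = path.lstrip("/").partition("/")
    if not segment:
        return None
    environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + segment
    environ["PATH_INFO"] = sep + rest  # "" once the last segment is consumed
    return segment
```

Starting from SCRIPT_NAME "/app" and PATH_INFO "/blog/2004/08", one call returns "blog" and leaves SCRIPT_NAME as "/app/blog" and PATH_INFO as "/2004/08". Only those two keys change, so the middleware has nothing else to keep consistent.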