[Web-SIG] The rewritten WSGI pre-PEP
Phillip J. Eby
pje at telecommunity.com
Wed Aug 11 18:44:18 CEST 2004
At 01:42 AM 8/11/04 -0500, Ian Bicking wrote:
>The callables are a little confusing to me. The application is a
>callable. Start_response is a callable. It returns a callable. Of
>course, if it wasn't a callable, it would be an object with only one
>method, which is kind of boring.
>
>A contrary example to this would be iterators, which have basically one
>method in their interface (next); yet they are not simply callables.
It's assumed that iterators may have other behaviors. In any case, I
certainly made use of iterators and methods where appropriate, i.e. in the
return value of the application, which can support __iter__(), next(), and
close() if they are needed.
>I'm not of strong opinion, but the callables definitely make it harder to
>understand.
...but easier to implement, since everything can be done with functions and
closures.
Do you think you would have difficulty creating a conforming
implementation, or are you just saying it took you a while to grasp how you
would do so?
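To make the "functions and closures" point concrete, here is a minimal sketch of a toy server and application written with nothing but functions. It assumes the `start_response(status, headers)` call shape and Python 3 byte strings; the names `make_start_response`, `run`, and `sink` are illustrative and do not come from the draft.

```python
def make_start_response(sink):
    """Return a start_response callable that records into `sink` (a dict)."""
    def start_response(status, headers):
        sink["status"] = status
        sink["headers"] = list(headers)

        # start_response itself returns a callable, as discussed above.
        def write(data):
            sink["body"].append(data)
        return write
    return start_response


def application(environ, start_response):
    """A trivial application: a plain function, no classes required."""
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"written directly\n")
    # The return value is an iterable of further body chunks.
    return [b"returned as iterable\n"]


def run(app, environ):
    """Hypothetical toy server: call the app once and collect its output."""
    sink = {"body": []}
    result = app(environ, make_start_response(sink))
    try:
        for chunk in result:
            sink["body"].append(chunk)
    finally:
        # close() is optional on the result; call it only if present.
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return sink


response = run(application, {"REQUEST_METHOD": "GET"})
```

Every piece of state lives in a closure variable, so neither side needs to define a class to conform.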
>>====================  =============================================
>>Variable              Value
>>====================  =============================================
>>``wsgi.version``      The string ``"1.0"``
>
>Would it make sense for this to be a tuple, like (1, 0), like
>sys.version_info?
Maybe. I'm not sure it makes any difference. I could just as soon drop
versioning altogether and just use the presence or absence of feature keys
as the means of determining the version.
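The two options can be compared in a short sketch. It assumes, hypothetically, that `wsgi.version` were a tuple, and uses a made-up feature key `x.hypothetical_feature`; neither is part of the draft.

```python
def supports(environ, feature_key):
    """Feature detection: the presence of a key signals support."""
    return feature_key in environ


def version_at_least(environ, minimum):
    """Tuple comparison, assuming 'wsgi.version' were a tuple like (1, 0)."""
    return environ.get("wsgi.version", (0, 0)) >= minimum


environ = {"wsgi.version": (1, 0), "x.hypothetical_feature": True}

# Strings compare character by character, tuples element by element.
string_compare = "1.10" >= "1.9"   # False
tuple_compare = (1, 10) >= (1, 9)  # True
```

A tuple avoids the string-ordering pitfall, while feature keys avoid version comparisons altogether.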
>Another useful one I brought up last time would be some indication that
>the application was definitely not going to be reused, i.e., it's being
>invoked in a CGI context. The performance issues there are completely
>different than in other environments.
Okay... how about 'wsgi.last_call', which is a true value if this
invocation of the application will *probably* be the last? IOW, the server
need not guarantee that the app will *not* be called again; this is just a
"suggestion".
>>.. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft
>> (http://cgi-spec.golux.com/draft-coar-cgi-v11-03.txt)
>
>I think before we discussed being explicit about a couple variables.
>Specifically that SCRIPT_NAME should refer to the application's root, and
>PATH_INFO to everything that comes after.
Good point; I'll update this.
>Should there be any policy about path segments containing //, ./, or ../?
What do you have in mind?
>Hmm... what should the server do if it gets a Location header with no Status?
There's no such thing; there's always a status under this spec. However,
what happens to the HTTP headers passed to 'start_response()' could perhaps
be made clearer.
>The CGI spec says servers should change the current working directory to
>the resource being run. I think this won't be that common for WSGI
>servers, though.
Do you think this needs to be stated? WSGI only references CGI with
respect to environment variables.
>Will GATEWAY_INTERFACE be defined? If so, what value? "WSGI/1.0"? I
>assume SERVER_SOFTWARE will be up to the WSGI server. Should they be sure
>to rewrite this value if these servers are nested? E.g., should your CGI
>example rewrite that value? It seems like each piece adds another name to
>the end in the format "name/version_number", where the name has no
>spaces. And it might optionally have more information in parentheses
>after the version, which may contain spaces. Maybe this should be a
>suggestion.
The normal value of the CGI variables should be server-defined. WSGI
variables should be out-of-band.
>Is there any non-parsed header form?
The entire thing is "non-parsed headers". They're a list of tuples. If
you mean, can you stop a web server from adding/changing headers according
to its whims, then no, you can't.
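For illustration, a sketch of what "a list of tuples" looks like and how a server might add a header of its own before sending. The function `add_server_headers` and the value "ExampleServer/0.1" are hypothetical; the draft does not prescribe this behavior.

```python
def add_server_headers(headers, server_name="ExampleServer/0.1"):
    """Return a new header list, adding a Server header if none is present."""
    names = {name.lower() for name, _ in headers}
    result = list(headers)
    if "server" not in names:
        result.append(("Server", server_name))
    return result


app_headers = [("Content-Type", "text/plain"), ("Content-Length", "5")]
sent_headers = add_server_headers(app_headers)
```

The application hands over plain (name, value) pairs; what finally goes on the wire is the server's decision.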
>This is from the CGI spec:
>
> Scripts MUST be prepared to handled URL-encoded values in
> metavariables. In addition, they MUST recognise both "+" and
> "%20" in URL-encoded quantities as representing the space
> character. (See section 3.1.)
>
>That seems weird; I've never URL-decoded values besides QUERY_STRING.
That's probably an addition to the 1.1 spec. However, ISTM I've seen code
in Zope that expects to decode path segments. I could be wrong.
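The "+" versus "%20" rule quoted above can be seen with the standard library. This sketch uses Python 3's `urllib.parse`; the sample string is made up.

```python
from urllib.parse import unquote, unquote_plus

raw = "hello+wide%20world"

plus_aware = unquote_plus(raw)  # decodes both "+" and "%20" as a space
percent_only = unquote(raw)     # decodes "%20" but leaves "+" untouched
```

Code that decodes path segments would have to pick one of these behaviors, which is where the requirement becomes awkward.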
>The CGI spec doesn't seem to mention REQUEST_URI. That's surprising.
>Here's the Apache CGI variables it doesn't mention:
>
>SERVER_SIGNATURE (pretty boring)
>SERVER_ADDR (seems very basic)
>DOCUMENT_ROOT (doesn't seem appropriate)
>SCRIPT_FILENAME (also often not appropriate)
>SERVER_ADMIN (boring)
>SCRIPT_URI
>REQUEST_URI (I don't understand the distinction)
>REMOTE_PORT (boring, though I guess if you wanted to add an ident check it
>would be useful)
>UNIQUE_ID (not needed)
>
>
>I think SERVER_ADDR and REMOTE_PORT are easy to add, and potentially
>useful. SCRIPT_URI and REQUEST_URI might be good.
Sigh. I guess maybe I'll have to go back and pick out variables one by
one. However, I don't think *any* of the variables you listed should be
required to exist. For one thing, it's much easier to write middleware if
you only have to munge SCRIPT_NAME and PATH_INFO during traversals.
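A minimal sketch of that kind of traversal, assuming SCRIPT_NAME names the application's root and PATH_INFO holds everything after it. The helpers `shift_segment` and `dispatcher` are illustrative names, not part of the draft.

```python
def shift_segment(environ):
    """Move the first PATH_INFO segment onto SCRIPT_NAME and return it."""
    path_info = environ.get("PATH_INFO", "")
    if not path_info or path_info == "/":
        return None
    segment, _, rest = path_info.lstrip("/").partition("/")
    environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + segment
    # Keep a trailing slash if the original path had one.
    environ["PATH_INFO"] = "/" + rest if rest or path_info.endswith("/") else ""
    return segment


def dispatcher(apps):
    """Middleware: route on the first path segment to a sub-application."""
    def middleware(environ, start_response):
        app = apps.get(shift_segment(environ))
        if app is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        return app(environ, start_response)
    return middleware


environ = {"SCRIPT_NAME": "/app", "PATH_INFO": "/blog/2004/post"}
first = shift_segment(environ)
```

Only two keys change during traversal, so the sub-application sees a consistent view of where it is mounted. Variables such as REQUEST_URI or SCRIPT_URI would each need rewriting as well if they were required.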