[Web-SIG] Web Container Interface

Thu Jan 29 01:58:17 EST 2004

On Jan 28, 2004, at 11:27 PM, Jacob Smullyan wrote:
>> Obviously some
>> containers (ack, too many alternative terminologies at this point) will
>> not be able to finish the request until after control has been returned
>> from runCGI, but I don't see how that can be helped.
>
> It may be true that it can't be helped -- but it isn't obvious to me,
> yet.  Is there anything about the assumptions of some frameworks that
> would prevent a file-like object they are writing to from doing some
> networking when they close it?  I would think that is the business of
> the container, not the containee.  Or do you mean that the container
> may be in turn nested in a preexisting environment, not optimized
> specifically for this sort of containment, to which it delegates its
> networking responsibilities -- the "run Zope inside WebWare" scenario?
> That kind of scenario shouldn't be a sticking point, in my view,
> because it will always be suboptimal.  If you want coexistence of two
> different application environments, I'd expect to do better nesting
> them both in one shared container which has limited, specialized
> functionality than arranging them serially (Zope running inside
> WebWare running inside SkunkWeb running inside...).  A lower level of
> compliance, for such "convenience serial containers", would be
> forgiveable.

I almost took that back, on the grounds that all containers should be 
able to send the network response when you close the output.  But I'm 
not sure about plain CGI -- the only way to finish a CGI request may be 
to end the process.  Everything else should be able to finish the 
response when .close() is called.

With Zope in Webware, or Webware in Zope, or what-have-you, it should 
still be possible.  A nested container should be able to pass the 
.close() on up until it reaches the top.
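To make the "pass the .close() on up" idea concrete, here is a rough 
sketch.  The ContainerOutput class, its parent attribute, and the sent 
attribute are all invented for illustration; nothing here comes from the 
proposed interface, and the "network send" is just a stand-in.

```python
class ContainerOutput:
    """Hypothetical file-like output a container hands to its containee."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # enclosing container's output, None at the top
        self.chunks = []
        self.closed = False
        self.sent = None       # filled in only by the outermost container

    def write(self, data):
        if self.closed:
            raise ValueError("write to closed output")
        self.chunks.append(data)

    def close(self):
        if self.closed:
            return
        self.closed = True
        body = "".join(self.chunks)
        if self.parent is not None:
            # Nested container: hand the data and the close() upward.
            self.parent.write(body)
            self.parent.close()
        else:
            # Outermost container: stand-in for the real network send.
            self.sent = body


# Usage: "Zope in Webware in some server", names purely illustrative.
server = ContainerOutput("server")
webware = ContainerOutput("webware", parent=server)
zope = ContainerOutput("zope", parent=webware)
zope.write("hello")
zope.close()   # propagates up; server.sent is now "hello"
```

Plain CGI would be the case where the top-level close() can't actually 
finish anything and the process has to exit instead.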

>> In general, fewer gateways would have to buffer output until after
>> control was returned, if headers weren't included in the output stream.
>> You could parse the headers at the soonest moment, and then connect
>> the application to the actual client in a more direct fashion after
>> that time.  I suppose that only requires looking for \n\n (or a chunk
>> that ends in \n, and another that starts in \n), but it's still
>> annoying.  Anyway, if a container sends the complete request when
>> stdout.close() was called, control would at least temporarily be passed
>> to the container, while the application would still have a chance to do
>> some processing after stdout.close returns, and before runCGI returns.
>> Maybe those semantics -- or even a lack of required semantics -- should
>> be included in the PEP.
>
> Yes, that is really what I'd like to see clarified.  At the very
> least, a container should announce whether its output object really
> outputs when it says it does.
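For reference, the "look for \n\n, even when it is split across two 
chunks" approach quoted above might be sketched like this.  The class 
name and the two callbacks are hypothetical, and the sketch assumes bare 
\n line endings and text (not bytes) output.

```python
class HeaderSplittingOutput:
    """Buffers writes until the blank line ending the headers is seen,
    then passes later writes straight through to the client."""

    def __init__(self, on_headers, downstream_write):
        self._buffer = ""
        self._headers_done = False
        self._on_headers = on_headers      # called once with the header text
        self._write = downstream_write     # called with each body chunk

    def write(self, data):
        if self._headers_done:
            self._write(data)              # direct path, no buffering
            return
        # Accumulating the chunks handles a "\n" + "\n" split for free.
        self._buffer += data
        head, sep, rest = self._buffer.partition("\n\n")
        if sep:
            self._headers_done = True
            self._buffer = ""
            self._on_headers(head)
            if rest:
                self._write(rest)


# Usage: the blank line arrives split across two writes.
seen_headers, body_chunks = [], []
out = HeaderSplittingOutput(seen_headers.append, body_chunks.append)
out.write("Content-Type: text/plain\n")
out.write("\nhello")
out.write(" world")
```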

Flushing can't be guaranteed, at least not in general.  There are too 
many places where buffering can happen, and they aren't all easily 
accessible.  I know Apache does some buffering which I haven't been 
able to get around.  In most cases a little extra buffering doesn't 
hurt.  Heck, I sometimes wonder if the browser buffers a bit...

> My focussing on flush() rather than close() was the wrong emphasis.
> The reason I made that mistake was that I assumed that the container
> was not in the business of worrying about http, but that that was the
> responsibility of an adapter sitting between the application proper
> and the container (that the container was in no-parsed-header mode, in
> cgi terms); in that case, the container has no reason not to flush
> when the client toggles the handle, rather than waiting for him to get
> off the pot.  Upon reflection, I think the container *should* parse
> headers by default -- what a bore for every adapter to have to do
> this.

I think it's implied that the container will parse the headers, since 
that's what we're all used to from CGI.  This should consist mostly of 
the Status header, and maybe the Location header.  I've never been 
clear on the Location header, though -- its semantics are very unclear 
to me in the absence of a Status header.  I think with Apache you get 
an internal redirect if you give a path without a host name, and an 
external temporary redirect otherwise.  But in some environments I 
think it always becomes an external redirect, and relative paths are 
resolved to absolute paths before being sent to the client.  That might 
be nice to define.  Or we could strongly encourage applications to use 
only fully qualified redirects, and to always give a Status header 
unless doing a 200 response.
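As a sketch of what that container-side parsing could look like: the 
function below pulls Status out of CGI-style output and picks a default 
when it is missing.  The name parse_cgi_output is made up, it assumes 
bare \n line endings, and treating a Location without a Status as an 
external "302 Found" is just one possible convention, not something 
settled here.  It does not attempt Apache-style internal redirects.

```python
def parse_cgi_output(raw):
    """Split CGI-style output into (status, headers, body).

    Assumes "\n" line endings and a blank line between headers and body.
    """
    head, _, body = raw.partition("\n\n")
    status = None
    headers = []
    for line in head.split("\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if name.lower() == "status":
            status = value                 # consumed by the container
        else:
            headers.append((name, value))
    if status is None:
        # Assumed default: a bare Location means a temporary redirect.
        has_location = any(n.lower() == "location" for n, _ in headers)
        status = "302 Found" if has_location else "200 OK"
    return status, headers, body


# Usage
status, headers, body = parse_cgi_output(
    "Status: 404 Not Found\nContent-Type: text/plain\n\nmissing")
```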

> However, it would be nice if in runCGI you could set an
> attribute of output that would tell it whether to parse headers or
> not; it would seem reasonable to at least aspire to the range of
> functionality already implemented by CGI :).  But I don't know if
> Mr. Eby wants his nice abstract output object mucked up with pesky
> attributes!  (This suggests a T-shirt: "Get your attributes off my
> object!" -- "attributes" being more polite, if less general, than
> "members".)

Yes, I suspect he won't like that addition ;)  I think it would be 
common for the container to add Keep-Alive headers, and maybe some 
others.  In fact, the application should not add those headers; 
that's really something the container needs to abstract.  So the response 
doesn't belong to the application alone.

It would be nice to better specify what kind of parsing will occur, 
what headers might be added, and which are off limits (to application 
or to container).  CGI is poorly specified.
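One way such a rule could be written down, purely as an illustration: 
the container declares which headers it owns, drops any the application 
tried to set, and adds its own.  The CONTAINER_OWNED set and the 
merge_headers name are assumptions of mine; which headers really belong 
on that list is exactly the part that would need specifying.

```python
# Hypothetical list of headers reserved for the container.
CONTAINER_OWNED = frozenset({"connection", "keep-alive"})


def merge_headers(app_headers, container_headers, owned=CONTAINER_OWNED):
    """Drop container-owned headers set by the application, then
    append the container's own headers."""
    kept = [(name, value) for name, value in app_headers
            if name.lower() not in owned]
    return kept + list(container_headers)


# Usage: the application's Connection header is overridden.
final = merge_headers(
    [("Content-Type", "text/html"), ("Connection", "close")],
    [("Connection", "Keep-Alive")],
)
```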

--
Ian Bicking | ianb at colorstudy.com | http://blog.ianbicking.org