[Web-SIG] ngx.poll extension (was Re: Are you going to convert Pylons code into Python 3000?)

Graham Dumpleton graham.dumpleton at gmail.com
Thu Mar 6 18:16:23 CET 2008


On 07/03/2008, Brian Smith <brian at briansmith.org> wrote:
> Manlio Perillo wrote:
>  > Brian Smith ha scritto:
>  > > Manlio Perillo wrote:
>  > >> Fine with me but there is a *big* problem.
>  > >>
>  > >> WSGI 2.0 "breaks" support for asynchronous applications (since you
>  > >> can no more send headers in the app iter).
>  > >
>  > > WSGI 1.0 doesn't guarantee that all asynchronous applications will
>  > > work either, because it allows the WSGI gateway to wait for
>  > and buffer
>  > > all the input from the client before even calling the
>  > application callable.
>  > > And, it doesn't provide a way to read an indefinite stream of input
>  > > from the client, which is also problematic.
>  > >
>  > > Anyway, please post a small example of a program that fails to work
>  > > because of these proposed changes for WSGI 2.0.
>  > >
>  > > Thanks,
>  > > Brian
>  > >
>  >
>  >
>  > Attached are two working examples (I have not committed
>  > them yet, because I'm still testing - there are some problems
>  > that I need to solve).
>
>
> I looked at your examples and now I understand better what you are
>  trying to do. I think what you are trying to do is reasonable but it
>  isn't something that is supported even by WSGI 1.0. It happens to work
>  efficiently for your particular gateway, but that isn't what WSGI is
>  about. In fact, any WSGI application that doesn't run correctly with an
>  arbitrary WSGI gateway (assuming no bugs in any gateway) isn't a WSGI
>  application at all.
>
>  It seems that the problem with your examples is not that they won't work
>  with WSGI 2.0. Rather, the problem is that the applications block too
>  long. The application will still work correctly, but will not be
>  efficient when run in nginx's mod_wsgi. However, that isn't a problem
>  with the specification or with the application; it is a problem with
>  nginx's mod_wsgi. I hate reading about the "Pythonic way" of doing
>  things, but writing a WSGI application so that it doesn't block too much
>  or too long is simply not Pythonic. The WSGI gateway needs to abstract
>  away those concerns so that they aren't an issue. Otherwise, the gateway
>  will only be useful for specialized applications designed to run well on
>  that particular gateway. Such specialized applications might as well use
>  specialized (gateway-specific) APIs, if they have to be designed
>  specifically for a particular gateway anyway.
>
>  Further, it is impossible to write a good HTTP proxy with WSGI. The
>  control over threading, blocking, I/O, and buffer management is just not
>  there in WSGI. In order to support efficient implementations of such
>  things, WSGI would have to become so low-level that it would become
>  pointless--it would be exposing an interface that is so low-level that
>  it wouldn't even be cross-platform. It wouldn't abstract away anything.
>
>  At the same time, the current WSGI 2.0 proposal abstracts too much. It
>  is good for applications that are written directly on top of the
>  gateway, and for simple middleware. But, it is not appropriate for a
>  serious framework to be built on. It is wrong to think that the same
>  interface is suitable for frameworks, middleware developers, and
>  application developers. I would rather see WSGI 2.0 become a much
>  lower-level framework that works at the buffer level (not strings), with
>  the ability to do non-blocking reads from wsgi.input, and the ability to
>  let the WSGI gateway do buffering in a sane and efficient manner
>  (there's no reason for the application to do a bunch of string joins
>  when the gateway could just send all the pieces in a single writev()).
>  Some control over blocking, HTTP chunked encoding, etc. could be
>  included as well. The current suggestions for WSGI 2.0 would then just
>  be a sample framework layered on top of this low-level interface, for
>  developers that don't want to use a big framework like Django or Pylons.
>  But, the big frameworks and middleware would use the low-level interface
>  to run efficiently.
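
(Purely for illustration, a buffer-level interface of the kind Brian
describes might look roughly like the sketch below. Every name in it is
made up; nothing like this exists in any WSGI specification or proposal.)

  # Hypothetical sketch only: a lower-level, buffer-oriented interface
  # where the application hands individual buffers to the gateway and
  # can read whatever request data is currently available without
  # blocking.  All names here are invented for illustration.

  def low_level_app(request):
      # Non-blocking read: returns whatever bytes are available right
      # now, or '' if nothing has arrived yet (it never blocks).
      chunk = request.input.read_available()

      request.set_status('200 OK')
      request.add_header('Content-Type', 'text/plain')

      # Hand over individual buffers; the gateway is free to coalesce
      # them into a single writev() instead of the application joining
      # strings itself.
      request.write_buffer('You sent %d bytes so far\r\n' % len(chunk))
      request.write_buffer('Hello, world\r\n')
      request.finish()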

In part adding to what Brian is saying: you (Manlio) speak as if WSGI
2.0 were already set in stone, and as if, because you cannot do what you
want with it, it is no good and we should keep the WSGI 1.0 way of doing
things.

Just as Brian is starting to think about what else WSGI 2.0 could be so
as to allow other ways of doing things, why don't you do the same and
think about how you could achieve what you want in a style similar to
WSGI 2.0, adapting the WSGI 2.0 interface in some way? If the changes
make sense and don't deviate too far from where we have been going,
people might well accept them.

The following idea may not make much sense, but the baby is keeping me
up, it's 4am, and I am probably not going to get back to sleep until I
get it out of my head.

Anyway, WSGI 2.0 currently talks about returning a single tuple
containing the status, the headers and an iterable. What if it
optionally allowed the response itself to be an iterable, so that
you could do:

  yield ('102 Processing', [], None)
  ...
  yield ('102 Processing', [], None)
  ...
  yield ('200 OK', [...], [...])

I'll admit that I am not totally across what the HTTP 102 (Processing)
status code is meant to be used for, and am presuming that this might
make sense. I am sure, though, that Brian, who understands this level
better than I do, will set me straight.

That said, could returning 102 like this give the same result as what
you are currently doing by yielding empty strings before the headers
have been set up?
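
To make that concrete, an application under this scheme might end up
looking something like the sketch below. It is only a sketch of the
idea; poll_database() is a made-up placeholder for whatever non-blocking
check the application would really perform.

  def application(environ):
      # Generator application: yield interim '102 Processing' tuples
      # while waiting, then the final response as the last tuple.
      while not poll_database():
          # Nothing to send yet; give control back to the server so it
          # can service other connections, much as yielding an empty
          # string does today.
          yield ('102 Processing', [], None)

      body = 'result of the query\r\n'
      yield ('200 OK',
             [('Content-Type', 'text/plain'),
              ('Content-Length', str(len(body)))],
             [body])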

Going a bit further with this, would it make sense for an application
to also be able to return a 100, to make the server layer tell the
client to start sending data when an 'Expect: 100-continue' request
header has been sent?
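
That is, again just sketching the idea (nothing here is part of any
existing proposal):

  def application(environ):
      # If the client sent 'Expect: 100-continue', have the server send
      # an interim 100 so the client starts transmitting the body.
      if environ.get('HTTP_EXPECT', '').lower() == '100-continue':
          yield ('100 Continue', [], None)

      body = 'received\r\n'
      yield ('200 OK',
             [('Content-Type', 'text/plain'),
              ('Content-Length', str(len(body)))],
             [body])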

Could it also be used in some way to allow better control over output
chunking by allowing:

  yield ('200 OK', [...], [...])
  ...
  yield (None, None, [...])

In other words, the application could effectively yield multiple
iterables making up the actual response content.

Not that all HTTP servers support it, but could this also be a way of
allowing an application which uses output chunking to specify trailer
headers to be sent after the last response content chunk?

  yield ('200 OK', [...], [...])
  ...
  yield (None, None, [...])
  ...
  yield (None, [...], None)

The important thing, though, is that I am not suggesting this be the
default way of producing responses, but rather an optionally available
lower-level layer for doing so. An application could still just return
a single tuple as per WSGI 2.0 now. A good server adapter might
optionally also provide this lower-level interface, which gives a
better measure of control. Support for the lower-level interface could
be optional, with a key in the WSGI environment used to indicate
whether the server supports it, along the lines of the sketch below.
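
For example (the 'wsgix.streaming_response' key is made up here, purely
to illustrate the kind of flag I mean):

  def application(environ):
      body = 'hello\r\n'
      headers = [('Content-Type', 'text/plain'),
                 ('Content-Length', str(len(body)))]

      if environ.get('wsgix.streaming_response'):
          # Server supports the lower-level layer: the response can be
          # an iterable of (status, headers, iterable) tuples.
          def stream():
              yield ('102 Processing', [], None)
              yield ('200 OK', headers, [body])
          return stream()

      # Otherwise fall back to the plain WSGI 2.0 style: one tuple.
      return ('200 OK', headers, [body])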

Now, this doesn't deal with request content, nor with an alternative to
the current wsgi.input that would let one do a non-blocking read to get
back just what is available, i.e. the next chunk, but surely we can
come up with solutions for that as well. So I don't see it as
impossible to also handle chunked request content. We just need to stop
thinking that what has been proposed for WSGI 2.0 so far is the full
and complete interface.
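
Purely as a straw man, the input side might look something like this;
read_chunk() is invented and does not exist in any server today.

  def application(environ):
      received = 0
      while True:
          # Hypothetical non-blocking read: return whatever request
          # data is available (the next chunk), '' if nothing has
          # arrived yet, or None once the request body is complete.
          chunk = environ['wsgi.input'].read_chunk()
          if chunk is None:
              break
          if not chunk:
              # Nothing available yet; yield an interim response so the
              # server can get on with other connections meanwhile.
              yield ('102 Processing', [], None)
              continue
          received += len(chunk)

      body = 'read %d bytes\r\n' % received
      yield ('200 OK',
             [('Content-Type', 'text/plain'),
              ('Content-Length', str(len(body)))],
             [body])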

Okay, I feel I can go back to sleep now. You can all start laughing if
this insomnia-driven idea is plain stupid. :-)

Graham

