[Web-SIG] ngx.poll extension (was Re: Are you going to convert Pylons code into Python 3000?)

Manlio Perillo manlio_perillo at libero.it
Thu Mar 6 22:06:29 CET 2008


Brian Smith wrote:
 >
> [...] 
> 
> That idea doesn't really benefit Manlio's programs. Manlio's program is
> trying to say "use my thread for some other processing until some
> (external) event happens." 

Right.

> We already have standard mechanisms for doing
> something similar in WSGI: multi-threaded and multi-process WSGI
> gateways that let applications block indefinitely while letting other
> applications run. 

Ok, but this is not the best solution to the problem!

> A polling interface like Manlio proposes does help for
> applications that are doing I/O via TCP(-like) protocols. 

This is true only on Windows: on Unix-like systems, poll() works on any 
file descriptor (pipes, Unix domain sockets, and so on), not just TCP 
sockets.

> But, it
> doesn't provide a way to wait for a database query to finish, or for any
> other kind of IPC to complete, unless everything is rebuilt around that
> polling mechanism. 

This is not generally true; as I describe below, arbitrary blocking work 
can be integrated with the polling mechanism using a pipe and a helper 
thread.

> It isn't a general enough interface to become a part
> of WSGI. 

I'm not proposing it to become part of WSGI (since it does not belong 
there), but part of the wsgiorg "namespace", or an official 
asynchronous extensions interface.

> I think it is safe to say that multi-threaded or multi-process
> execution is something that is virtually required for WSGI.
> 

But only if the application is synchronous and heavily I/O-bound.

Note that Nginx is multi-process, but it only executes a fixed number of 
worker processes, so if an I/O request can block for a significant 
amount of time, you cannot afford to let it block.


Moreover, with an asynchronous gateway it is possible to implement a 
"middleware" that can execute an application inside a thread.

This can be done by creating a pipe, starting a new thread, having the 
main thread poll the pipe, and having the worker thread write some data 
to the pipe to "wake" the main thread when it finishes its job.

I'm going to write a sample implementation when I find some time.


Yes, we need to use a thread, but this can be done in pure Python code 
alone (although I'm not sure whether this can have side effects on Nginx).
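
Something along these lines (a rough sketch only: select.poll here 
stands in for the ngx.poll extension, and run_in_thread and the toy job 
are just illustrative names, not an existing API):

    import os
    import select
    import threading

    def run_in_thread(blocking_job):
        # Start blocking_job in a worker thread; return the read end of
        # a pipe that becomes readable when the job has finished, plus a
        # one-element list that will hold the job's result.
        r, w = os.pipe()
        result = []

        def worker():
            result.append(blocking_job())
            os.write(w, "x")        # wake up whoever is polling the pipe
            os.close(w)

        threading.Thread(target=worker).start()
        return r, result

    # The gateway's event loop would register r together with the other
    # file descriptors it is watching; here we simply poll it directly.
    r, result = run_in_thread(lambda: sum(range(10 ** 6)))
    p = select.poll()
    p.register(r, select.POLLIN)
    p.poll()                        # returns when the pipe is readable
    os.read(r, 1)
    os.close(r)
    print(result[0])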


> [...]
 >
> Again, I like the simplification that WSGI 2.0 applications are always
> functions or function-like callables, and never iterables. 

Where is the simplification?

> 
> The trailers feature is something I haven't thought a lot about. Again,
> that is something that CGI doesn't support (I don't think FastCGI
> supports it either). So, that is something that has to also be done in a
> way similar to the above:
> 
>    def application(environ, start_response):
>       headers = [...]
>       trailers = environ.get("wsgi.trailers")
>       if trailers is None:
>           # inefficiently calculate the trailer fields
>           # in advance
>           headers.append(("Header-A", ...))
>           headers.append(("Header-B", ...))
>       ...
>       start_response("200 OK", headers)      
>       ...
>       while ...:
>         if trailers is not None:
>           # calculate trailer fields as we yield
>           # output
>         yield output
> 
>       if trailers is not None:
>           trailers.append(("Header-A", ...))
>           trailers.append(("Header-B", ...))
> 
> It would be nice if the specification for the trailers extension
> specified that the trailers list is included in the WSGI environment if
> and only if (1) we are talking HTTP/1.1, and (2) the gateway and web
> server support trailers.
> 

This is an interesting idea.

Unfortunately, Nginx does not currently support trailing headers, and I 
don't know whether common browsers support them.

 > [...]

>> Now, this doesn't deal with request content and an 
>> alternative to current wsgi.input so that one could do the 
>> non blocking read to get back just what was available, ie. 
>> next chunk, but surely we can come up with solutions for that 
>> as well. Thus I don't see it as impossible to also handle 
>> input chunked content as well. We just need to stop thinking 
>> that what has been proposed for WSGI 2.0 so far is the full 
>> and complete interface.
> 
> We can just say that WSGI-2.0-style applications must support chunked
> request bodies, but gateways are not required to support them.
> WSGI-2.0-style applications would have to check for CONTENT_LENGTH, and
> if that is missing, check to see if environ['HTTP_TRANSFER_ENCODING']
> includes the "chunked" token. wsgi_input.read() would have to stop at
> the end of the request; applications would not be restricted from
> attempting to read more than CONTENT_LENGTH bytes.
> 
> WSGI gateways would have to support an additional (keyword?) argument to
> wsgi.input.read() that controls whether it is blocking or non-blocking.
> It seems pretty simple.
> 

How should an application be written to use this feature? Something 
like the sketch below, perhaps?
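
For example, is something like this what you have in mind? (A sketch 
only: the block keyword argument and the return-value conventions in 
the comments are my assumptions, not part of any existing 
specification.)

    def application(environ, start_response):
        # No CONTENT_LENGTH may mean a chunked request body
        # (if the gateway supports it), as described above.
        chunked = ('CONTENT_LENGTH' not in environ and
                   'chunked' in environ.get('HTTP_TRANSFER_ENCODING', ''))

        wsgi_input = environ['wsgi.input']
        body = []
        while True:
            # block=False is the hypothetical keyword argument; I am
            # assuming it returns None when no data is available yet
            # and '' at the end of the request body.
            data = wsgi_input.read(8192, block=False)
            if data is None:
                # Nothing available yet; an asynchronous gateway would
                # suspend the application here until more data arrives.
                continue
            if data == '':
                break
            body.append(data)

        start_response('200 OK', [('Content-Type', 'text/plain')])
        return ['read %d bytes (chunked body: %s)\n'
                % (len(''.join(body)), chunked)]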

> Notice that all of this can be done even with WSGI 1.0, if these
> additional features were broken out into their own PEP(s). 
> 
> - Brian
> 


Manlio Perillo

