[Web-SIG] PEP 444 / WSGI 2 Async

Fri Jan 7 09:39:28 CET 2011

On 2011-01-06 20:49:57 -0800, P.J. Eby said:
> It would be helpful if you addressed the issue of scope, i.e., 
> what features are you proposing to offer to the application developer.

Conformity, predictability, and portability.  That's a lot of y's.  
(Pardon the pun!)

Alex Grönholm's post describes the goal quite clearly.

> So far, I believe you're the second major proponent (i.e. ones with 
> concrete proposals and/or implementations to discuss) of an async 
> protocol... and what you have in common with the other proponent is 
> that you happen to have written an async server that would benefit from 
> having apps operating asynchronously.  ;-)

Well, the Marrow HTTPd does operate in multi-process mode, and, one 
day, multi-threaded or a combination.  Integration of a futures 
executor to the WSGI environment would alleviate the major need for a 
multi-threaded implementation in the server core; intensive tasks can 
be deferred to a thread pool instead of everything being deferred to 
one.  (E.g. template generation, PDF/other text extraction for indexing 
of file uploads, and image scaling, all of which are real use cases I 
have which would benefit from futures.)

> I find it hard to imagine an app developer wanting to do something 
> asynchronously for which they would not want to use one of the big-dog 
> asynchronous frameworks.  (Especially if their app involves database 
> access, or other communications protocols.)

Admittedly, a truly async server needs some way to allow file 
descriptors to be registered with the reactor core, with the WSGI 
application being resumed upon some event (e.g. socket is readable or 
writeable for DB access, or even pipe operations for use cases I can't 
think of at the moment).
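
To make the idea of resuming on a file descriptor event concrete, here 
is a minimal sketch, not either of the two ideas mentioned below.  It 
assumes the task is a generator that yields a (fileobj, event) pair and 
is resumed once that event fires.  The names run() and reader(), and the 
use of the standard library selectors module as the reactor, are 
illustrative assumptions only.

```python
import selectors
import socket


def run(task):
    """Drive one generator that yields (fileobj, event) pairs until it ends."""
    sel = selectors.DefaultSelector()
    try:
        request = next(task)
        while True:
            fileobj, event = request
            sel.register(fileobj, event)
            sel.select()  # block until the registered event is ready
            sel.unregister(fileobj)
            request = task.send(None)  # resume the suspended task
    except StopIteration as stop:
        return stop.value
    finally:
        sel.close()


def reader(sock):
    """Hypothetical task: suspend until sock is readable, then read it."""
    yield sock, selectors.EVENT_READ
    return sock.recv(1024)
```

A real reactor would multiplex many such tasks over one selector; this 
loop handles a single task to keep the suspend/resume step visible.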

Futures integration is a Good Idea, IMHO, and being optional and easily 
added to the environ by middleware for servers that don't implement it 
natively is even better.

As for how to provide a generic interface to an async core, I have two 
ideas, but one is magical and the other is more so; I'll describe these 
in a discrete post.

> This doesn't mean I think having a futures API is a bad thing, but ISTM 
> that a futures extension to WSGI 1 could be defined right now using an 
> x-wsgi-org extension in that case...  and you could then find out how 
> many people are actually interested in using it.

I'll add writing up a WSGI middleware layer that configures and adds a 
future.executor to the environ to my already overweight to-do list.  It 
actually is something I have a use for right now on at least one 
commercial project.  :)
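
A minimal sketch of what such a middleware layer might look like, 
assuming the concurrent.futures API.  The environ key name 
(EXECUTOR_KEY) is hypothetical, since no name has been agreed on, and 
ExecutorMiddleware and demo_app are illustrative names only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical environ key; no name has been standardised.
EXECUTOR_KEY = "x-wsgi-org.futures.executor"


class ExecutorMiddleware:
    """Add a shared executor to the environ unless the server already did."""

    def __init__(self, app, executor=None, max_workers=4):
        self.app = app
        self.executor = executor or ThreadPoolExecutor(max_workers=max_workers)

    def __call__(self, environ, start_response):
        # setdefault leaves a server-provided executor untouched.
        environ.setdefault(EXECUTOR_KEY, self.executor)
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Example application that hands one task to the executor."""
    executor = environ[EXECUTOR_KEY]
    future = executor.submit(sum, range(1000))  # stand-in for real work
    body = str(future.result()).encode("ascii")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

Wrapping is the usual WSGI pattern: application = 
ExecutorMiddleware(demo_app).  Because the key is only set when absent, 
the same application runs unchanged on a server that provides its own 
executor.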

> Mainly, though, what I see is people using the futures thing to shuffle 
> off compute-intensive tasks...

That's what it's for.  ;)

> ...but if they do that, then they're basically trying to make the 
> server's life easier...  but under the existing spec, any truly async 
> server implementing WSGI is going to run the *app* in a "future" of 
> some sort already...

Running the application in a future is actually not a half-bad way for 
me to add threading to marrow.server... thanks!
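
As a rough sketch of that idea (not marrow.server's actual code), a 
server could submit each WSGI 1 application call to an executor and 
pick up the buffered response in a callback.  call_app(), dispatch() and 
hello_app() are hypothetical names, and buffering the whole body is a 
simplification.

```python
from concurrent.futures import ThreadPoolExecutor


def call_app(app, environ):
    """Run a WSGI 1 app to completion, buffering its whole response."""
    response = {"status": None, "headers": None}
    chunks = []

    def start_response(status, headers, exc_info=None):
        response["status"] = status
        response["headers"] = headers
        return chunks.append  # legacy write() callable

    result = app(environ, start_response)
    try:
        chunks.extend(result)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    response["body"] = b"".join(chunks)
    return response


def dispatch(executor, app, environ, on_done):
    """Submit the app call; on_done(future) fires once the response is ready."""
    future = executor.submit(call_app, app, environ)
    future.add_done_callback(on_done)
    return future


def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

The done callback runs in the worker thread, so a real reactor would 
have it schedule the socket write back on the I/O loop rather than 
write directly.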

> Which means that the net result is that putting in async is like saying 
> to the app developer: "hey, you know this thing that you just could do 
> in WSGI 1 and the server would take care of it for you?  Well, now you 
> can manage that complexity by yourself!  Isn't that wonderful?"   ;-)

That's a bit extreme; PEP 444 servers may still implement threading, 
multi-processing, etc. at the reactor level (a la CherryPy or Paste).  
Giving WSGI applications access to a futures executor (possibly the one 
powering the main processing threads) simply gives applications the 
ability to utilize it, not the requirement to do so.

> I could be wrong of course, but I'd like to see what concrete use cases 
> people have for async.

Earlier in this post I illustrated a few that directly apply to a 
commercial application I am currently writing.  I'll elaborate:

:: Image scaling would benefit from multi-processing (spreading the 
load across cores). Also, only one scale is immediately required before 
returning the post-upload page: the thumbnail.  The other scales can be 
executed without halting the WSGI application's return.

:: Asset content extraction and indexing would benefit from threading, 
and would also not require pausing the WSGI application.

:: Since most templating engines aren't streaming (see my unanswered 
thread in the general mailing list re: this), pausing the application 
pending a particularly difficult render is a boon to single-threaded 
async servers, though true streaming templating (with flush semantics) 
would be the holy grail.  ;)

:: Long-duration calls to non-async-aware libraries such as DB access.  
The WSGI application could queue up a number of long DB queries, pass 
the futures instances to the template, and the template could then 
.result() (block) across them or yield them to be suspended and resumed 
when the result is available.

:: True async is useful for WebSockets, which seem a far superior 
solution to JSON/AJAX polling in addition to allowing real web-based 
socket access, of course.
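
Two of the cases above, the thumbnail-first image scaling and the 
queued DB queries, can be sketched with concurrent.futures as follows.  
The scale() and slow_query() functions are stand-ins for real image and 
database libraries, and handle_upload(), render() and view() are 
hypothetical names.  Only the blocking .result() path is shown; 
yielding a future to be suspended and resumed depends on a server 
protocol that is not defined here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scale(image, size):
    """Stand-in for a real resize; returns a label instead of pixel data."""
    return "%s@%dx%d" % (image, size[0], size[1])


def handle_upload(executor, image, sizes=((320, 240), (1024, 768))):
    """Wait only for the thumbnail; leave the other scales running."""
    thumbnail = executor.submit(scale, image, (64, 64))
    pending = [executor.submit(scale, image, size) for size in sizes]
    return thumbnail.result(), pending


def slow_query(name, delay=0.05):
    """Stand-in for a blocking call into a non-async-aware DB library."""
    time.sleep(delay)
    return "rows for " + name


def render(context):
    """Template stand-in: each .result() blocks until that query is done."""
    lines = ["%s: %s" % (key, context[key].result()) for key in sorted(context)]
    return "\n".join(lines)


def view(executor):
    # Both queries start before the template asks for either result.
    context = {
        "users": executor.submit(slow_query, "users"),
        "orders": executor.submit(slow_query, "orders"),
    }
    return render(context)
```

ProcessPoolExecutor exposes the same submit() interface, so it could be 
passed in for the CPU-bound scaling while a thread pool handles the 
blocking queries.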

> We dropped the first discussion of async six years ago because someone 
> (I think it might've been James) pointed out that, well, it isn't 
> actually that useful.  And every subsequent call for use cases since 
> has been answered with, "well, the use case is that you want it to be 
> async."

See the above.  ;)

	- Alice.