From electronixtar at gmail.com Sat Apr 27 06:36:29 2013
From: electronixtar at gmail.com (est)
Date: Sat, 27 Apr 2013 12:36:29 +0800
Subject: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with asynchronous handlers/servers
In-Reply-To:
References:
Message-ID:

Hi,

Newbie opinion here.

Since we are talking about Tulip and PEP 3156, I think it's high time we address some of the design flaws in WSGI 1.0.

One major problem with WSGI is that it cannot handle true post-response hooks.

The closest hack I found is this:
https://modwsgi.readthedocs.org/en/latest/developer-guides/registering-cleanup-code.html

As discussed by Graham Dumpleton here:
https://groups.google.com/group/modwsgi/msg/d699a09b3b11b313

Although the response has been returned to the client, it will still hold the HTTP connection open until __callback finishes.

Yet a post-response hook is a pretty common design pattern in the modern Web world. I can think of a few uses:

- A user uploads a file, the response is HTML saying "Upload OK", then the Web worker continues to transfer the file to Amazon S3, which is slow and takes some time.
- After a series of user interactions on a web page, use the existing db connection to write OLAP logs for later analysis.
- Notify another ZMQ/XMPP connection about the HTTP request.

Currently, Celery is extremely popular (at least in Django or other non-async web frameworks). But IMHO it's too heavyweight, and copying Python data & objects from a cluster of Web workers to another cluster of task queue workers is not worth it.

Another problem is the good old CGI environ design. I can't help but ask: why?

Every HTTP header is transferred via environ, upper-cased and given an HTTP_ prefix, e.g. HTTP_HOST. There's some serious information loss here:

1. Actual header string case
2. Header order

Since WSGI is a higher-level framework, I think it's time for us to deliver the original header status in a SortedDict.
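
To make the information loss concrete, here is a minimal Python sketch of the CGI-style mapping applied to request headers. The function name headers_to_environ and the sample headers are made up for illustration, and folding repeated headers into one comma-separated value is an assumption modelled on the CGI convention, so treat this as a sketch rather than any particular server's code.

```python
def headers_to_environ(headers):
    """Map (name, value) header pairs to CGI-style environ keys (illustrative helper)."""
    environ = {}
    for name, value in headers:
        key = name.upper().replace("-", "_")
        # Content-Type and Content-Length get no HTTP_ prefix (CGI convention).
        if key not in ("CONTENT_TYPE", "CONTENT_LENGTH"):
            key = "HTTP_" + key
        if key in environ:
            # Assumption: repeated headers are folded into one comma-separated value.
            environ[key] = environ[key] + "," + value
        else:
            environ[key] = value
    return environ


# Hypothetical request headers, in the order and case the client sent them.
raw = [
    ("host", "example.com"),
    ("X-Trace", "a"),
    ("Accept", "text/html"),
    ("x-trace", "b"),
]
environ = headers_to_environ(raw)
```

After the mapping, "X-Trace" and "x-trace" share the single key HTTP_X_TRACE, so neither the original spelling nor the position of the second occurrence relative to "Accept" can be recovered from environ.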

Again, as newbie advice, we should take this chance to integrate PEP 3156 with a dead simple WSGI 3.0 design:

def application(request):
    ip = request.remote_ip
    length = request.headers["Content-Length"]
    request.write("done.")
    request.close()
    db.log(length)  # some post-response actions.

On Mon, Mar 25, 2013 at 9:08 AM, Guido van Rossum wrote:

> Hi Luca,
>
> Unfortunately I haven't thought yet about the interactions between WSGI
> and Tulip or PEP 3156. While I am pretty familiar with WSGI, I have never
> used its async features, so I can't be much of a help. My best guess is
> that we won't make any changes to WSGI to support PEP 3156 in Python 3.4,
> but that once that is out, some folks will come up with an improved design
> for WSGI that supports interoperability with standard async event loops.
> OTOH, maybe you can read up on the PEP and check out the Tulip
> implementation (http://code.google.com/p/tulip/) and maybe you can come
> up with a suitable design for integrating PEP 3156 into WSGI? Though it may
> have to be named WSGI 2.0 to emphasize that it is backwards incompatible.
>
> --Guido
>
>
>
> On Sun, Mar 24, 2013 at 2:18 PM, Luca Sbardella wrote:
>
>> Hello,
>>
>> first time here, I'm Luca and I write lots of python of the asynchronous
>> variety.
>> This question is about wsgi and the way pulsar
>> http://quantmind.github.com/pulsar/ handles asynchronous wsgi responses.
>>
>> Yesterday I sent a message to the python-dev mailing list regarding
>> wsgiref.validator, this is the original message
>>
>> I have an asynchronous wsgi application handler which yields empty bytes
>> before it is ready to yield the response body and, importantly, to call
>> start_response.
>>
>> Something like this:
>>
>> def wsgi_handler(environ, start_response):
>>     body = generate_body(environ)
>>     body = maybe_async(body)
>>     while is_async(body):
>>         yield b''
>>     start_response(...)
>>     ...
>>
>> I started using wsgiref.validator recently, nice little gem in the
>> standard lib, and I discovered that the above handler does not validate!
>> Disaster.
>> Reading pep 3333
>>
>> "the application *must* invoke the start_response() callable before the
>> iterable yields its first body bytestring, so that the server can send the
>> headers before any body content. However, this invocation *may* be
>> performed by the iterable's first iteration, so servers *must not* assume
>> that start_response() has been called before they begin iterating over
>> the iterable."
>>
>> The pseudocode above does yields bytes before start_response, but they
>> are not *body* bytes, they are empty bytes so that the asynchronous wsgi
>> server releases the eventloop and call back at the next eventloop iteration.
>>
>>
>> And the response was
>>
>>
>> >PJ Eby wrote:
>>> >> The validator is correct for the spec. You *must* call
>>> >> start_response() before yielding any strings at all.
>>> >
>>> >
>>> > Thanks for response PJ,
>>> > that is what I, unfortunately, didn't want to hear, the validator being
>>> > correct for the "spec" means I can't use it for my asynchronous stuff, which
>>> > is a shame :-(((
>>> > But why commit to send headers when you may not know about your response?
>>> > Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
>>> > along.
>>>
>>> Because async was added as an afterthought to WSGI about nine years
>>> ago, and we didn't get it right, but it long ago was too late to do
>>> anything about it. A properly async WSGI implementation will probably
>>> have to wait for Tulip (Guido's project to bring a standard async
>>> programming API to Python).
>>>
>>
>> and so here I am.
>> I know tulip is on its early stages but is there anything on the pipeline
>> about wsgi?
>> Happy to help if needed.
>>
>> Regards
>> Luca
>>
>
> --
> --Guido van Rossum (python.org/~guido)
>
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe:
> http://mail.python.org/mailman/options/web-sig/electronixtar%40gmail.com

From graham.dumpleton at gmail.com Sat Apr 27 07:24:33 2013
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sat, 27 Apr 2013 15:24:33 +1000
Subject: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with asynchronous handlers/servers
In-Reply-To:
References:
Message-ID: <588712B5-E980-421E-90E4-6D4FF6B20000@gmail.com>

I described a different way of doing WSGI which would better cope with post response hooks at the Python Web Summit at PyCon in 2012. It made use of the context manager abstraction so it wouldn't screw with the returned iterable.

http://www.slideshare.net/GrahamDumpleton/pycon-us-2012-state-of-wsgi-2-14808297

Graham

On 27/04/2013, at 2:36 PM, est wrote:

> [...]

From pje at telecommunity.com Sun Apr 28 00:21:31 2013
From: pje at telecommunity.com (PJ Eby)
Date: Sat, 27 Apr 2013 18:21:31 -0400
Subject: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with asynchronous handlers/servers
In-Reply-To: <588712B5-E980-421E-90E4-6D4FF6B20000@gmail.com>
References: <588712B5-E980-421E-90E4-6D4FF6B20000@gmail.com>
Message-ID:

On Sat, Apr 27, 2013 at 1:24 AM, Graham Dumpleton wrote:
> I described a different way of doing WSGI which would better cope with post
> response hooks at the Python Web Summit at PyCon in 2012. It made use of the
> context manager abstraction so it wouldn't screw with the returned iterable.
>
> http://www.slideshare.net/GrahamDumpleton/pycon-us-2012-state-of-wsgi-2-14808297

Also, wsgi_lite provides a way of registering resources to be closed post-response, that works within WSGI 1.0, also without altering the returned iterable:

https://bitbucket.org/pje/wsgi_lite#close-and-resource-cleanups

Although wsgi_lite provides programmatic support for this, it's internally implemented as a stock WSGI extension key ('wsgi_lite.closing') in the environ, and can be offered today by servers or middleware in a 1.0 environment. I just haven't gotten around to knocking out a PEP for it.
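
For readers who want to see the general shape of an environ-based cleanup hook in a WSGI 1.0 setting, here is a rough Python sketch. It is not wsgi_lite's code: the key name 'example.closing', the class _ClosingIterable and the function cleanup_middleware are hypothetical, and unlike the approach described above this middleware does wrap the response iterable, relying only on the PEP 3333 rule that the server calls close() on the iterable once the request is finished.

```python
class _ClosingIterable:
    """Wrap a WSGI response iterable and run callbacks when close() is called."""

    def __init__(self, iterable, callbacks):
        self._iterable = iterable
        self._callbacks = callbacks

    def __iter__(self):
        return iter(self._iterable)

    def close(self):
        try:
            # Honour the wrapped iterable's own close(), if it has one.
            inner_close = getattr(self._iterable, "close", None)
            if inner_close is not None:
                inner_close()
        finally:
            # Run the registered post-response callbacks in registration order.
            for callback in self._callbacks:
                callback()


def cleanup_middleware(app, key="example.closing"):
    """Expose a hypothetical environ key that registers cleanup callbacks."""
    def wrapper(environ, start_response):
        callbacks = []
        environ[key] = callbacks.append
        return _ClosingIterable(app(environ, start_response), callbacks)
    return wrapper


log = []


def app(environ, start_response):
    # Register work to run after the body has been handed to the server.
    environ["example.closing"](lambda: log.append("cleanup"))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done."]


# Drive the application by hand, the way a server would.
result = cleanup_middleware(app)({}, lambda status, headers: None)
body = b"".join(result)
result.close()
```

Whether the client connection is already released when close() runs depends on the server, which is the limitation raised at the start of this thread.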