From pje at telecommunity.com  Tue Mar 19 18:47:03 2013
From: pje at telecommunity.com (PJ Eby)
Date: Tue, 19 Mar 2013 13:47:03 -0400
Subject: [Web-SIG] WSGI Lite
In-Reply-To: <6983615E-47DB-492B-83BE-634E4A93E79A@me.com>
References: <201303181055.r2IAtti6008323@vision.dirtsimple.org>
 <6983615E-47DB-492B-83BE-634E4A93E79A@me.com>
Message-ID:

On Mon, Mar 18, 2013 at 1:08 PM, Simon Yarde wrote:
> If I understand correctly, one of the goals of WSGI Lite is to keep the
> environ out of middleware and app code to prevent modification.

No.  You are *allowed* to modify the environment, that's part of the
WSGI spec.  What you *can't* do is trust that nobody *else* will
modify it.  Which is why you can't use the environment to communicate
with middleware, only objects passed along in the environment.

For example, if middleware does env['foobar']=[], it's okay for a
called app to modify that list, as long as the middleware saves a
reference to it and doesn't try to pull it out of the environ later,
e.g.:

    def middleware(...):
        my_list = []
        env['foobar'] = my_list
        app(env)
        if my_list:
            ....blah

But you must NEVER do this:

    def middleware(...):
        app(env)
        if env['foobar']:
            ....blah

Even if you were the one who put 'foobar' into the env.

WSGI Lite's argument binding protocol addresses a different problem,
which is that after you call app(env), it's too late for you to get
anything you need out of the original environment, because app() was
within its rights to clear env or change its contents in any way.

(It also makes it easier to create application-specific or
framework-specific calling conventions, like if you want your
controller functions to be called with a user and a cart as
parameters, as long as you can define how to get a user and a cart
from a WSGI environment.)

> Could this particular goal be achieved by the middleware creating
> references to the environ values it depends upon prior to calling the app?
Yes, you can use that to communicate up the middleware chain, as I
showed above.  But the problem WSGI Lite bindings solve is not really
related to that.

> Regarding my own scenario..
>
> I have been working on my own small framework, partly as a learning
> exercise and partly out of frustrations with existing frameworks.
>
> I return 3-tuple responses as per WSGI Lite, and my middleware generates
> minimal error responses rather than raise exceptions, e.g.
> ('404 Not Found', [('Content-Length', '0')], []).
>
> I use status-handler decorators to customise these basic responses; these
> operate at the level of the individual apps or across a dispatch app to
> provide global response customising.
>
> I use a 'final' flag in the environ to indicate to any outer
> status-handlers that the response is intended as definitive, i.e.
> environ['my_framework.status_handler.final'] = True, and should not be
> altered again.

Don't do that.  You need to put a callback in the environ, or a
mutable object that you keep a reference to.  The environ dictionary
itself is strictly passed down to child handlers, and it's perfectly
valid for a piece of middleware to clear the environ entirely before
returning to its caller.  So you can't use raw values in the environ
to communicate up the chain, only down.

Of course, even if you use another method to communicate this "final"
flag, that doesn't necessarily mean that your "final" flag makes any
sense in a WSGI context.  It might actually be that you need to use a
custom header like 'X-MyFramework-Final', that your boundary
middleware strips.

That is probably actually the right way to do it in WSGI, because
there's no guarantee a "final" response returned from one app is
actually going to be the same response as one returned from another
piece of middleware wrapping that.  So really, since this is
information about the response, you should put it in a response
header.
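
The response-header approach described above might look roughly like the following. This is only a sketch under stated assumptions: the names strip_final_header and inner_app are hypothetical, the code uses plain WSGI 1 calling conventions rather than the WSGI Lite 3-tuple form, and only the header name 'X-MyFramework-Final' comes from the message itself.

```python
FINAL_HEADER = "X-MyFramework-Final"  # header name taken from the message above


def strip_final_header(app):
    """Hypothetical boundary middleware: drop the framework-private
    header before the response leaves the framework."""
    def wrapper(environ, start_response):
        def filtered_start_response(status, headers, exc_info=None):
            # Header names are case-insensitive, so compare lowercased.
            public = [(name, value) for name, value in headers
                      if name.lower() != FINAL_HEADER.lower()]
            return start_response(status, public, exc_info)
        return app(environ, filtered_start_response)
    return wrapper


def inner_app(environ, start_response):
    # Hypothetical app: marks its response as definitive with a response
    # header instead of writing a flag into the environ.
    start_response("404 Not Found",
                   [("Content-Length", "0"), (FINAL_HEADER, "1")])
    return []
```

Status handlers inside the framework would look for the header in the response they receive, and only the outermost wrapper would remove it, so nothing depends on what inner code did to the environ.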
(Perhaps in a future version of WSGI Lite, I could extend the response
protocol to support stripping out some special response headers at the
protocol boundary between WSGI 1 and WSGI Lite.)

> Under WSGI Lite, would I still be able to add flags to the environ? Or is
> there some other way I should be signalling to outer middleware?

Callbacks and mutable objects are the only way authorized by the WSGI
spec to communicate from an app to an outer server or middleware
(aside from response headers, bodies, or special response iterators),
and WSGI Lite doesn't change this.  It just makes it easier to work
with values obtained from the environment.

From pje at telecommunity.com  Fri Mar 22 20:02:00 2013
From: pje at telecommunity.com (PJ Eby)
Date: Fri, 22 Mar 2013 15:02:00 -0400
Subject: [Web-SIG] WSGI Lite
In-Reply-To:
References: <201303181055.r2IAtti6008323@vision.dirtsimple.org>
 <6983615E-47DB-492B-83BE-634E4A93E79A@me.com>
Message-ID:

[Please follow-up to web-sig, rather than emailing me privately.  Thanks.]

On Fri, Mar 22, 2013 at 6:52 AM, Simon Yarde wrote:
> If I have two layers of middleware, can I trust the intermediate layer has
> not altered or wiped out any dotted-name keys in the environ prior to
> calling the layer below?

No; middleware isn't even required to pass the same environ object to
a nested app.  But most middleware will, and won't delete things in
it.

> I discovered your article on WSGI Lite as I was having similar ideas about
> binding for the purposes you describe; auth, session, cart etc. and keeping
> the app code minimal.  I was aiming to find a way to allow objects
> initialised with an environ to interrupt execution at initialisation time
> and return a response.
> I felt something like this was needed to avoid doing
> this at the top of every app:
>
>     def app(input=Parse(XML, JSON),
>             auth=Auth):
>         if not input:
>             return NotAcceptableResponse
>         if not auth:
>             return UnAuthorisedResponse
>
> I was playing around with a factory-method approach that would either return
> an instance or a 3-tuple response; the binding process would identify the
> result as an instance and add it to the calling args, or interrupt the call
> and return it.

Raising an exception to be caught would probably be preferable.  It
might be that the Lite protocol should add a way to convert an
exception to a response, e.g. by looking for a __wsgi_response__
attribute on it.  That way, you could raise anything you wanted as a
default response, and the error would convert to a response at the
WSGI 1/Lite boundary.

This would mostly be suitable for app-specific errors, but of course
you could put error-handling middleware anywhere in the stack below
the boundary, or formatting middleware above the boundary.

So, thus far, possible extensions to the Lite protocol would be:

* Exception to response conversion
* A standard for stripping custom HTTP headers

I don't have any idea as to when I might get around to these, but if
somebody wants to create a model patch or two, that'd be cool.  ;-)

The protocol of course is looking less and less "lite" with these
additions, but I suppose if one looks at "lite" as actually being a
collection of "microprotocols" it's not bad at all.  Everything in
Lite is essentially orthogonal right now (and would continue to be
with these additions), so it's nothing like the obscenity of
complexity and legacies that is WSGI's core.
;-)

From pje at telecommunity.com  Sun Mar 24 06:14:10 2013
From: pje at telecommunity.com (PJ Eby)
Date: Sun, 24 Mar 2013 01:14:10 -0400
Subject: [Web-SIG] [Python-Dev] wsgi validator with asynchronous
 handlers/servers
In-Reply-To:
References:
Message-ID:

On Sat, Mar 23, 2013 at 7:30 PM, Luca Sbardella wrote:
>PJ Eby wrote:
>> The validator is correct for the spec.  You *must* call
>> start_response() before yielding any strings at all.
>
>
> Thanks for response PJ,
> that is what I, unfortunately, didn't want to hear, the validator being
> correct for the "spec" means I can't use it for my asynchronous stuff, which
> is a shame :-(((
> But why commit to send headers when you may not know about your response?
> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
> along.

Because async was added as an afterthought to WSGI about nine years
ago, and we didn't get it right, but it long ago was too late to do
anything about it.  A properly async WSGI implementation will probably
have to wait for Tulip (Guido's project to bring a standard async
programming API to Python).

From guido at python.org  Mon Mar 25 02:08:04 2013
From: guido at python.org (Guido van Rossum)
Date: Sun, 24 Mar 2013 18:08:04 -0700
Subject: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with
 asynchronous handlers/servers
In-Reply-To:
References:
Message-ID:

Hi Luca,

Unfortunately I haven't thought yet about the interactions between WSGI and
Tulip or PEP 3156. While I am pretty familiar with WSGI, I have never used
its async features, so I can't be much of a help. My best guess is that we
won't make any changes to WSGI to support PEP 3156 in Python 3.4, but that
once that is out, some folks will come up with an improved design for WSGI
that supports interoperability with standard async event loops.
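
The rule this thread turns on (start_response() must be called before the iterable yields any bytestring, even an empty one) can be reproduced with the standard library's wsgiref.validate. The following is a minimal sketch, not a test of any real server: late_start, early_start and run are hypothetical names, and the stub passed as start_response stands in for a server.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator


def late_start(environ, start_response):
    # Yields an empty bytestring before start_response(), like the
    # asynchronous handler discussed in this thread.
    yield b""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"hello"


def early_start(environ, start_response):
    # Calls start_response() before the first yield, as the spec requires.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b""
    yield b"hello"


def run(app):
    """Drive a validator-wrapped app with a stub server.

    Returns the joined body, or the AssertionError raised by the validator.
    """
    environ = {"QUERY_STRING": ""}
    setup_testing_defaults(environ)
    result = validator(app)(environ, lambda status, headers, exc_info=None: None)
    try:
        return b"".join(result)
    except AssertionError as exc:
        return exc
    finally:
        result.close()
```

Under these assumptions, run(late_start) comes back as an AssertionError on the first empty chunk, while run(early_start) completes normally, which matches PJ Eby's reading of the spec.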
OTOH, maybe you can read up on the PEP and check out the Tulip implementation (
http://code.google.com/p/tulip/) and maybe you can come up with a suitable
design for integrating PEP 3156 into WSGI? Though it may have to be named
WSGI 2.0 to emphasize that it is backwards incompatible.

--Guido

On Sun, Mar 24, 2013 at 2:18 PM, Luca Sbardella wrote:

> Hello,
>
> first time here, I'm Luca and I write lots of python of the asynchronous
> variety.
> This question is about wsgi and the way pulsar
> http://quantmind.github.com/pulsar/ handles asynchronous wsgi responses.
>
> Yesterday I sent a message to the python-dev mailing list regarding
> wsgiref.validator, this is the original message
>
> I have an asynchronous wsgi application handler which yields empty bytes
> before it is ready to yield the response body and, importantly, to call
> start_response.
>
> Something like this:
>
> def wsgi_handler(environ, start_response):
>     body = generate_body(environ)
>     body = maybe_async(body)
>     while is_async(body):
>         yield b''
>     start_response(...)
>     ...
>
> I started using wsgiref.validator recently, nice little gem in the
> standard lib, and I discovered that the above handler does not validate!
> Disaster.
> Reading pep 3333
>
> "the application *must* invoke the start_response() callable before the
> iterable yields its first body bytestring, so that the server can send the
> headers before any body content. However, this invocation *may* be
> performed by the iterable's first iteration, so servers *must not* assume
> that start_response() has been called before they begin iterating over
> the iterable."
>
> The pseudocode above does yield bytes before start_response, but they are
> not *body* bytes, they are empty bytes so that the asynchronous wsgi server
> releases the eventloop and calls back at the next eventloop iteration.
>
>
> And the response was
>
>
> >PJ Eby wrote:
>> >> The validator is correct for the spec.
>> >> You *must* call
>> >> start_response() before yielding any strings at all.
>> >
>> >
>> > Thanks for response PJ,
>> > that is what I, unfortunately, didn't want to hear, the validator being
>> > correct for the "spec" means I can't use it for my asynchronous stuff,
>> > which is a shame :-(((
>> > But why commit to send headers when you may not know about your
>> > response?
>> > Sorry if this is the wrong mailing list for the issue, I'll adjust as I
>> > go along.
>>
>> Because async was added as an afterthought to WSGI about nine years
>> ago, and we didn't get it right, but it long ago was too late to do
>> anything about it.  A properly async WSGI implementation will probably
>> have to wait for Tulip (Guido's project to bring a standard async
>> programming API to Python).
>>
>
> and so here I am.
> I know tulip is in its early stages but is there anything in the pipeline
> about wsgi?
> Happy to help if needed.
>
> Regards
> Luca
>
>

--
--Guido van Rossum (python.org/~guido)

From guido at python.org  Mon Mar 25 19:48:09 2013
From: guido at python.org (Guido van Rossum)
Date: Mon, 25 Mar 2013 11:48:09 -0700
Subject: [Web-SIG] [python-tulip] Re: [Python-Dev] wsgi validator with
 asynchronous handlers/servers
In-Reply-To:
References:
Message-ID:

Awesome! Can't wait to see that.

On Mon, Mar 25, 2013 at 11:30 AM, Luca Sbardella wrote:

>
>> maybe you can read up on the PEP and check out the Tulip implementation (
>> http://code.google.com/p/tulip/) and maybe you can come up with a
>> suitable design for integrating PEP 3156 into WSGI? Though it may have to
>> be named WSGI 2.0 to emphasize that it is backwards incompatible.
>>
>>
> I have an idea already,
> I'll write an initial implementation based on tulip.http.Response &
> tulip.http.ServerHttpProtocol and I'll write a little example using it.
>
> Luca
>

--
--Guido van Rossum (python.org/~guido)

From manlio_perillo at libero.it  Mon Mar 25 22:50:17 2013
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 25 Mar 2013 22:50:17 +0100
Subject: [Web-SIG] [Python-Dev] wsgi validator with asynchronous
 handlers/servers
In-Reply-To:
References:
Message-ID: <5150C699.1050007@libero.it>

On 24/03/2013 06:14, PJ Eby wrote:
> [...]
>> Thanks for response PJ,
>> that is what I, unfortunately, didn't want to hear, the validator being
>> correct for the "spec" means I can't use it for my asynchronous stuff, which
>> is a shame :-(((
>> But why commit to send headers when you may not know about your response?
>> Sorry if this is the wrong mailing list for the issue, I'll adjust as I go
>> along.
>
> Because async was added as an afterthought to WSGI about nine years
> ago, and we didn't get it right, but it long ago was too late to do
> anything about it.  A properly async WSGI implementation will probably
> have to wait for Tulip (Guido's project to bring a standard async
> programming API to Python).

Do you really need a standard async programming API to design and
implement an async WSGI specification?  I think it is not needed.

Some time ago I posted a sample implementation and documentation for a
very simple async extension for WSGI:
https://bitbucket.org/mperillo/txwsgi

An interesting example of how an async API can be designed is
PostgreSQL's libpq, where the API exposes a direct interface to the
protocol state machine (PQconsumeInput), so you can not only use it
with any async framework you like, but you can also use it in blocking
mode.

This, as far as I know, is impossible with the network protocol
implementations in Twisted or other async frameworks.


Regards  Manlio
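
The design Manlio describes, a protocol state machine that performs no I/O of its own, can be illustrated with a toy example. This is a hypothetical sketch, not libpq's or txwsgi's API: the LineProtocol class and its method names are invented here and only loosely echo the consume-input style of libpq.

```python
class LineProtocol:
    """Toy newline-delimited protocol with no I/O of its own (hypothetical)."""

    def __init__(self):
        self._buffer = b""

    def consume_input(self, data):
        # The caller obtained `data` however it likes: a blocking socket
        # read, an event-loop callback, or a test fixture.
        self._buffer += data

    def is_busy(self):
        # True while no complete line is available yet.
        return b"\n" not in self._buffer

    def get_result(self):
        # Return the next complete line, or None if more input is needed.
        if self.is_busy():
            return None
        line, _, self._buffer = self._buffer.partition(b"\n")
        return line


# Driving the state machine in "blocking" style from pre-read chunks.
proto = LineProtocol()
for chunk in (b"hel", b"lo\nwor", b"ld\n"):
    proto.consume_input(chunk)

results = []
while not proto.is_busy():
    results.append(proto.get_result())
```

Because the object never touches a socket, the same class could be fed from an event loop's data-received callback or from a plain recv() loop, which is the property the message attributes to libpq.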