From manlio_perillo at libero.it Mon Oct 1 17:47:49 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 01 Oct 2007 17:47:49 +0200 Subject: [Web-SIG] hop-by-hop headers handling Message-ID: <470116A5.7010807@libero.it> Hi, I have another question with error handling. The WSGI spec only says that applications *must* not generate hop-by-hop headers, but says nothing on how a WSGI server should handle them. In the previous version of nginx mod_wsgi I just ignored these headers, but in the latest revisions, I raise an exception. Thanks Manlio Perillo From manlio_perillo at libero.it Tue Oct 2 21:30:46 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 02 Oct 2007 21:30:46 +0200 Subject: [Web-SIG] Multiple message-header fields handling Message-ID: <47029C66.5090408@libero.it> The HTTP 1.1 protocol (section 4.2) says that: """Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list [i.e., #(values)].""" This can happen, as an example, with the Cookie header. My question is: how should this be handled in WSGI? As an example Nginx stores all the headers in a associative array, where, of course, only the "last seen" headers appears. However common multiple message-headers are stored in the request struct. Since the WSGI environment is a dictionary with keys and values of type str, should an implementation: """combine the multiple header fields into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field-value to the first, each separated by a comma.""" ? Ngins does not do this (and I don't know what Apache does). Another question: when an header has an empty field value, what should be set in the environment: an empty string or None? Thanks Manlio Perillo From pje at telecommunity.com Tue Oct 2 21:44:05 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 02 Oct 2007 15:44:05 -0400 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <47029C66.5090408@libero.it> References: <47029C66.5090408@libero.it> Message-ID: <20071002194130.A118D3A407A@sparrow.telecommunity.com> At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote: >The HTTP 1.1 protocol (section 4.2) says that: >"""Multiple message-header fields with the same field-name MAY be >present in a message if and only if the entire field-value for that >header field is defined as a comma-separated list [i.e., #(values)].""" > >This can happen, as an example, with the Cookie header. > >My question is: how should this be handled in WSGI? > >As an example Nginx stores all the headers in a associative array, >where, of course, only the "last seen" headers appears. > >However common multiple message-headers are stored in the request struct. > >Since the WSGI environment is a dictionary with keys and values of type >str, should an implementation: >"""combine the multiple header fields into one "field-name: field-value" >pair, without changing the semantics of the message, by appending each >subsequent field-value to the first, each separated by a comma.""" >? If that's the only way to make the headers work, then the server may do so. >Another question: when an header has an empty field value, what should >be set in the environment: an empty string or None? If a value exists in the environ, it *must* be a string -- never None. And if the header exists, then a value should be in the environ. Therefore, it should be an empty string. From pje at telecommunity.com Tue Oct 2 21:45:29 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 02 Oct 2007 15:45:29 -0400 Subject: [Web-SIG] hop-by-hop headers handling In-Reply-To: <470116A5.7010807@libero.it> References: <470116A5.7010807@libero.it> Message-ID: <20071002194251.F019C3A407C@sparrow.telecommunity.com> At 05:47 PM 10/1/2007 +0200, Manlio Perillo wrote: >Hi, I have another question with error handling. > >The WSGI spec only says that applications *must* not generate hop-by-hop >headers, but says nothing on how a WSGI server should handle them. > >In the previous version of nginx mod_wsgi I just ignored these headers, >but in the latest revisions, I raise an exception. Raising an exception is indeed preferable. From alex at puddlejumper.foxybanana.com Tue Oct 2 21:50:21 2007 From: alex at puddlejumper.foxybanana.com (Alex Botero-Lowry) Date: Tue, 2 Oct 2007 12:50:21 -0700 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <47029C66.5090408@libero.it> References: <47029C66.5090408@libero.it> Message-ID: <20071002195021.GA6658@puddlejumper.foxybanana.com> On Tue, Oct 02, 2007 at 09:30:46PM +0200, Manlio Perillo wrote: > The HTTP 1.1 protocol (section 4.2) says that: > """Multiple message-header fields with the same field-name MAY be > present in a message if and only if the entire field-value for that > header field is defined as a comma-separated list [i.e., #(values)].""" > > This can happen, as an example, with the Cookie header. > > My question is: how should this be handled in WSGI? > > As an example Nginx stores all the headers in a associative array, > where, of course, only the "last seen" headers appears. > > However common multiple message-headers are stored in the request struct. 
>

Initially I used such a solution (cookies was a special property in the
response object), but I ended up just throwing together a custom dict
that looks like:

class ResponseHeaders(dict):
    def __setitem__(self, item, val):
        # A repeated key accumulates its values in a list.
        if item in self:
            iv = self[item]
            if isinstance(iv, list):
                iv.append(val)
            else:
                iv = [iv, val]
            dict.__setitem__(self, item, iv)
        else:
            dict.__setitem__(self, item, val)

    def replace(self, key, val):
        # Overwrite instead of accumulating.
        dict.__setitem__(self, key, val)

    def items(self):
        # Flatten to (name, value) pairs, one pair per stored value.
        ret = []
        for k, v in dict.items(self):
            if isinstance(v, list):
                ret.extend([(k, a) for a in v])
            else:
                ret.append((k, v))
        return ret

    def iteritems(self):
        return iter(self.items())

It's really intended for passing the headers on to start_response, and
for getting the headers into it, rather than for reading from it, which
is fine 99% of the time. I recently had to add replace since I had a
situation where I needed to overwrite a preset header, but other than
that it serves me well.

Alex

From fumanchu at aminus.org Tue Oct 2 21:47:57 2007
From: fumanchu at aminus.org (Robert Brewer)
Date: Tue, 2 Oct 2007 12:47:57 -0700
Subject: [Web-SIG] Multiple message-header fields handling
References: <47029C66.5090408@libero.it>
Message-ID:

Manlio Perillo wrote:
> The HTTP 1.1 protocol (section 4.2) says that:
> """Multiple message-header fields with the same field-name MAY be
> present in a message if and only if the entire field-value for that
> header field is defined as a comma-separated list [i.e., #(values)]."""
>
> This can happen, as an example, with the Cookie header.
>
> My question is: how should this be handled in WSGI?
>
> As an example Nginx stores all the headers in a associative array,
> where, of course, only the "last seen" headers appears.
>
> However common multiple message-headers are stored in the request struct.
>
> Since the WSGI environment is a dictionary with keys and values of type
> str, should an implementation:
> """combine the multiple header fields into one "field-name: field-value"
> pair, without changing the semantics of the message, by appending each
> subsequent field-value to the first, each separated by a comma."""
> ?

Yes, it should. As you note, it's part of the HTTP spec that such headers
can be combined without changing the semantics. Here's a list of the
headers that need to be folded:

comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
    'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
    'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
    'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
    'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
    'WWW-AUTHENTICATE']

The only tricky one is Cookie, because e.g. Konqueror sends them on
multiple lines, but they're not foldable.

See http://kristol.org/cookie/errata.html

> Ngins does not do this (and I don't know what Apache does).
>
>
> Another question: when an header has an empty field value, what should
> be set in the environment: an empty string or None?

An empty string, or omit them entirely:

"""The following variables must be present, unless their value would be
an empty string, in which case they may be omitted, except as otherwise
noted below... HTTP_ Variables """.

Robert Brewer
fumanchu at aminus.org

From manlio_perillo at libero.it Tue Oct 2 22:03:50 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 22:03:50 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <47029C66.5090408@libero.it>
References: <47029C66.5090408@libero.it>
Message-ID: <4702A426.30109@libero.it>

Manlio Perillo wrote:
> [...]
> As an example Nginx stores all the headers in a associative array, > where, of course, only the "last seen" headers appears. > A correction: Nginx stores "raw" headers in a list of key/value pairs, and not in an associative array. This means that when I iterate over the headers, I see all the multiple message-headers, but I only store the last header in the WSGI environment. > [...] Regards Manlio Perillo From manlio_perillo at libero.it Tue Oct 2 22:11:40 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 02 Oct 2007 22:11:40 +0200 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <20071002194130.A118D3A407A@sparrow.telecommunity.com> References: <47029C66.5090408@libero.it> <20071002194130.A118D3A407A@sparrow.telecommunity.com> Message-ID: <4702A5FC.8030000@libero.it> Phillip J. Eby ha scritto: > At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote: >> The HTTP 1.1 protocol (section 4.2) says that: >> """Multiple message-header fields with the same field-name MAY be >> present in a message if and only if the entire field-value for that >> header field is defined as a comma-separated list [i.e., #(values)].""" >> >> This can happen, as an example, with the Cookie header. >> >> My question is: how should this be handled in WSGI? >> >> As an example Nginx stores all the headers in a associative array, >> where, of course, only the "last seen" headers appears. >> >> However common multiple message-headers are stored in the request struct. >> >> Since the WSGI environment is a dictionary with keys and values of type >> str, should an implementation: >> """combine the multiple header fields into one "field-name: field-value" >> pair, without changing the semantics of the message, by appending each >> subsequent field-value to the first, each separated by a comma.""" >> ? > > If that's the only way to make the headers work, then the server may do so. 
> Nginx does not combine headers, so I have to do it by myself (and this will complicate the implementation)... However IMHO here you should not use the word "may", but "must", and this should be explicitly stated in the WSGI spec. > [...] Thanks and regards Manlio Perillo From manlio_perillo at libero.it Tue Oct 2 22:27:12 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 02 Oct 2007 22:27:12 +0200 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: References: <47029C66.5090408@libero.it> Message-ID: <4702A9A0.2090005@libero.it> Robert Brewer ha scritto: > > [...] > As you note, it's part of the HTTP spec that such headers > can be combined without changing the semantics. Here's a list of the > headers that need to be folded: > > comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING', > 'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL', > 'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT', > 'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE', > 'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING', > 'WWW-AUTHENTICATE'] > Note that some of these headers are response headers, and it is responsibility of the WSGI application to properly folding them, and not of the WSGI gateway. > The only tricky one is Cookie, because e.g. Konqueror sends them on > multiple lines, but they're not foldable. > > See http://kristol.org/cookie/errata.html > This is a mess... Note: in some tests, I have seen Firefox sending a Cookie on multiple lines. > [...] Thanks and regards Manlio Perillo From pje at telecommunity.com Tue Oct 2 22:36:47 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue, 02 Oct 2007 16:36:47 -0400 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <4702A426.30109@libero.it> References: <47029C66.5090408@libero.it> <4702A426.30109@libero.it> Message-ID: <20071002203514.115E03A407C@sparrow.telecommunity.com> At 10:03 PM 10/2/2007 +0200, Manlio Perillo wrote: >Manlio Perillo ha scritto: > > [...] > > As an example Nginx stores all the headers in a associative array, > > where, of course, only the "last seen" headers appears. > > > >A correction: Nginx stores "raw" headers in a list of key/value pairs, >and not in an associative array. > >This means that when I iterate over the headers, I see all the multiple >message-headers, but I only store the last header in the WSGI environment. That's definitely an error. From manlio_perillo at libero.it Tue Oct 2 23:01:33 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 02 Oct 2007 23:01:33 +0200 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <20071002203514.115E03A407C@sparrow.telecommunity.com> References: <47029C66.5090408@libero.it> <4702A426.30109@libero.it> <20071002203514.115E03A407C@sparrow.telecommunity.com> Message-ID: <4702B1AD.3090807@libero.it> Phillip J. Eby ha scritto: > At 10:03 PM 10/2/2007 +0200, Manlio Perillo wrote: >> Manlio Perillo ha scritto: >> > [...] >> > As an example Nginx stores all the headers in a associative array, >> > where, of course, only the "last seen" headers appears. >> > >> >> A correction: Nginx stores "raw" headers in a list of key/value pairs, >> and not in an associative array. >> >> This means that when I iterate over the headers, I see all the multiple >> message-headers, but I only store the last header in the WSGI >> environment. > > That's definitely an error. Right, its an error. A simple solution is to first check if an header is already in the environ. If this is the case, then I can combine the new value with the old one. 
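A minimal sketch of that check-and-combine step, written in present-day Python rather than the C of nginx mod_wsgi. The names `environ_key`, `add_request_header` and `SEPARATORS` are hypothetical and belong to no WSGI library, and joining Cookie values with "; " while joining everything else with ", " is an assumption made for illustration, not something the WSGI spec prescribes.

```python
# Hypothetical helper names; not part of nginx mod_wsgi or any WSGI library.
# Assumption: Cookie values are joined with "; ", everything else with ", ".
SEPARATORS = {"HTTP_COOKIE": "; "}


def environ_key(name):
    """Map an HTTP field name to its CGI-style environ key."""
    key = name.upper().replace("-", "_")
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key  # these two carry no HTTP_ prefix
    return "HTTP_" + key


def add_request_header(environ, name, value):
    """Store a header, combining it with any earlier occurrence."""
    key = environ_key(name)
    if key in environ:
        sep = SEPARATORS.get(key, ", ")
        environ[key] = environ[key] + sep + value
    else:
        # An empty field value is stored as an empty string, never None.
        environ[key] = value


# Raw headers as a list of pairs, the way a server might iterate over them.
raw_headers = [
    ("Accept", "text/html"),
    ("Cookie", "a=1"),
    ("Accept", "text/plain"),
    ("Cookie", "b=2"),
]

environ = {}
for name, value in raw_headers:
    add_request_header(environ, name, value)
```

With this input, no occurrence is dropped: both Accept values and both Cookie values end up in the environ under a single key each.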
The problem, is that I have first to check if the header can be combined (and the Cookie must be combined using ';' instead of ','). Luckily some of these headers can be handled internally by Nginx. How many browsers split an header on multiple lines? Regards Manlio Perillo From pje at telecommunity.com Tue Oct 2 23:08:53 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue, 02 Oct 2007 17:08:53 -0400 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <4702A9A0.2090005@libero.it> References: <47029C66.5090408@libero.it> <4702A9A0.2090005@libero.it> Message-ID: <20071002210615.E678F3A407A@sparrow.telecommunity.com> At 10:27 PM 10/2/2007 +0200, Manlio Perillo wrote: >Robert Brewer ha scritto: > > > > [...] > > As you note, it's part of the HTTP spec that such headers > > can be combined without changing the semantics. Here's a list of the > > headers that need to be folded: > > > > comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING', > > 'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL', > > 'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT', > > 'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE', > > 'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING', > > 'WWW-AUTHENTICATE'] > > > >Note that some of these headers are response headers, and it is >responsibility of the WSGI application to properly folding them, and not >of the WSGI gateway. On the contrary. The gateway is responsible for sending *all* the header lines to the client. If you're only taking the last one, your gateway is non-compliant. If nginx can't handle multiple headers, the only way you can be WSGI compliant is to do the folding in the gateway, because the application is explicitly allowed to provide multiple header values for a given header name. 
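For the response side, the folding described above could look roughly like the following sketch. It assumes a server that can hold only one value per header name; `fold_response_headers` and `NEVER_FOLD` are hypothetical names, and leaving Set-Cookie unfolded is an assumption based on the fact that cookie values may themselves contain commas.

```python
# Hypothetical sketch; not taken from any real gateway.
# Assumption: comma-joining Set-Cookie would change its meaning,
# so those entries are passed through as separate lines.
NEVER_FOLD = {"set-cookie"}


def fold_response_headers(headers):
    """Merge repeated names in a start_response header list.

    `headers` is the list of (name, value) tuples the application passed
    to start_response. The result keeps first-seen order.
    """
    result = []  # [name, value] pairs in first-seen order
    index = {}   # lower-cased name -> position of its first entry
    for name, value in headers:
        key = name.lower()
        if key in index and key not in NEVER_FOLD:
            result[index[key]][1] += ", " + value
        else:
            index.setdefault(key, len(result))
            result.append([name, value])
    return [tuple(item) for item in result]


folded = fold_response_headers([
    ("Vary", "Accept"),
    ("Set-Cookie", "a=1"),
    ("Vary", "Cookie"),
    ("Set-Cookie", "b=2"),
])
```

A gateway whose server can emit repeated header lines would not need this step at all; it could simply send every tuple from the list as its own line.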
From manlio_perillo at libero.it Tue Oct 2 23:35:51 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 02 Oct 2007 23:35:51 +0200 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <20071002210615.E678F3A407A@sparrow.telecommunity.com> References: <47029C66.5090408@libero.it> <4702A9A0.2090005@libero.it> <20071002210615.E678F3A407A@sparrow.telecommunity.com> Message-ID: <4702B9B7.7020101@libero.it> Phillip J. Eby ha scritto: > [...] >> Note that some of these headers are response headers, and it is >> responsibility of the WSGI application to properly folding them, and not >> of the WSGI gateway. > > On the contrary. The gateway is responsible for sending *all* the > header lines to the client. If you're only taking the last one, your > gateway is non-compliant. > You are right, sorry. I forgot that start_application returns a list, and not a dict. The current implementation of mod_wsgi is compliant here, and the headers are combined. > [...] Regards Manlio Perillo From manlio_perillo at libero.it Wed Oct 3 13:35:16 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Wed, 03 Oct 2007 13:35:16 +0200 Subject: [Web-SIG] Multiple message-header fields handling In-Reply-To: <4702B9B7.7020101@libero.it> References: <47029C66.5090408@libero.it> <4702A9A0.2090005@libero.it> <20071002210615.E678F3A407A@sparrow.telecommunity.com> <4702B9B7.7020101@libero.it> Message-ID: <47037E74.8050400@libero.it> Manlio Perillo ha scritto: > Phillip J. Eby ha scritto: >> [...] >>> Note that some of these headers are response headers, and it is >>> responsibility of the WSGI application to properly folding them, and not >>> of the WSGI gateway. >> On the contrary. The gateway is responsible for sending *all* the >> header lines to the client. If you're only taking the last one, your >> gateway is non-compliant. >> > > You are right, sorry. > I forgot that start_application returns a list, and not a dict. 
>
> The current implementation of mod_wsgi is compliant here, and the
> headers are combined.
>

A correction: Nginx does not "fold" the multiline headers; they were
folded by Firefox.

Regards  Manlio Perillo

From manlio_perillo at libero.it Wed Oct 3 16:57:37 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 16:57:37 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
Message-ID: <4703ADE1.5040507@libero.it>

Hi.

Nginx, in one of the headers filters, can do ETag and Last-Modified
validation.

I want to be able to use this feature, so I don't have to use
third-party solutions.

However, with the current WSGI implementation this is not possible.

A possible solution is to add an extension `x-wsgiorg.flush`, a
callable object that notifies the WSGI gateway that it can flush the
headers (if they have not yet been sent) or the output buffer (Nginx has
this feature, however I have not yet understood how it works).

    start_response('200 Ok', [('Last-Modified', 'xxx')])

    ...
    environ['x-wsgiorg.flush']()

    return a-generator


The WSGI gateway can now send the headers before iterating over the
generator, and if the client content is up-to-date, the new content is
never generated.


The intent of this extension is to be transparent to the WSGI application.
In the case of nginx mod_wsgi, the validation can be done by Nginx, but for
generic WSGI applications this can be done by a middleware.


I don't know if this feature is feasible, since I have not yet
implemented it, so I would like to receive some feedback.


Thanks  Manlio Perillo

From pje at telecommunity.com Wed Oct 3 18:52:57 2007
From: pje at telecommunity.com (Phillip J.
Eby) Date: Wed, 03 Oct 2007 12:52:57 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4703ADE1.5040507@libero.it> References: <4703ADE1.5040507@libero.it> Message-ID: <20071003165020.23FAA3A407A@sparrow.telecommunity.com> Thinking about this made me realize that WSGI 2.0 isn't going to be able to validate *anything* about a response by raising an error in the application, because everything is done after the code returns. That suggests to me that these sorts of errors should be handled by changing the response sent to the browser, instead. That is, sending an internal error message to the browser and logging details of the problem. At 04:57 PM 10/3/2007 +0200, Manlio Perillo wrote: >Hi. > >Nginx, in one of the headers filters, can do ETag and Last-Modified >validation. > >I want to be able to use this feature, so I don't have to use thirdy >party solutions. > >However with the current WSGI implementation this is not possible. > >A possibile solution can be to add an extension `x-wsgiorg.flush`, a >callable object that notify the WSGI gateway that it can flush the >headers (if they are not yet be sent) or the output buffer (Nginx has >this feature, however I have yet not understand how it works). > > start_response('200 Ok', [('Last-Modified', 'xxx')]) > > ... > environ['x-wsgiorg.flush']() > > return a-generator > > >The WSGI gateway can now send the headers before iterating over the >generator, and if the client content is up-to-date, the new content is >never generated. > > > >The intent of this extension is to be transparent to the WSGI application. >In case of nginx mod_wsgi, the validation can be done by Nginx, but for >generic WSGI applications this can be done by a middleware. > > >I don't know if this feature is feasible, since I have not yet >implemented it, so I would like to receive some feedbacks. 
> > >Thanks Manlio Perillo >_______________________________________________ >Web-SIG mailing list >Web-SIG at python.org >Web SIG: http://www.python.org/sigs/web-sig >Unsubscribe: >http://mail.python.org/mailman/options/web-sig/pje%40telecommunity.com From manlio_perillo at libero.it Wed Oct 3 19:03:46 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Wed, 03 Oct 2007 19:03:46 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003165020.23FAA3A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> Message-ID: <4703CB72.6080308@libero.it> Phillip J. Eby ha scritto: > Thinking about this made me realize that WSGI 2.0 isn't going to be able > to validate *anything* about a response by raising an error in the > application, because everything is done after the code returns. > In this case, if the cache validation fails, we just have to generate the body content. For which cases do you want to raise an exception? > That suggests to me that these sorts of errors should be handled by > changing the response sent to the browser, instead. Right. In this case Nginx, when the cache is fresh, should change the response code from 200 (OK) to 304 (Not Modified). If I'm right, the current WSGI spec does not forbids or allows this. > That is, sending an > internal error message to the browser and logging details of the problem. > Regards Manlio Perillo From pje at telecommunity.com Wed Oct 3 20:00:48 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 03 Oct 2007 14:00:48 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4703CB72.6080308@libero.it> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> Message-ID: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> At 07:03 PM 10/3/2007 +0200, Manlio Perillo wrote: >Phillip J. 
Eby ha scritto: > > Thinking about this made me realize that WSGI 2.0 isn't going to be able > > to validate *anything* about a response by raising an error in the > > application, because everything is done after the code returns. > > > >In this case, if the cache validation fails, we just have to generate >the body content. > >For which cases do you want to raise an exception? Sorry, I thought you were talking about validating headers for *errors* (e.g. WSGI compliance problems), not providing special support for If-* headers. I don't think there's any point to having a WSGI extension for If-* header support. All the necessary data is in the environment, so it can trivially be implemented as a library or middleware, especially if the application postpones body content generation to an iterator. Since WSGI is intended to reduce web framework proliferation, one should never implement with middleware or a WSGI extension anything that can just be released as a library for others to use. > > That suggests to me that these sorts of errors should be handled by > > changing the response sent to the browser, instead. > >Right. >In this case Nginx, when the cache is fresh, should change the response >code from 200 (OK) to 304 (Not Modified). > >If I'm right, the current WSGI spec does not forbids or allows this. Actually, I was talking about handling the case of an invalid (ie. non-WSGI/HTTP compliant) header, not cache handling. Sorry for the confusion. From manlio_perillo at libero.it Wed Oct 3 20:24:05 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Wed, 03 Oct 2007 20:24:05 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> Message-ID: <4703DE45.6010606@libero.it> Phillip J. 
Eby ha scritto: > At 07:03 PM 10/3/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > Thinking about this made me realize that WSGI 2.0 isn't going to be >> able >> > to validate *anything* about a response by raising an error in the >> > application, because everything is done after the code returns. >> > >> >> In this case, if the cache validation fails, we just have to generate >> the body content. >> >> For which cases do you want to raise an exception? > > Sorry, I thought you were talking about validating headers for *errors* > (e.g. WSGI compliance problems), not providing special support for If-* > headers. > Ok, my message was not very clear. > I don't think there's any point to having a WSGI extension for If-* > header support. All the necessary data is in the environment, so it can > trivially be implemented as a library or middleware, especially if the > application postpones body content generation to an iterator. > > Since WSGI is intended to reduce web framework proliferation, one should > never implement with middleware or a WSGI extension anything that can > just be released as a library for others to use. > In general this is true, however to add support for If- headers, I do not have to write any code, all I need is to be able to send the headers before the body content is generated. A wsgiorg.flush extension can be useful for some other things. As an example, when in Nginx we send some data, an output buffer like gzip can buffer data for efficienty, and with wsgiorg.flush a WSGI application can force the buffer to be flushed (ok, the WSGI already states that the WSGI gateway should not buffer the data). Note that in Nginx, unlike Apache, an output buffer can process a partial buffer, so, for a WSGI application like: start_response('200 OK', [...]) yield 'xxx' yield 'yyy' yield 'zzz' the 'xxx' string is sent to the next output buffer, and, finally it is sent to the client. 
It can happen that the socket is not ready to send further data, so
the application must be paused until the socket is ready.

When the socket is ready, the next buffer can be sent to the next output
buffer, and so on.

NOTE: this is not yet implemented in nginx mod_wsgi.

> [...]

Regards  Manlio Perillo

From manlio_perillo at libero.it Wed Oct 3 20:33:48 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 20:33:48 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703DE45.6010606@libero.it>
References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <4703DE45.6010606@libero.it>
Message-ID: <4703E08C.2070704@libero.it>

Manlio Perillo wrote:
> [...]
> Note that in Nginx, unlike Apache, an output buffer can process a
> partial buffer,

Sorry, this is not correct.
The only difference from Apache, here, is that the data is written
asynchronously.

Manlio Perillo

From pje at telecommunity.com Wed Oct 3 21:23:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 15:23:32 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703DE45.6010606@libero.it>
References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <4703DE45.6010606@libero.it>
Message-ID: <20071003192055.49B203A407A@sparrow.telecommunity.com>

At 08:24 PM 10/3/2007 +0200, Manlio Perillo wrote:
>WSGI already
>states that the WSGI gateway should not buffer the data).

It does not state that at all. It states that a gateway *must not delay
the transmission of any block*. That requirement is not a "should" but
a "must", and it does not directly state anything about buffering, one
way or the other.
It *does*, however, imply that buffering is only acceptable if the buffer is being asynchronously emptied, via another thread or the OS emptying its own OS-level buffers. (e.g. if you're using synchronous sockets) >Note that in Nginx, unlike Apache, an output buffer can process a >partial buffer, so, for a WSGI application like: > > start_response('200 OK', [...]) > > yield 'xxx' > yield 'yyy' > yield 'zzz' > > >the 'xxx' string is sent to the next output buffer, and, finally it is >sent to the client. > >Now can happens that the socket is not ready to send further data, so >the application must be paused until the socket is ready. > >When the socket is ready, the next buffer can be sent to the next outpup >buffer, and so on. In the above code, when "yield 'yyy'" is invoked, one of two conditions must apply. Either: 1. the 'xxx' has been sent to the OS, OR 2. it is still being sent in the background by another thread If it is possible to execute the "yield 'yyy'" line without one of these conditions applying, the gateway is *not* WSGI compliant. From pje at telecommunity.com Wed Oct 3 21:30:55 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 03 Oct 2007 15:30:55 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4703ADE1.5040507@libero.it> References: <4703ADE1.5040507@libero.it> Message-ID: <20071003192817.3014C3A407A@sparrow.telecommunity.com> At 04:57 PM 10/3/2007 +0200, Manlio Perillo wrote: >A possibile solution can be to add an extension `x-wsgiorg.flush`, a >callable object that notify the WSGI gateway that it can flush the >headers (if they are not yet be sent) or the output buffer (Nginx has >this feature, however I have yet not understand how it works). > > start_response('200 Ok', [('Last-Modified', 'xxx')]) > > ... 
> environ['x-wsgiorg.flush']() > > return a-generator > > >The WSGI gateway can now send the headers before iterating over the >generator, and if the client content is up-to-date, the new content is >never generated. Now that I understand what this is for, I can explain why a WSGI extension is not necessary to provide this feature. In a compliant WSGI gateway, yielding an empty string from 'a-generator' is sufficient to "flush" the WSGI pipeline. I suggest that you read this section of the spec more carefully: http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming From manlio_perillo at libero.it Wed Oct 3 21:52:01 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Wed, 03 Oct 2007 21:52:01 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003192817.3014C3A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> Message-ID: <4703F2E1.9050402@libero.it> Phillip J. Eby ha scritto: > [...] > > Now that I understand what this is for, I can explain why a WSGI > extension is not necessary to provide this feature. In a compliant WSGI > gateway, yielding an empty string from 'a-generator' is sufficient to > "flush" the WSGI pipeline. > But the WSGI pipeline should already be flushed for every string yielded, right? An interesting "extension" for an asynchronous WSGI gateway is to "suspend" the iteration when an empty string is returned, creating a timer that fires after 0 milliseconds (in Twisted, this is the same as callLater(0, ...)) > I suggest that you read this section of the spec more carefully: > > http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming > There is a problem here: a WSGI gateway is not allowed to send headers until the app_iter yields a non empty string or the iterator is exausted. 
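The header-deferral rule mentioned above can be illustrated with a toy gateway loop. This is only a sketch under stated assumptions: `run_app` and `app` are hypothetical names, nothing is written to a real socket (events are just collected in a list), and body blocks are bytes as in present-day Python rather than the str objects used when this thread was written.

```python
def run_app(app, environ):
    """Toy gateway loop; returns events in the order they would be sent."""
    wire = []
    pending = []  # response set by start_response but not yet transmitted

    def send_headers():
        if pending:
            wire.append(("headers", pending.pop()))

    def write(data):
        # Legacy write() callable: the first call also releases the headers.
        send_headers()
        wire.append(("body", data))

    def start_response(status, headers, exc_info=None):
        pending[:] = [(status, headers)]  # stored, not transmitted
        return write

    result = app(environ, start_response)
    try:
        for block in result:
            if block:  # an empty block does not release the headers
                send_headers()
                wire.append(("body", block))
        send_headers()  # iterable exhausted without producing any body
    finally:
        if hasattr(result, "close"):
            result.close()
    return wire


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b""       # hands control back to the gateway, sends nothing here
    yield b"hello"  # first non-empty block: headers go out, then the body


wire = run_app(app, {})
```

In this sketch the empty block gives the gateway a chance to do other work, but the headers are only recorded as sent when `b"hello"` arrives.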
Regards Manlio Perillo From manlio_perillo at libero.it Wed Oct 3 21:58:24 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Wed, 03 Oct 2007 21:58:24 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003192055.49B203A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <4703DE45.6010606@libero.it> <20071003192055.49B203A407A@sparrow.telecommunity.com> Message-ID: <4703F460.4080401@libero.it> Phillip J. Eby ha scritto: > At 08:24 PM 10/3/2007 +0200, Manlio Perillo wrote: >> WSGI already >> states that the WSGI gateway should not buffer the data). > > It does not state that at all. It states that a gateway *must not delay > the transmission of any block*. That requirement is not a "should" but > a "must", and it does not directly state anything about buffering, one > way or the other. > > It *does*, however, imply that buffering is only acceptable if the > buffer is being asynchronously emptied, via another thread or the OS > emptying its own OS-level buffers. (e.g. if you're using synchronous > sockets) > Ok. > >> Note that in Nginx, unlike Apache, an output buffer can process a >> partial buffer, so, for a WSGI application like: >> >> start_response('200 OK', [...]) >> >> yield 'xxx' >> yield 'yyy' >> yield 'zzz' >> >> >> the 'xxx' string is sent to the next output buffer, and, finally it is >> sent to the client. >> >> Now can happens that the socket is not ready to send further data, so >> the application must be paused until the socket is ready. >> >> When the socket is ready, the next buffer can be sent to the next outpup >> buffer, and so on. > > In the above code, when "yield 'yyy'" is invoked, one of two conditions > must apply. Either: > > 1. the 'xxx' has been sent to the OS, OR > 2. 
it is still being sent in the background by another thread > > If it is possible to execute the "yield 'yyy'" line without one of these > conditions applying, the gateway is *not* WSGI compliant. > I'm not sure, but I think that the 'xxx' can be still in one of the output filter buffers (like gzip), unless we explicitly require it to be flushed. Nginx does not use threads. By the way: I think that the environ dictionary should contain a new wsgi.asynchronous value, that should evaluate true if the WSGI gateway is asynchronous. This may be necessary, because a WSGI application should know that it can be suspended, even if it not requested it. > Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 01:10:49 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 03 Oct 2007 19:10:49 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4703F2E1.9050402@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> Message-ID: <20071003230812.7A7F63A407A@sparrow.telecommunity.com> At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > [...] > > > > Now that I understand what this is for, I can explain why a WSGI > > extension is not necessary to provide this feature. In a compliant WSGI > > gateway, yielding an empty string from 'a-generator' is sufficient to > > "flush" the WSGI pipeline. > > > >But the WSGI pipeline should already be flushed for every string >yielded, right? 
> >An interesting "extension" for an asynchronous WSGI gateway is to >"suspend" the iteration when an empty string is returned, creating a >timer that fires after 0 milliseconds (in Twisted, this is the same as >callLater(0, ...)) > > > I suggest that you read this section of the spec more carefully: > > > > http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming > > > >There is a problem here: a WSGI gateway is not allowed to send headers >until the app_iter yields a non empty string or the iterator is exausted. Argh. You're right. I forgot about that bit. It has been a few too many years since I worked on the spec. :) Still, this is yet another example of why WSGI 2.0 is a big improvement in simplicity. So I still would rather see more effort put into getting WSGI 2.0 written and into widespread use, than creating niche extensions to 1.0. From ianb at colorstudy.com Thu Oct 4 01:13:49 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 03 Oct 2007 19:13:49 -0400 Subject: [Web-SIG] WSGI 2.0 Message-ID: <4704222D.30208@colorstudy.com> PJE wants to talk about WSGI 2. That's cool; I remind everyone that there's a page to bring up issues you might want to discuss for 2.0: http://wsgi.org/wsgi/WSGI_2.0 Feel free to add to, or discuss, issues on that page... Ian From graham.dumpleton at gmail.com Thu Oct 4 04:30:28 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Thu, 4 Oct 2007 12:30:28 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003230812.7A7F63A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> Message-ID: <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> On 04/10/2007, Phillip J. Eby wrote: > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote: > >Phillip J. Eby ha scritto: > > > [...] 
> > > > > > Now that I understand what this is for, I can explain why a WSGI > > > extension is not necessary to provide this feature. In a compliant WSGI > > > gateway, yielding an empty string from 'a-generator' is sufficient to > > > "flush" the WSGI pipeline. > > > > > > >But the WSGI pipeline should already be flushed for every string > >yielded, right? > > > >An interesting "extension" for an asynchronous WSGI gateway is to > >"suspend" the iteration when an empty string is returned, creating a > >timer that fires after 0 milliseconds (in Twisted, this is the same as > >callLater(0, ...)) > > > > > I suggest that you read this section of the spec more carefully: > > > > > > http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming > > > > > > >There is a problem here: a WSGI gateway is not allowed to send headers > >until the app_iter yields a non empty string or the iterator is exausted. > > Argh. You're right. I forgot about that bit. It has been a few too > many years since I worked on the spec. :) The actual wording of the PEP does though suggest that if one calls write() returned from start_response() that one would flush headers. Ie., the requirement for a non-empty string is really only mentioned in reference to value returned from iterable and not in relation to empty data string passed to write(). I am not sure I understand the importance of being strict and not flushing headers until the first non-empty content data block. Was there a specific reasoning or use case behind saying that? 
Graham

From manlio_perillo at libero.it Thu Oct 4 10:57:08 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 10:57:08 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003230812.7A7F63A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
 <20071003192817.3014C3A407A@sparrow.telecommunity.com>
 <4703F2E1.9050402@libero.it>
 <20071003230812.7A7F63A407A@sparrow.telecommunity.com>
Message-ID: <4704AAE4.1010708@libero.it>

Phillip J. Eby wrote:
> [...]
>> There is a problem here: a WSGI gateway is not allowed to send headers
>> until the app_iter yields a non-empty string or the iterator is exhausted.
>
> Argh. You're right. I forgot about that bit. It has been a few too
> many years since I worked on the spec. :)
>

07-Dec-2003!
And yet it seems that WSGI is not pervasively used.

> Still, this is yet another example of why WSGI 2.0 is a big improvement
> in simplicity. So I still would rather see more effort put into getting
> WSGI 2.0 written and into widespread use, than creating niche extensions
> to 1.0.

My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm
removing the limitation that the app_iter must yield only one item.

However, there is a problem with WSGI 2.0.

Suppose that I execute an asynchronous HTTP request to obtain some data
from a remote server.

I can use the yet-to-be-implemented wsgi.pause_output extension for
this, or an extension for interfacing with the nginx subrequest API.

What happens if the HTTP request returns a 404 and I want to return this
status code to the original client?

This can be done in WSGI 1.0 (since I can call start_response in the
app_iter generator) but cannot be done in WSGI 2.0.

A possible solution for WSGI 2.0 is to add a wsgi.response_error exception:

    raise environ['wsgi.response_error'](status='404 Not Found')

However, there is still the problem with the headers.
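
The WSGI 1.0 behaviour described above, calling start_response from inside the app_iter generator once the upstream status is known, might look roughly like this. It is a sketch under stated assumptions: fetch_upstream is a hypothetical, synchronous stand-in for the remote request (the thread has an asynchronous nginx subrequest in mind), call is a throwaway test harness rather than a compliant gateway, and bytes are used as in Python 3.

```python
def fetch_upstream(path):
    """Hypothetical stand-in for the remote HTTP request: (status, body)."""
    if path == "/missing":
        return "404 Not Found", b"no such resource\n"
    return "200 OK", b"remote data\n"


def proxy_app(environ, start_response):
    # Because this function is a generator, nothing below runs until the
    # gateway starts iterating, so start_response() is called late, after
    # the upstream status is known.
    status, body = fetch_upstream(environ.get("PATH_INFO", "/"))
    start_response(status, [("Content-Type", "text/plain"),
                            ("Content-Length", str(len(body)))])
    yield body


def call(app, environ):
    """Tiny harness (not a compliant gateway): returns (status, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    body = b"".join(app(environ, start_response))
    return seen["status"], body
```

With a WSGI 2.0 style return value, the status would instead have to be known at the moment the application returns its tuple.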
Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 13:47:15 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Oct 2007 07:47:15 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.co m> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> Message-ID: <20071004114441.C7B103A407A@sparrow.telecommunity.com> At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote: >On 04/10/2007, Phillip J. Eby wrote: > > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote: > > >There is a problem here: a WSGI gateway is not allowed to send headers > > >until the app_iter yields a non empty string or the iterator is exausted. > > > > Argh. You're right. I forgot about that bit. It has been a few too > > many years since I worked on the spec. :) > >The actual wording of the PEP does though suggest that if one calls >write() returned from start_response() that one would flush headers. >Ie., the requirement for a non-empty string is really only mentioned >in reference to value returned from iterable and not in relation to >empty data string passed to write(). > >I am not sure I understand the importance of being strict and not >flushing headers until the first non-empty content data block. Was >there a specific reasoning or use case behind saying that? The idea was to allow an application to change its mind about the headers until it had committed to writing data. That is, to allow the application to do error handling for as long as possible before the server has to do it. For WSGI 2.0, I'm no longer concerned about it - in the common case, the body will be a list or tuple containing a single string, so it can't possibly raise an error. 
For anything more complex, well, you were going to have to handle error conditions once you yielded some body output anyway. Now that you're mentioning it, the "non-empty yield" requirement seems pretty bogus, since it's not really possible for the app to tell whether headers have been sent anyway; start_response() handles that transparently. Only problem is that the PEP examples and wsgiref aren't written to support doing it that way, so I don't think we can reasonably change it in WSGI 1.0, and in 2.0 it won't even matter. From pje at telecommunity.com Thu Oct 4 13:54:43 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Oct 2007 07:54:43 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4704AAE4.1010708@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> Message-ID: <20071004115207.65C463A407A@sparrow.telecommunity.com> At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > [...] > >> There is a problem here: a WSGI gateway is not allowed to send headers > >> until the app_iter yields a non empty string or the iterator is exausted. > > > > Argh. You're right. I forgot about that bit. It has been a few too > > many years since I worked on the spec. :) > > > >07-Dec-2003! >And yet it seems that WSGI is not pervasively used. What do you mean? Can you name a popular Python web framework or library that doesn't either use or support WSGI? > > Still, this is yet another example of why WSGI 2.0 is a big improvement > > in simplicity. So I still would rather see more effort put into getting > > WSGI 2.0 written and into widespread use, than creating niche extensions > > to 1.0. > > >My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm >removing the limitation that the app_iter must yield only one item. Eh? 
I don't understand what you mean by "app_iter must yield only one item". In WSGI 2.0 the application return signature is a three-item tuple, the third item of which is a WSGI 1.0 response object. >However there is a problem with WSGI 2.0. > >Suppose that I execute an asynchronous HTTP request to obtain some data >from a remote server. > >I can use the yet to be implemented wsgi.pause_output extension for >this, or an extension for interfacing with nginx subrequest API. That won't be possible in WSGI 2.0 - it's a purely synchronous API. You can pause body output by yielding empty strings, but you need to have already decided on your headers. >What happens if the HTTP request returns a 404 and I want to return this >status code to the original client? > >This can be done in WSGI 1.0 (since I can call start_response in the >app_iter generator) but cannot be done in WSGI 2.0. In WSGI 1.0, that can only happen up until the point where you've yielded body output. As soon as there is any body output, the headers are committed. In 2.0, you will have to commit your headers at return time. Note, by the way, that WSGI 2.0 isn't going to be an immediate or complete replacement for 1.0 -- especially since the spec isn't written yet! 1.0 apps and servers will likely be with us for a few years yet. From graham.dumpleton at gmail.com Thu Oct 4 14:20:55 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Thu, 4 Oct 2007 22:20:55 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004114441.C7B103A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> Message-ID: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> On 04/10/2007, Phillip J. 
Eby wrote: > At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote: > >On 04/10/2007, Phillip J. Eby wrote: > > > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote: > > > >There is a problem here: a WSGI gateway is not allowed to send headers > > > >until the app_iter yields a non empty string or the iterator is exausted. > > > > > > Argh. You're right. I forgot about that bit. It has been a few too > > > many years since I worked on the spec. :) > > > >The actual wording of the PEP does though suggest that if one calls > >write() returned from start_response() that one would flush headers. > >Ie., the requirement for a non-empty string is really only mentioned > >in reference to value returned from iterable and not in relation to > >empty data string passed to write(). > > > >I am not sure I understand the importance of being strict and not > >flushing headers until the first non-empty content data block. Was > >there a specific reasoning or use case behind saying that? > > The idea was to allow an application to change its mind about the > headers until it had committed to writing data. That is, to allow > the application to do error handling for as long as possible before > the server has to do it. But once you have called start_response() you cant call it a second time to change the values so how could the application change its mind? If you are delaying calling start_response() in the first place it is a moot point as you cant be writing data until you do so. > For WSGI 2.0, I'm no longer concerned about it - in the common case, > the body will be a list or tuple containing a single string, so it > can't possibly raise an error. For anything more complex, well, you > were going to have to handle error conditions once you yielded some > body output anyway. 
> > Now that you're mentioning it, the "non-empty yield" requirement > seems pretty bogus, since it's not really possible for the app to > tell whether headers have been sent anyway; start_response() handles > that transparently. > > Only problem is that the PEP examples and wsgiref aren't written to > support doing it that way, so I don't think we can reasonably change > it in WSGI 1.0, and in 2.0 it won't even matter. Huh, change what in WSGI 1.0. As you seem to note the CGI example in the PEP does flush headers even if first data block was an empty string and quite likely that other implementations have copied from that and not implemented the WSGI specification as written. As to Apache mod_wsgi, if using Apache 1.3 it would flush headers if first data block output was empty where as in Apache 2.X it will only flush when first non empty data block is yielded, but also wouldn't flush if write() was being called. That in Apache 2.X it doesn't flush headers until first non empty data block is output wasn't by design, that is just how Apache works under the covers. So most likely no one probably gets it exactly right per spec, but in practice it probably doesn't matter anyway and isn't going to affect how anything works. Graham From pje at telecommunity.com Thu Oct 4 15:10:53 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 09:10:53 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.co m> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> Message-ID: <20071004130818.BFCE83A407A@sparrow.telecommunity.com> At 10:20 PM 10/4/2007 +1000, Graham Dumpleton wrote: >On 04/10/2007, Phillip J. Eby wrote: > > At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote: > > >On 04/10/2007, Phillip J. Eby wrote: > > > > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote: > > > > >There is a problem here: a WSGI gateway is not allowed to send headers > > > > >until the app_iter yields a non empty string or the iterator > is exausted. > > > > > > > > Argh. You're right. I forgot about that bit. It has been a few too > > > > many years since I worked on the spec. :) > > > > > >The actual wording of the PEP does though suggest that if one calls > > >write() returned from start_response() that one would flush headers. > > >Ie., the requirement for a non-empty string is really only mentioned > > >in reference to value returned from iterable and not in relation to > > >empty data string passed to write(). > > > > > >I am not sure I understand the importance of being strict and not > > >flushing headers until the first non-empty content data block. Was > > >there a specific reasoning or use case behind saying that? > > > > The idea was to allow an application to change its mind about the > > headers until it had committed to writing data. That is, to allow > > the application to do error handling for as long as possible before > > the server has to do it. 
> >But once you have called start_response() you cant call it a second >time to change the values You can, as long as you pass in the exception info -- because an exception is the only legitimate reason to change the values. > > For WSGI 2.0, I'm no longer concerned about it - in the common case, > > the body will be a list or tuple containing a single string, so it > > can't possibly raise an error. For anything more complex, well, you > > were going to have to handle error conditions once you yielded some > > body output anyway. > > > > Now that you're mentioning it, the "non-empty yield" requirement > > seems pretty bogus, since it's not really possible for the app to > > tell whether headers have been sent anyway; start_response() handles > > that transparently. > > > > Only problem is that the PEP examples and wsgiref aren't written to > > support doing it that way, so I don't think we can reasonably change > > it in WSGI 1.0, and in 2.0 it won't even matter. > >Huh, change what in WSGI 1.0. As you seem to note the CGI example in >the PEP does flush headers even if first data block was an empty >string Actually, the PEP example skips empty strings yielded by the app_iter. wsgiref.handlers, OTOH, doesn't do this, now that I've checked it. >and quite likely that other implementations have copied from >that and not implemented the WSGI specification as written. Correct WSGI 1.0 implementations are unfortunately rare. Even wsgiref gets it wrong. :( >So most likely no one probably gets it exactly right per spec, No kidding! >but in >practice it probably doesn't matter anyway and isn't going to affect >how anything works. Yep, but another argument in favor of getting rid of as much statefulness from the protocol as we can. Making the status and headers part of the return value entirely eliminates the question of when they're going to get written, and whether they can be changed. 
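
For comparison, the WSGI 1.0 rule mentioned above (a second start_response() call is only legitimate when it carries exception info) might be sketched like this. The Response class and its attribute names are hypothetical and exist only for illustration; the re-raise-if-headers-already-sent behaviour follows PEP 333's description of exc_info, and bytes are used as in Python 3.

```python
import sys


class Response:
    """Toy per-request state; a sketch of the start_response rules."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False  # a real gateway sets this when it writes

    def start_response(self, status, headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change anything: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a traceback reference cycle
        elif self.headers is not None:
            raise AssertionError("start_response() called twice without exc_info")
        self.status, self.headers = status, list(headers)


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    try:
        raise ValueError("failed while building the body")
    except ValueError:
        # Headers are not on the wire yet, so the app may replace them.
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")], sys.exc_info())
        return [b"error\n"]
```

Returning the status and headers as part of the result, as proposed for WSGI 2.0, removes the need for this state tracking altogether.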
(As a side benefit, making the return a 3-tuple makes it impossible to
write a WSGI app using a single generator -- thereby discouraging
people from using 'yield' like it was a CGI "print".)

From manlio_perillo at libero.it Thu Oct 4 15:47:06 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 15:47:06 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4704222D.30208@colorstudy.com>
References: <4704222D.30208@colorstudy.com>
Message-ID: <4704EEDA.1010800@libero.it>

Ian Bicking wrote:
> PJE wants to talk about WSGI 2. That's cool; I remind everyone that
> there's a page to bring up issues you might want to discuss for 2.0:
> http://wsgi.org/wsgi/WSGI_2.0
>
> Feel free to add to, or discuss, issues on that page...
>

I'll write my ideas here:

1) start_response should no longer return a write callable.
   I don't know how many applications use it, but I think that
   I can't implement it in a conforming way for nginx mod_wsgi,
   so I will not implement it.

2) start_response should no longer accept an exc_info parameter.
   I don't know how many applications use it, but I think that
   WSGI applications should not change their mind.
   They should delay calling start_response until they are able
   to produce a "final" response.

3) start_response should accept, as an optional parameter, a
   flush argument.
   flush defaults to False, and when it is True, the WSGI gateway
   must write the headers as soon as start_response is called.

4) the environ dictionary should have a new WSGI-defined variable:
   wsgi.asynchronous.
   This value should evaluate to true when the server is asynchronous,
   that is, the WSGI application is executed in the main process loop
   of the server and the WSGI application can be paused after it yields
   some data.
5) clarify some points in the WSGI 1.0 spec, as discussed in the latest emails > Ian Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 15:53:04 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 15:53:04 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004115207.65C463A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> Message-ID: <4704F040.10105@libero.it> Phillip J. Eby ha scritto: > At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > [...] >> >> There is a problem here: a WSGI gateway is not allowed to send headers >> >> until the app_iter yields a non empty string or the iterator is >> exausted. >> > >> > Argh. You're right. I forgot about that bit. It has been a few too >> > many years since I worked on the spec. :) >> > >> >> 07-Dec-2003! >> And yet it seems that WSGI is not pervasively used. > > What do you mean? Can you name a popular Python web framework or > library that doesn't either use or support WSGI? > Django, as an example, uses WSGI "only as a backend". Django design is not based on WSGI, it is WSGI that is adapted for Django. An interesting example: to add support for CGI, it seems that the preferred method is to add a direct Django adapter for CGI, instead of using a WSGI adatper for CGI. > >> > Still, this is yet another example of why WSGI 2.0 is a big improvement >> > in simplicity. So I still would rather see more effort put into >> getting >> > WSGI 2.0 written and into widespread use, than creating niche >> extensions >> > to 1.0. >> >> >> My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm >> removing the limitation that the app_iter must yield only one item. > > Eh? 
I don't understand what you mean by "app_iter must yield only one > item". return '200 OK', [('Content-Type', 'text/plain')], ['a', 'b'] is not allowed. The response object can be a generic iterator, however. > In WSGI 2.0 the application return signature is a three-item > tuple, the third item of which is a WSGI 1.0 response object. > > >> However there is a problem with WSGI 2.0. >> >> Suppose that I execute an asynchronous HTTP request to obtain some data >> from a remote server. >> >> I can use the yet to be implemented wsgi.pause_output extension for >> this, or an extension for interfacing with nginx subrequest API. > > That won't be possible in WSGI 2.0 - it's a purely synchronous API. This is the reason why I don't like WSGI 2.0 :). However I have to admit that developing a full asynchronous application is not easy, notably when we have to interact with a database and a transaction. It is really so hard to implement WSGI 1.0 and to write middlewares for it? Is this really causing problems for WSGI adoption? I think that WSGI 2.0 should simply correct some problems in WSGI 1.0, and clarify some points, since now we have a WSGI implementation for Apache and Nginx. > You > can pause body output by yielding empty strings, but you need to have > already decided on your headers. > And this will make asynchronous applications not really useful, IMHO... But here I will say more once I'll implement some asynchronous extensions for nginx mod_wsgi. It's very unfortunate that the WSGI implementation in Twisted just uses threads instead of doing some experimentation. > [...] 
Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 16:10:39 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 16:10:39 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> Message-ID: <4704F45F.8020301@libero.it> Graham Dumpleton ha scritto: > [...] >> The idea was to allow an application to change its mind about the >> headers until it had committed to writing data. That is, to allow >> the application to do error handling for as long as possible before >> the server has to do it. > > But once you have called start_response() you cant call it a second > time to change the values so how could the application change its > mind? In my implementation of WSGI for nginx, start_response setups the headers on the request object, but calls ngx_http_send_header only when the first not empty string is yielded. This means that if an error occurs, the "old" headers are kept in the response (and sent to the client); nginx will simply change the status code to '500 INTERNAL ERROR'. A solution can be to copy the headers in a temporary request object, but I don't know if this is possible. Another solution is to setup the headers and call send_headers at the same time, but in this way it is no more possible to raise an exception when the application calls start_response with incorrect headers. If I'm right this is the solution used by Apache mod_wsgi. [...] Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 16:29:27 2007 From: pje at telecommunity.com (Phillip J. 
Eby)
Date: Thu, 04 Oct 2007 10:29:27 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4704EEDA.1010800@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
Message-ID: <20071004142648.D2AFB3A407A@sparrow.telecommunity.com>

At 03:47 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Ian Bicking wrote:
> > PJE wants to talk about WSGI 2. That's cool; I remind everyone that
> > there's a page to bring up issues you might want to discuss for 2.0:
> > http://wsgi.org/wsgi/WSGI_2.0
> >
> > Feel free to add to, or discuss, issues on that page...
> >
>
>I'll write my ideas here:
>1) start_response should no longer return a write callable.
>    I don't know how many applications use it, but I think that
>    I can't implement it in a conforming way for nginx mod_wsgi,
>    so I will not implement it.
>
>2) start_response should no longer accept an exc_info parameter.
>    I don't know how many applications use it, but I think that
>    WSGI applications should not change their mind.
>    They should delay calling start_response until they are able
>    to produce a "final" response.
>
>3) start_response should accept, as an optional parameter, a
>    flush argument.
>    flush defaults to False, and when it is True, the WSGI gateway
>    must write the headers as soon as start_response is called.

WSGI 2.0 does not have a start_response() callable in the first
place, so none of these apply. In WSGI 2.0, an application looks like this:

     def an_app(environ):
         return "200 OK", [('content-type', 'text/plain')], ["Hello, world!"]

i.e., no start_response(), no write(), no statefulness at all. It
just returns a tuple of (status, headers, iterable), where all three
are defined by the WSGI 1.0 spec. The third item in the tuple is a
WSGI 1.0 app_iter, so it can be a generator, have a close() method, etc.
Here's a WSGI 1 middleware application that converts a WSGI 2
application to WSGI 1:

     def wsgi_1_app(environ, start_response):
         status, headers, body = wsgi_2_app(environ)
         start_response(status, headers)
         return body

In other words, WSGI 2 is basically WSGI 1 with start_response() and
write() taken out.

>4) the environ dictionary should have a new WSGI-defined variable:
>    wsgi.asynchronous.
>    This value should evaluate to true when the server is asynchronous,
>    that is, the WSGI application is executed in the main process loop
>    of the server and the WSGI application can be paused after it yields
>    some data.

It's always the case that a WSGI application can be paused after it
yields data, even in WSGI 1.0.

From pje at telecommunity.com Thu Oct 4 16:37:58 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 10:37:58 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4704F040.10105@libero.it>
References: <4703ADE1.5040507@libero.it>
 <20071003192817.3014C3A407A@sparrow.telecommunity.com>
 <4703F2E1.9050402@libero.it>
 <20071003230812.7A7F63A407A@sparrow.telecommunity.com>
 <4704AAE4.1010708@libero.it>
 <20071004115207.65C463A407A@sparrow.telecommunity.com>
 <4704F040.10105@libero.it>
Message-ID: <20071004143521.58AE53A407A@sparrow.telecommunity.com>

At 03:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote:
> >> Phillip J. Eby wrote:
> >> > [...]
> >> >> There is a problem here: a WSGI gateway is not allowed to send headers
> >> >> until the app_iter yields a non-empty string or the iterator is exhausted.
> >> >
> >> > Argh. You're right. I forgot about that bit. It has been a few too
> >> > many years since I worked on the spec. :)
> >> >
> >>
> >> 07-Dec-2003!
> >> And yet it seems that WSGI is not pervasively used.
> >
> > What do you mean? Can you name a popular Python web framework or
> > library that doesn't either use or support WSGI?
> > > >Django, as an example, uses WSGI "only as a backend". That's still WSGI *support*. >Django design is not based on WSGI, it is WSGI that is adapted for Django. Yep - which is why we need WSGI 2. WSGI 1 achieved all its goals *except* for being easy to write middleware and build frameworks on it. It should be easier to use WSGI than to not use it. > > That won't be possible in WSGI 2.0 - it's a purely synchronous API. > >This is the reason why I don't like WSGI 2.0 :). > >However I have to admit that developing a full asynchronous application >is not easy, notably when we have to interact with a database and a >transaction. Right - in practice, there is not enough of a common async API for Python to make it practical to implement asynchronousness in WSGI itself. At least, in the last three years nobody has made a practical proposal for it. In practice, if you want to write a fully-async web app you must use Twisted or a similar framework and commit to using its API. You can of course still use WSGI components, but your application will not be able to run on a server that doesn't provide your async framework's API. >It is really so hard to implement WSGI 1.0 and to write middlewares for it? Absolutely. Most of the time I see someone post example middleware code, it is not WSGI compliant in some fashion. >I think that WSGI 2.0 should simply correct some problems in WSGI 1.0, The single biggest problem in WSGI 1.0 is start_response() and write(). They were hacks to support legacy applications and frameworks. >It's very unfortunate that the WSGI implementation in Twisted just uses >threads instead of doing some experimentation. You're making the assumption that no experimentation was done. Check the Web-SIG archives from three years ago and see the discussions about async APIs. From pje at telecommunity.com Thu Oct 4 16:44:15 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 10:44:15 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <4704F45F.8020301@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <4704F45F.8020301@libero.it> Message-ID: <20071004144136.DC6AD3A407B@sparrow.telecommunity.com> At 04:10 PM 10/4/2007 +0200, Manlio Perillo wrote: >Graham Dumpleton ha scritto: > > [...] > >> The idea was to allow an application to change its mind about the > >> headers until it had committed to writing data. That is, to allow > >> the application to do error handling for as long as possible before > >> the server has to do it. > > > > But once you have called start_response() you cant call it a second > > time to change the values so how could the application change its > > mind? > >In my implementation of WSGI for nginx, start_response setups the >headers on the request object, but calls ngx_http_send_header only when >the first not empty string is yielded. > >This means that if an error occurs, the "old" headers are kept in the >response (and sent to the client); nginx will simply change the status >code to '500 INTERNAL ERROR'. 
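The behaviour quoted above (headers are recorded by start_response but only sent once the first non-empty string is yielded) can be sketched as follows, together with the exc_info rule from PEP 333. This is a toy illustration under stated assumptions, not the nginx mod_wsgi code: `Gateway`, its `out` list (standing in for the client connection), and the sample `app` are all hypothetical, the write() callable is omitted, and Python 3 byte strings are assumed.

```python
import sys


class Gateway:
    """Toy gateway: holds headers back until the first non-empty chunk."""

    def __init__(self):
        self.headers_set = None    # (status, headers) last chosen by the app
        self.headers_sent = False
        self.out = []              # stands in for the client connection

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None    # avoid keeping a traceback reference cycle
        elif self.headers_set is not None:
            raise AssertionError("Headers already set!")
        self.headers_set = (status, list(headers))

    def run(self, app, environ):
        for chunk in app(environ, self.start_response):
            if not chunk:
                continue           # empty strings never commit the headers
            if not self.headers_sent:
                self.out.append(self.headers_set)
                self.headers_sent = True
            self.out.append(chunk)
        if not self.headers_sent:  # iterator exhausted without any body
            self.out.append(self.headers_set)
            self.headers_sent = True
        return self.out


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b""                      # nothing has been committed yet
    try:
        raise ValueError("late failure")
    except ValueError:
        # Permitted because the headers have not been sent yet.
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")], sys.exc_info())
        yield b"error"


output = Gateway().run(app, {})
```

In this sketch the application replaces its "200 OK" with a 500 response before anything reaches the client; had a non-empty chunk already been written, the second start_response call would re-raise the original exception instead.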
It's not clear to me from this statement whether you're supporting the exc_info argument as described here: http://www.python.org/dev/peps/pep-0333/#the-start-response-callable and here: http://www.python.org/dev/peps/pep-0333/#error-handling From manlio_perillo at libero.it Thu Oct 4 16:48:18 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 16:48:18 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> Message-ID: <4704FD32.9020604@libero.it> Phillip J. Eby ha scritto: > [...] > > WSGI 2.0 does not have a start_response() callable in the first place, > so none of these apply. > I thought that the current WSGI 2.0 draft was only, indeed, a draft. > > [...] >> 4) the environ dictionary should have a new WSGI-defined variable: >> wsgi.asynchronous. >> This value should evaluate to true when the server is asynchonous, >> that is, the WSGI application is executed in the main process loop >> of the server and the WSGI application can be paused after it yields >> some data. > > It's always the case that a WSGI application can be paused after it > yields data, even in WSGI 1.0. I was not aware of this. It may cause some problems to a unaware WSGI application the fact that a new "handler" is started "interleaved" with the previous ones. 
Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 17:00:35 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 17:00:35 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004143521.58AE53A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> Message-ID: <47050013.3070009@libero.it> Phillip J. Eby ha scritto: > [...] > >> However I have to admit that developing a full asynchronous application >> is not easy, notably when we have to interact with a database and a >> transaction. > > Right - in practice, there is not enough of a common async API for > Python to make it practical to implement asynchronousness in WSGI > itself. At least, in the last three years nobody has made a practical > proposal for it. In practice, if you want to write a fully-async web > app you must use Twisted or a similar framework and commit to using its > API. I want to add asynchronous API support to nginx mod_wsgi because I *want* to use a more agile web server for my applications, using Twisted only when I need an enterprise environment! > You can of course still use WSGI components, but your application > will not be able to run on a server that doesn't provide your async > framework's API. > That's not a problem. Asynchronous support will be available in nginx mod_wsgi and in Twisted (if I found the time to write an alternative implementation of the WSGI support, but this is not a priority for me). > >> It is really so hard to implement WSGI 1.0 and to write middlewares >> for it? > > Absolutely. Most of the time I see someone post example middleware > code, it is not WSGI compliant in some fashion. 
> Your are making a critical decision here. You are lowering the level of WSGI to match the level of average WSGI middlewares programmers. This can have disastrous conseguences if Python will gain a large user base in the future (and, of course, with a large user base, the majority of the users will have a low profile). > >> It's very unfortunate that the WSGI implementation in Twisted just uses >> threads instead of doing some experimentation. > > You're making the assumption that no experimentation was done. Check > the Web-SIG archives from three years ago and see the discussions about > async APIs. No. I have read a lot of archived messages, and all I have seen are *discussions* about asynchronous extensions, but no working implementations. Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 17:02:29 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 17:02:29 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004144136.DC6AD3A407B@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <4704F45F.8020301@libero.it> <20071004144136.DC6AD3A407B@sparrow.telecommunity.com> Message-ID: <47050085.4070502@libero.it> Phillip J. Eby ha scritto: > At 04:10 PM 10/4/2007 +0200, Manlio Perillo wrote: >> Graham Dumpleton ha scritto: >> > [...] >> >> The idea was to allow an application to change its mind about the >> >> headers until it had committed to writing data. That is, to allow >> >> the application to do error handling for as long as possible before >> >> the server has to do it. 
>> > >> > But once you have called start_response() you cant call it a second >> > time to change the values so how could the application change its >> > mind? >> >> In my implementation of WSGI for nginx, start_response setups the >> headers on the request object, but calls ngx_http_send_header only when >> the first not empty string is yielded. >> >> This means that if an error occurs, the "old" headers are kept in the >> response (and sent to the client); nginx will simply change the status >> code to '500 INTERNAL ERROR'. > > It's not clear to me from this statement whether you're supporting the > exc_info argument as described here: > No, since the current nginx mod_wsgi implementation, as I have already written, only supports the WSGI 2.0 draft. Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 17:40:08 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Oct 2007 11:40:08 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4704FD32.9020604@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> Message-ID: <20071004153734.1DFA33A407A@sparrow.telecommunity.com> At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > [...] > > > > WSGI 2.0 does not have a start_response() callable in the first place, > > so none of these apply. > > > >I thought that the current WSGI 2.0 draft was only, indeed, a draft. That's correct. But eliminating start_response() and write() is really the main point of *having* a WSGI 2.0. > > It's always the case that a WSGI application can be paused after it > > yields data, even in WSGI 1.0. > >I was not aware of this. >It may cause some problems to a unaware WSGI application the fact that a >new "handler" is started "interleaved" with the previous ones. It may... 
but the only applications that should be yielding anything are ones that are sending large files, doing server push, or explicitly *desire* to be interleaved in such fashion. If your app isn't in one of those categories, you should just be yielding a single string to begin with. From pje at telecommunity.com Thu Oct 4 17:55:08 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Oct 2007 11:55:08 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <47050013.3070009@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> <47050013.3070009@libero.it> Message-ID: <20071004155229.3D2303A407B@sparrow.telecommunity.com> At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote: >Your are making a critical decision here. >You are lowering the level of WSGI to match the level of average WSGI >middlewares programmers. No, we're just getting rid of legacy cruft that's hard to support correctly. There's a big difference. >This can have disastrous conseguences if Python will gain a large user >base in the future (and, of course, with a large user base, the majority >of the users will have a low profile). This seems to be arguing the opposite: making WSGI simpler is a *good* thing if there will be a larger user base. > >> It's very unfortunate that the WSGI implementation in Twisted just uses > >> threads instead of doing some experimentation. > > > > You're making the assumption that no experimentation was done. Check > > the Web-SIG archives from three years ago and see the discussions about > > async APIs. > >No. >I have read a lot of archived messages, and all I have seen are >*discussions* about asynchronous extensions, but no working implementations. 
Because nobody came up with anything particularly useful. While it's possible to have generic extensions for pausing and resuming iteration, those aren't useful enough to write a fully asynchronous application. You still have to block and/or poll in order to do anything else. Meanwhile, since applications *can* block, they have to be in a separate thread or process from an async server anyway. So all that asynchrony does is free up the thread or process to handle something else... which is wasted if the app is not in an async server. So, barring a radical alteration to the WSGI programming model, asynchronous programming is a bit of a dead-end. To do async right, you really need a CPS (continuation-passing style) API, *and* you also need async APIs for whatever the app is going to *do*. In other words, the absence of standard Python APIs for asynchronous I/O (be it socket, database, or otherwise) make it moot to add an async API to WSGI, since in practice the application will be locked-in to whatever async I/O API it uses. From manlio_perillo at libero.it Thu Oct 4 17:54:47 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 17:54:47 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004153734.1DFA33A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> Message-ID: <47050CC7.9030500@libero.it> Phillip J. Eby ha scritto: > At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > [...] >> > >> > WSGI 2.0 does not have a start_response() callable in the first place, >> > so none of these apply. >> > >> >> I thought that the current WSGI 2.0 draft was only, indeed, a draft. > > That's correct. But eliminating start_response() and write() is really > the main point of *having* a WSGI 2.0. 
> For me, what's needs to be elimitated is write() and the exc_info in start_response. > >> > It's always the case that a WSGI application can be paused after it >> > yields data, even in WSGI 1.0. >> >> I was not aware of this. >> It may cause some problems to a unaware WSGI application the fact that a >> new "handler" is started "interleaved" with the previous ones. > > It may... but the only applications that should be yielding anything are > ones that are sending large files, doing server push, or explicitly > *desire* to be interleaved in such fashion. > But they have no way to know if the server supports this, and existing WSGI implementations does not interleave the iteration, as far as I know. > If your app isn't in one of those categories, you should just be > yielding a single string to begin with. Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 18:07:12 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 18:07:12 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004155229.3D2303A407B@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> <47050013.3070009@libero.it> <20071004155229.3D2303A407B@sparrow.telecommunity.com> Message-ID: <47050FB0.6030202@libero.it> Phillip J. Eby ha scritto: > [...] >> I have read a lot of archived messages, and all I have seen are >> *discussions* about asynchronous extensions, but no working >> implementations. > > Because nobody came up with anything particularly useful. While it's > possible to have generic extensions for pausing and resuming iteration, > those aren't useful enough to write a fully asynchronous application. 
> You still have to block and/or poll in order to do anything else. > Meanwhile, since applications *can* block, they have to be in a separate > thread or process from an async server anyway. So all that asynchrony > does is free up the thread or process to handle something else... which > is wasted if the app is not in an async server. > For nginx mod_wsgi I'm planning to add support to blocking application,executing them in a thread (*but* there will be only one thread per process, and the entire result will be buffered). Threaded execution will be disabled by default, and can be enabled using an option. To add support to asynchronous WSGI application, I will try to implement the pause_output extension and, more important, I will expose the nginx event API to the WSGI application, writing an extension module. The API will be low level, but once this API will be implemented, it should be possibile to implement a common and standardized API, that will works with nginx mod_wsgi and Twisted. > [...] Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 18:20:50 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu, 04 Oct 2007 12:20:50 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47050CC7.9030500@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> Message-ID: <20071004161810.060183A407A@sparrow.telecommunity.com> At 05:54 PM 10/4/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote: > >> Phillip J. Eby ha scritto: > >> > It's always the case that a WSGI application can be paused after it > >> > yields data, even in WSGI 1.0. > >> > >> I was not aware of this. 
> >> It may cause some problems to a unaware WSGI application the fact that a > >> new "handler" is started "interleaved" with the previous ones. > > > > It may... but the only applications that should be yielding anything are > > ones that are sending large files, doing server push, or explicitly > > *desire* to be interleaved in such fashion. > > > >But they have no way to know if the server supports this, If it's a WSGI-compliant server, it supports this by definition. It's just that synchronous servers don't pause before requesting the next iteration. > and existing >WSGI implementations does not interleave the iteration, as far as I know. Nothing in the spec stops them from doing so - indeed, they're *encouraged* to do so: http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries """This requirement ensures that asynchronous applications and servers can conspire to reduce the number of threads that are required to run a given number of application instances simultaneously.""" Notice that the only way this sentence works is if you are interleaving applications. That being said, the PEP really needs an explicit discussion of the execution model. From pje at telecommunity.com Thu Oct 4 18:28:56 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 12:28:56 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <47050FB0.6030202@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> <47050013.3070009@libero.it> <20071004155229.3D2303A407B@sparrow.telecommunity.com> <47050FB0.6030202@libero.it> Message-ID: <20071004162618.433C83A407A@sparrow.telecommunity.com> At 06:07 PM 10/4/2007 +0200, Manlio Perillo wrote: >For nginx mod_wsgi I'm planning to add support to blocking >application,executing them in a thread (*but* there will be only one >thread per process, and the entire result will be buffered). > >Threaded execution will be disabled by default, and can be enabled using >an option. > >To add support to asynchronous WSGI application, I will try to implement >the pause_output extension and, more important, I will expose the nginx >event API to the WSGI application, writing an extension module. > >The API will be low level, but once this API will be implemented, it >should be possibile to implement a common and standardized API, that >will works with nginx mod_wsgi and Twisted. Will this API support async database connections? Async HTTP client operations? If not, then all it would be good for is waiting for the HTTP input stream. And if so, then what's the point? 
From chrism at plope.com Thu Oct 4 17:55:44 2007 From: chrism at plope.com (Chris McDonough) Date: Thu, 4 Oct 2007 11:55:44 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004155229.3D2303A407B@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> <47050013.3070009@libero.it> <20071004155229.3D2303A407B@sparrow.telecommunity.com> Message-ID: <15F39F0E-5D35-4D0C-AD2A-2B7AAEA35A98@plope.com> On Oct 4, 2007, at 11:55 AM, Phillip J. Eby wrote: > At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote: >> Your are making a critical decision here. >> You are lowering the level of WSGI to match the level of average WSGI >> middlewares programmers. > > No, we're just getting rid of legacy cruft that's hard to support > correctly. There's a big difference. Getting the start_response dance down and understanding how it plays with middleware is *hard*. Even if we called it something other than WSGI 2.0 (which I don't think we should, because it really is an evolution), returning the three-tuple is the right thing to do. - C From manlio_perillo at libero.it Thu Oct 4 18:37:04 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 18:37:04 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004161810.060183A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> Message-ID: <470516B0.9010605@libero.it> Phillip J. Eby ha scritto: > [...] 
>> and existing >> WSGI implementations does not interleave the iteration, as far as I know. > > Nothing in the spec stops them from doing so - indeed, they're > *encouraged* to do so: > > http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries > > > """This requirement ensures that asynchronous applications and servers > can conspire to reduce the number of threads that are required to run a > given number of application instances simultaneously.""" > > Notice that the only way this sentence works is if you are interleaving > applications. > What "normal" HTTP servers do is to "pause" the iteration, until the entire buffer has been sent to the client. They can do this, since they run in a dedicated thread or process. With nginx this is not true. nginx mod_wsgi will pause the iteration associated with a given request, but will start a new iteration as soon as a new request arrives, and this in the *same* thread. To make an example (not tested), suppose that a WSGI application keeps a global counter (as a thread specific data). When a new request arrives, the counter is reset to 0, and its value is incremented for every iteration. With all the existing WSGI implementation (as far as I know), we always know the current value of the counter: it will start at 0, reach the number of iterations, and then will start at 0 again. With nginx mod_wsgi this is not true, when a WSGI application set the counter value to, say, 6, and a new request arrives, the application associated with the previous request will now see the value 0, not 6, when it will be unpaused. > That being said, the PEP really needs an explicit discussion of the > execution model. Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 18:56:12 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 12:56:12 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470516B0.9010605@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> Message-ID: <20071004165333.4D5353A407A@sparrow.telecommunity.com> At 06:37 PM 10/4/2007 +0200, Manlio Perillo wrote: >To make an example (not tested), suppose that a WSGI application keeps a >global counter (as a thread specific data). > >When a new request arrives, the counter is reset to 0, and its value is >incremented for every iteration. > >With all the existing WSGI implementation (as far as I know), we always >know the current value of the counter: it will start at 0, reach the >number of iterations, and then will start at 0 again. So? An application that does this is obviously broken. Again, remember that the WSGI spec encourages interleaving, so any multi-threaded server is well within its rights to do the same thing. There is nothing in WSGI that says multiple simultaneous requests cannot be run in the same thread. Therefore, nothing is guaranteed about what happens to global or thread-local resources while the application (or its returned iterable) is not actually executing. 
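The counter scenario discussed above can be reproduced without any server at all. The following is a minimal sketch in which two generators stand in for the response iterables of two requests handled in the same thread; `broken_body`, `safe_body`, and `interleave` are hypothetical names, and the hand-written scheduling only imitates what an interleaving server might do.

```python
counter = 0  # module-level state shared by every request (the fragile design)


def broken_body(environ):
    global counter
    counter = 0                        # "reset on each new request"
    for _ in range(3):
        counter += 1
        yield str(counter).encode()


def safe_body(environ):
    count = 0                          # per-request state lives in this frame
    for _ in range(3):
        count += 1
        yield str(count).encode()


def interleave(body_factory):
    """Drive two requests in one thread, switching between them mid-response."""
    first, second = body_factory({}), body_factory({})
    seen_by_first = [next(first), next(first)]  # request 1 runs two steps
    next(second)                                # request 2 starts in between
    seen_by_first += list(first)                # request 1 is resumed
    list(second)                                # let request 2 finish
    return seen_by_first


broken = interleave(broken_body)
safe = interleave(safe_body)
```

With the shared global, the first request observes a repeated value after the second request resets the counter; keeping the state local to the generator gives each request its own count regardless of how iterations are scheduled.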
From manlio_perillo at libero.it Thu Oct 4 18:55:45 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 18:55:45 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004162618.433C83A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <4704AAE4.1010708@libero.it> <20071004115207.65C463A407A@sparrow.telecommunity.com> <4704F040.10105@libero.it> <20071004143521.58AE53A407A@sparrow.telecommunity.com> <47050013.3070009@libero.it> <20071004155229.3D2303A407B@sparrow.telecommunity.com> <47050FB0.6030202@libero.it> <20071004162618.433C83A407A@sparrow.telecommunity.com> Message-ID: <47051B11.2020204@libero.it> Phillip J. Eby ha scritto: > At 06:07 PM 10/4/2007 +0200, Manlio Perillo wrote: >> For nginx mod_wsgi I'm planning to add support to blocking >> application,executing them in a thread (*but* there will be only one >> thread per process, and the entire result will be buffered). >> >> Threaded execution will be disabled by default, and can be enabled using >> an option. >> >> To add support to asynchronous WSGI application, I will try to implement >> the pause_output extension and, more important, I will expose the nginx >> event API to the WSGI application, writing an extension module. >> >> The API will be low level, but once this API will be implemented, it >> should be possibile to implement a common and standardized API, that >> will works with nginx mod_wsgi and Twisted. > > Will this API support async database connections? No. Async database connections can be implemented using this API. 
Using this API we can, as an example, use the asynchronous API already implemented by psycopg2 (but not tested, since no one seems to be interested):

    import psycopg2
    import ngx_reactor

    def handler(event):
        # Called when the connection's file descriptor becomes readable
        if curs.isready():
            resume()

    conn = psycopg2.connect(database='test')
    curs = conn.cursor()

    fileno = curs.fileno()
    event = ngx_reactor.create_event(fileno, handler, ...)
    ngx_reactor.add_event(event, NGX_READ_EVENT)

    resume = environ['wsgi.pause_output']()
    curs.execute("SELECT * from sleep(%s, 1)", (delay,), async=1)
    yield ''
    # Now we have the full response, and we can proceed as in a synchronous
    # application

The real problem here is that we cannot execute new queries until the current query terminates, so we need to implement a query queue. Another big problem arises when we want to use a transaction, since we need to execute more than one query.

> Async HTTP client
> operations?

Again, this will be a low-level API. However, I think that it should be possible to write an "emulation" of a Twisted reactor, so we can use the protocols implemented in Twisted (but this is a *big* challenge, and I'm not really interested, since if I need to use Twisted protocols, then I will use Twisted Web).

> If not, then all it would be good for is waiting for the
> HTTP input stream.

The current implementation of nginx mod_wsgi already waits until the full request body has been read by Nginx (and the input stream object is an instance of cStringIO or File object, depending on the size of the request body and the value of the client_body_buffer_size option). Nginx does not yet implement input filters.

> And if so, then what's the point?
Regards Manlio Perillo From manlio_perillo at libero.it Thu Oct 4 18:58:50 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 18:58:50 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004165333.4D5353A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> Message-ID: <47051BCA.7090709@libero.it> Phillip J. Eby ha scritto: > At 06:37 PM 10/4/2007 +0200, Manlio Perillo wrote: >> To make an example (not tested), suppose that a WSGI application keeps a >> global counter (as a thread specific data). >> >> When a new request arrives, the counter is reset to 0, and its value is >> incremented for every iteration. >> >> With all the existing WSGI implementation (as far as I know), we always >> know the current value of the counter: it will start at 0, reach the >> number of iterations, and then will start at 0 again. > > So? An application that does this is obviously broken. Again, remember > that the WSGI spec encourages interleaving, so any multi-threaded server > is well within its rights to do the same thing. > > There is nothing in WSGI that says multiple simultaneous requests cannot > be run in the same thread. Therefore, nothing is guaranteed about what > happens to global or thread-local resources while the application (or > its returned iterable) is not actually executing. Ok. But why you are against adding a new environ value (not necessary named wsgi.asynchronous), that will explicitly state if the WSGI server will interleave the WSGI application? Regards Manlio Perillo From pje at telecommunity.com Thu Oct 4 19:47:52 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 13:47:52 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47051BCA.7090709@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> Message-ID: <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: >But why you are against adding a new environ value (not necessary named >wsgi.asynchronous), that will explicitly state if the WSGI server will >interleave the WSGI application? Why do you think it's useful? From manlio_perillo at libero.it Thu Oct 4 19:53:45 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Thu, 04 Oct 2007 19:53:45 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> Message-ID: <470528A9.3050108@libero.it> Phillip J. Eby ha scritto: > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: >> But why you are against adding a new environ value (not necessary named >> wsgi.asynchronous), that will explicitly state if the WSGI server will >> interleave the WSGI application? > > Why do you think it's useful? For the same reason you think wsgi.multiprocess is useful. 
It's informational; maybe it is not really useful, but it describes how the WSGI server works. Regards Manlio Perillo From MDiPierro at cti.depaul.edu Thu Oct 4 20:29:01 2007 From: MDiPierro at cti.depaul.edu (DiPierro, Massimo) Date: Thu, 4 Oct 2007 13:29:01 -0500 Subject: [Web-SIG] NOOO! Another web framework Message-ID: hello everybody... please do not shoot me! I know you don't think you need a new web framework but please give me the benefit of the doubt (I teach a class on Web Frameworks at DePaul University): http://mdp.cti.depaul.edu/examples Why? here are some unique features: 1) full web based development, deployment and management of applications, no more shell commands (unless you want them) 2) built-in ticketing system to report bugs to administrator (not to the users, ever) 3) can compile applications to byte-code for speed and distribution in closed source (some people want this) 4) 100% python (including template language). 5) no installation or configuration required. Just download and click. (includes python, web server, sqlite3, administrative interface and examples) 6) everything has a default: you write the model, you get an administrative interface; you write a controller, you get a generic view; etc. 7) The APIs are stable and there is no plan to change them. It shares with Django and TurboGears some features: model-view-controller design, form generators and validation, internationalization, ORM, although all code has been written from scratch.
Here is an example application, a CMS to manage groups (members, wikis, blogs, votes, minutes, documents): https://mdp.cti.depaul.edu/groups Massimo From graham.dumpleton at gmail.com Fri Oct 5 01:03:40 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 5 Oct 2007 09:03:40 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004130818.BFCE83A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> Message-ID: <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com> On 04/10/2007, Phillip J. Eby wrote: > >But once you have called start_response() you cant call it a second > >time to change the values > > You can, as long as you pass in the exception info -- because an > exception is the only legitimate reason to change the values. Okay, forgot about that case. Luckily my code appears in the main to do the correct thing, although I'll need to check a few corner cases as looks like the traceback I log when start_response() called with exception after data written is a bit wrong as doesn't identify the original exception type correctly. This may just be an issue with how I log exception details from C API. The yielding of empty strings prior to calling start_response() with exception details also gives me strife as appear not to return any response to client at all, ie., no headers or body. So, little bit of tweaking to do. > > > Only problem is that the PEP examples and wsgiref aren't written to > > > support doing it that way, so I don't think we can reasonably change > > > it in WSGI 1.0, and in 2.0 it won't even matter. 
> > > >Huh, change what in WSGI 1.0. As you seem to note the CGI example in > >the PEP does flush headers even if first data block was an empty > >string > > Actually, the PEP example skips empty strings yielded by the > app_iter. wsgiref.handlers, OTOH, doesn't do this, now that I've checked it. True again. I was only looking at the internals of write() and so missed that iteration would eliminate empty strings. > Yep, but another argument in favor of getting rid of as much > statefulness from the protocol as we can. Making the status and > headers part of the return value entirely eliminates the question of > when they're going to get written, and whether they can be changed. > > (As a side benefit, making the return a 3-tuple makes it impossible > to write a WSGI app using a single generator -- thereby discouraging > people from using 'yield' like it was a CGI "print".) Too early for me to be thinking straight and work it out for myself, but does this all help in making it simpler or more obvious what the cleanup requirements are for a generator. Ie., correct use of try/except/finally around yield and purpose of close() function. I've seen a number of people not get this correct in stuff and tried to correct them. Hopefully I have captured what should be done correctly in my document: http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode If I haven't please let me know. :-) Graham From pje at telecommunity.com Fri Oct 5 02:22:24 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu, 04 Oct 2007 20:22:24 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com> Message-ID: <20071005001945.39CD13A407A@sparrow.telecommunity.com> At 09:03 AM 10/5/2007 +1000, Graham Dumpleton wrote: >Too early for me to be thinking straight and work it out for myself, >but does this all help in making it simpler or more obvious what the >cleanup requirements are for a generator. Ie., correct use of >try/except/finally around yield and purpose of close() function. I've >seen a number of people not get this correct in stuff and tried to >correct them. Hopefully I have captured what should be done correctly >in my document: > > http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode That's fine, and none of it would change for WSGI 2.0, except minor details of what wraps what. Note, by the way, that as of Python 2.5, a generator can have try/finally and its close() method will be called when it finishes or is garbage collected. So an app_iter implemented as a generator under 2.5 can just use with: or try/finally to handle cleanup -- and that applies equally to WSGI 1 and 2. From pje at telecommunity.com Fri Oct 5 02:27:02 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu, 04 Oct 2007 20:27:02 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470528A9.3050108@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> Message-ID: <20071005002423.320413A407A@sparrow.telecommunity.com> At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: > >> But why you are against adding a new environ value (not necessary named > >> wsgi.asynchronous), that will explicitly state if the WSGI server will > >> interleave the WSGI application? > > > > Why do you think it's useful? > >For the same reason you think wsgi.multiprocess is useful. Actually, I don't think it's all that useful; IIRC, it was added as a compromise to the spec, to fend off a proposal for a more complex server-capabilities API. :) Also, there's an important difference between your proposed addition and the multiprocess/multithread flags, which is that there existed frameworks that could be ported to WSGI that only supported one model or the other. I.e., frameworks that could only run multi-threaded, or only multi-process. In other words, those flags were to support legacy frameworks detecting that they were in an incompatible hosting environment. However, IIUC, there is no such existing framework that could meaningfully use the flag you're proposing, that has any real chance of being portable to different WSGI environments. 
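[Editorial sketch, not part of the original thread.] The use of the multiprocess flag described above, a component detecting that it is in an incompatible hosting environment, might look roughly like the following. This is a minimal sketch, assuming the WSGI 1.0 calling convention from PEP 333 and current Python 3 (where response bodies are bytes). Only the `wsgi.multiprocess` environ key comes from the spec; the names `SingleProcessOnly`, `hello_app` and `call` are hypothetical and exist only for this illustration.

```python
class SingleProcessOnly:
    """Hypothetical middleware: refuse to run when the server reports
    that requests may be handled by more than one process."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # wsgi.multiprocess is defined by PEP 333: a true value means an
        # equivalent application object may be invoked by another process.
        if environ.get("wsgi.multiprocess", False):
            body = b"This component needs a single-process server.\n"
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
            )
            return [body]
        # Compatible environment: hand the request to the wrapped app.
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    """Hypothetical wrapped application used only to exercise the check."""
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"),
         ("Content-Length", str(len(body)))],
    )
    return [body]


def call(app, environ):
    """Tiny driver (an assumption, not a real server): run one request
    and return the status line and the joined body."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    chunks = app(environ, start_response)
    return seen["status"], b"".join(chunks)
```

A component that keeps per-process state between requests could wrap itself this way and fail loudly rather than misbehave; whether to return an error or silently disable the feature is a design choice the thread leaves open.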
From graham.dumpleton at gmail.com Fri Oct 5 02:31:13 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 5 Oct 2007 10:31:13 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071005001945.39CD13A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com> <20071005001945.39CD13A407A@sparrow.telecommunity.com> Message-ID: <88e286470710041731n2d0c115fr145f27adf2a55000@mail.gmail.com> On 05/10/2007, Phillip J. Eby wrote: > At 09:03 AM 10/5/2007 +1000, Graham Dumpleton wrote: > >Too early for me to be thinking straight and work it out for myself, > >but does this all help in making it simpler or more obvious what the > >cleanup requirements are for a generator. Ie., correct use of > >try/except/finally around yield and purpose of close() function. I've > >seen a number of people not get this correct in stuff and tried to > >correct them. Hopefully I have captured what should be done correctly > >in my document: > > > > http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode > > That's fine, and none of it would change for WSGI 2.0, except minor > details of what wraps what. > > Note, by the way, that as of Python 2.5, a generator can have > try/finally and its close() method will be called when it finishes or > is garbage collected. So an app_iter implemented as a generator > under 2.5 can just use with: or try/finally to handle cleanup -- and > that applies equally to WSGI 1 and 2. Yep, know about the Python 2.5 difference. 
Didn't want to talk about it though so that people would just use the way that would also work with older versions of Python. BTW, have been thinking about doing it for a long time, but truly wasn't sure that WSGI 2.0 would ever actually happen, but now that discussion is happening again I will add to Apache mod_wsgi a directive WSGIProtocolVersion which would allow experimental 2.0 implementation to be switched on for specific applications. Having this and perhaps other experimental implementations may help to flush out any issues when we start discussing details, especially as Apache imposes its own quirks that others tend not to have to deal with. Adding this support in should be quite trivial. Once that is done and the discussion about asynchronous implementations dies down, might initiate discussions about some other issues such as wsgi.input, end of input indicators and content length issues for streamed request content and mutating input filters etc. Graham From graham.dumpleton at gmail.com Fri Oct 5 02:41:09 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Fri, 5 Oct 2007 10:41:09 +1000 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005002423.320413A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> Message-ID: <88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com> On 05/10/2007, Phillip J. Eby wrote: > At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote: > >Phillip J. 
Eby ha scritto: > > > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: > > >> But why you are against adding a new environ value (not necessary named > > >> wsgi.asynchronous), that will explicitly state if the WSGI server will > > >> interleave the WSGI application? > > > > > > Why do you think it's useful? > > > >For the same reason you think wsgi.multiprocess is useful. > > Actually, I don't think it's all that useful; IIRC, it was added as a > compromise to the spec, to fend off a proposal for a more complex > server-capabilities API. :) > > Also, there's an important difference between your proposed addition > and the multiprocess/multithread flags, which is that there existed > frameworks that could be ported to WSGI that only supported one model > or the other. I.e., frameworks that could only run multi-threaded, > or only multi-process. FWIW, one example where the flags are useful is in WSGI components such as browser based debuggers such as EvalException as they could disable themselves or flag an error when used in a multiprocess web server where there would be no guarantee that a subsequent request would end up back at the same process. 
Graham From ianb at colorstudy.com Fri Oct 5 02:43:32 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 04 Oct 2007 20:43:32 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com> References: <4704222D.30208@colorstudy.com> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com> Message-ID: <470588B4.3060108@colorstudy.com> Graham Dumpleton wrote: > On 05/10/2007, Phillip J. Eby wrote: >> At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote: >>> Phillip J. Eby ha scritto: >>>> At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: >>>>> But why you are against adding a new environ value (not necessary named >>>>> wsgi.asynchronous), that will explicitly state if the WSGI server will >>>>> interleave the WSGI application? >>>> Why do you think it's useful? >>> For the same reason you think wsgi.multiprocess is useful. >> Actually, I don't think it's all that useful; IIRC, it was added as a >> compromise to the spec, to fend off a proposal for a more complex >> server-capabilities API. :) >> >> Also, there's an important difference between your proposed addition >> and the multiprocess/multithread flags, which is that there existed >> frameworks that could be ported to WSGI that only supported one model >> or the other. I.e., frameworks that could only run multi-threaded, >> or only multi-process. 
> > FWIW, one example where the flags are useful is in WSGI components > such as browser based debuggers such as EvalException as they could > disable themselves or flag an error when used in a multiprocess web > server where there would be no guarantee that a subsequent request > would end up back at the same process. Yeah, there's several things I pushed for in WSGI that I didn't really end up needing or wanting later. But wsgi.multiprocess and wsgi.multithread have been useful; certainly enough to warrant their simplicity. Ian From chris at simplistix.co.uk Fri Oct 5 09:04:59 2007 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 05 Oct 2007 08:04:59 +0100 Subject: [Web-SIG] NOOO! Another web framework In-Reply-To: References: Message-ID: <4705E21B.4050902@simplistix.co.uk> DiPierro, Massimo wrote: > here are some unique features: > 1) full web based development, deployment and management of applications, no more shell commands (unless you want them) Good. Zope seems to have moved away from this, which is a shame... > 2) built-in ticketing system to report bugs to administrator (not to the users, ever) Nice :-) (although the users do see some kind of page saying "sorry, something went wrong", right?) > 3) can compile applications to byte-code for speed and distribution in closed source (some people want this) You do know it takes about 2 minutes to turn a .pyc back into a .py, right? > 5) no installation or configuration required. Just download and click. (includes python, web server, sqlite3, administrative interface and examples) Cool, although you will need to cater for proper deployments if things go well...
cheers, Chris -- Simplistix - Content Management, Zope & Python Consulting - http://www.simplistix.co.uk From manlio_perillo at libero.it Fri Oct 5 12:36:32 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 12:36:32 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071004130818.BFCE83A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> Message-ID: <470613B0.8000101@libero.it> Phillip J. Eby wrote: > [...] > Yep, but another argument in favor of getting rid of as much > statefulness from the protocol as we can. Making the status and headers > part of the return value entirely eliminates the question of when > they're going to get written, and whether they can be changed. > > (As a side benefit, making the return a 3-tuple makes it impossible to > write a WSGI app using a single generator -- thereby discouraging people > from using 'yield' like it was a CGI "print".) > Wait, what do you mean by """As a side benefit, making the return a 3-tuple makes it impossible to write a WSGI app using a single generator"""? And what do you mean by """getting rid of as much statefulness from the protocol as we can"""?
Regards Manlio Perillo From manlio_perillo at libero.it Fri Oct 5 12:41:14 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 12:41:14 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005002423.320413A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> Message-ID: <470614CA.8000300@libero.it> Phillip J. Eby ha scritto: > At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote: >> >> But why you are against adding a new environ value (not necessary >> named >> >> wsgi.asynchronous), that will explicitly state if the WSGI server will >> >> interleave the WSGI application? >> > >> > Why do you think it's useful? >> >> For the same reason you think wsgi.multiprocess is useful. > > Actually, I don't think it's all that useful; IIRC, it was added as a > compromise to the spec, to fend off a proposal for a more complex > server-capabilities API. :) > Ok. > Also, there's an important difference between your proposed addition and > the multiprocess/multithread flags, which is that there existed > frameworks that could be ported to WSGI that only supported one model or > the other. I.e., frameworks that could only run multi-threaded, or only > multi-process. > > In other words, those flags were to support legacy frameworks detecting > that they were in an incompatible hosting environment. 
However, IIUC, > there is no such existing framework that could meaningfully use the flag > you're proposing, that has any real chance of being portable to > different WSGI environments. This is true, but I continue to think that it is worth adding that flag. Asynchronous support is available in Nginx mod_wsgi, and in the future someone can implement a WSGI gateway for lighttpd. Regards Manlio Perillo From pje at telecommunity.com Fri Oct 5 16:33:39 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 05 Oct 2007 10:33:39 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <470613B0.8000101@libero.it> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <470613B0.8000101@libero.it> Message-ID: <20071005143100.07AD63A407C@sparrow.telecommunity.com> At 12:36 PM 10/5/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > [...] > > Yep, but another argument in favor of getting rid of as much > > statefulness from the protocol as we can. Making the status and headers > > part of the return value entirely eliminates the question of when > > they're going to get written, and whether they can be changed. > > > > (As a side benefit, making the return a 3-tuple makes it impossible to > > write a WSGI app using a single generator -- thereby discouraging people > > from using 'yield' like it was a CGI "print".) > > > > >Wait, what do you mean by """As a side benefit, making the return a >3-tuple makes it impossible to write a WSGI app using a single generator"""? 
I mean that you can't write a WSGI 2.0 application using a single generator function, because it has to return a tuple, not an iterator. This will discourage people from thinking "yield" is a good way to build up their output, instead of using a StringIO or ''.join() on a list of strings. >And what do you mean by """getting rid of as much >statefulness from the protocol as we can"""? Most of WSGI 1.0's complexity comes from the sequence of operations - when you call start_response(), whether you can call it again, whether iteration is in progress, etc. WSGI 2.0 gives all the sequence control to the caller, so that there is no delicate dance of calls back and forth. This especially simplifies middleware that manipulates the output stream, because it doesn't need to wrap start_response() and write(). From pje at telecommunity.com Fri Oct 5 16:36:36 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 05 Oct 2007 10:36:36 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470614CA.8000300@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> Message-ID: <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > In other words, those flags were to support legacy frameworks detecting > > that they were in an incompatible hosting environment. 
However, IIUC, > > there is no such existing framework that could meaningfully use the flag > > you're proposing, that has any real chance of being portable to > > different WSGI environments. > >This is true, but I continue to think that it is worth adding that flag. >Asynchronous support is available in Nginx mod_wsgi, and in the future >someone can implement a WSGI gateway for lighttpd. Right now, the definition of the flag is not sufficiently defined for my taste. You have only proposed that it be set to indicate that interleaved execution is possible -- but it is *always* possible to have interleaved execution in WSGI 1.0, so the only reason to add the flag to WSGI 2.0 would be so a server could promise NOT to interleave execution. And what good is that? From roberto at dealmeida.net Fri Oct 5 16:57:43 2007 From: roberto at dealmeida.net (Rob De Almeida) Date: Fri, 05 Oct 2007 11:57:43 -0300 Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush) In-Reply-To: <20071005143100.07AD63A407C@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <470613B0.8000101@libero.it> <20071005143100.07AD63A407C@sparrow.telecommunity.com> Message-ID: <470650E7.4050809@dealmeida.net> Phillip J. Eby wrote: > I mean that you can't write a WSGI 2.0 application using a single > generator function, because it has to return a tuple, not an > iterator. This will discourage people from thinking "yield" is a > good way to build up their output, instead of using a StringIO or > ''.join() on a list of strings. Could you explain why using 'yield' is not recommended? 
Just curious, because I use it all the time. Thanks, --Rob From manlio_perillo at libero.it Fri Oct 5 17:14:02 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 17:14:02 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> Message-ID: <470654BA.9050100@libero.it> Phillip J. Eby ha scritto: > At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > In other words, those flags were to support legacy frameworks detecting >> > that they were in an incompatible hosting environment. However, IIUC, >> > there is no such existing framework that could meaningfully use the >> flag >> > you're proposing, that has any real chance of being portable to >> > different WSGI environments. >> >> This is true, but I continue to think that it is worth adding that flag. >> Asynchronous support is available in Nginx mod_wsgi, and in the future >> someone can implement a WSGI gateway for lighttpd. > > Right now, the definition of the flag is not sufficiently defined for my > taste. You have only proposed that it be set to indicate that > interleaved execution is possible -- but it is *always* possible to have > interleaved execution in WSGI 1.0, so the only reason to add the flag to > WSGI 2.0 would be so a server could promise NOT to interleave > execution. 
And what good is that? > Ok, here is a more useful definition. If wsgi.asynchronous evaluates to true, then the WSGI application *will* be executed into the server main process cycle and thus the application execution *will* be interleaved (since this is the only way to support multiple concurrent requests). Regards Manlio Perillo From ianb at colorstudy.com Fri Oct 5 17:16:10 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 05 Oct 2007 11:16:10 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470654BA.9050100@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> Message-ID: <4706553A.3080603@colorstudy.com> Manlio Perillo wrote: > Phillip J. Eby wrote: >> At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote: >>> Phillip J. Eby wrote: >>>> In other words, those flags were to support legacy frameworks detecting >>>> that they were in an incompatible hosting environment. However, IIUC, >>>> there is no such existing framework that could meaningfully use the >>> flag >>>> you're proposing, that has any real chance of being portable to >>>> different WSGI environments. >>> This is true, but I continue to think that it is worth adding that flag. >>> Asynchronous support is available in Nginx mod_wsgi, and in the future >>> someone can implement a WSGI gateway for lighttpd.
>> Right now, the definition of the flag is not sufficiently defined for my >> taste. You have only proposed that it be set to indicate that >> interleaved execution is possible -- but it is *always* possible to have >> interleaved execution in WSGI 1.0, so the only reason to add the flag to >> WSGI 2.0 would be so a server could promise NOT to interleave >> execution. And what good is that? >> > > Ok, here is more useful definition. > > If wsgi.asynchronous evaluates to true, then the WSGI application *will* > be executed into the server main process cycle and thus the application > execution *will* be interleaved (since this is the only way to support > multiple concurrent requests). Isn't the more important distinction that the application must not block? Kind of like wsgi.multithread means the application must be threadsafe. Ian From manlio_perillo at libero.it Fri Oct 5 17:34:05 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 17:34:05 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4706553A.3080603@colorstudy.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> <4706553A.3080603@colorstudy.com> Message-ID: <4706596D.4040000@libero.it> Ian Bicking ha scritto: > [...] >> Ok, here is more useful definition. 
>> >> If wsgi.asynchronous evaluates to true, then the WSGI application >> *will* be executed into the server main process cycle and thus the >> application execution *will* be interleaved (since this is the only >> way to support multiple concurrent requests). > > Isn't the more important distinction that the application must not > block? Kind of like wsgi.multithread means the application must be > threadsafe. > Right, but I assume that this is evident when I say "executed into the server main process cycle". An interesting example is an application that will read some data from a source (as an example from a video capture device) and will send the output to the web. The application can blocks when reading, but as soon as it will yield some data, the server can interleave calls to it. This means that the WSGI application can not use a "global" handle to the video capture device, or use thread specific data. It must be able to store the device handle on a per request "context". This is the reason why I'm writing a spec for a `wsgi.context_id` extension, that will return a request specific identifier (in the same way as it is done by os.getpid or thread.get_ident) > Ian > Regards Manlio Perillo From manlio_perillo at libero.it Fri Oct 5 17:47:26 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 17:47:26 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4706596D.4040000@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> 
<470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> <4706553A.3080603@colorstudy.com> <4706596D.4040000@libero.it> Message-ID: <47065C8E.1010703@libero.it> Manlio Perillo ha scritto: > Ian Bicking ha scritto: >> [...] >>> Ok, here is more useful definition. >>> >>> If wsgi.asynchronous evaluates to true, then the WSGI application >>> *will* be executed into the server main process cycle and thus the >>> application execution *will* be interleaved (since this is the only >>> way to support multiple concurrent requests). >> Isn't the more important distinction that the application must not >> block? Kind of like wsgi.multithread means the application must be >> threadsafe. >> > > Right, but I assume that this is evident when I say "executed into the > server main process cycle". > > An interesting example is an application that will read some data from a > source (as an example from a video capture device) and will send the > output to the web. > Forget what I have written. A request specific context is already supplied by the wsgi application callable context. I'm tring to understand if an explicit request context is necessary for some other kind of applications. Regards Manlio Perillo From robinbryce at gmail.com Fri Oct 5 18:34:23 2007 From: robinbryce at gmail.com (Robin Bryce) Date: Fri, 5 Oct 2007 17:34:23 +0100 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071004161810.060183A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> Message-ID: On 04/10/2007, Phillip J. Eby wrote: > At 05:54 PM 10/4/2007 +0200, Manlio Perillo wrote: > >Phillip J. Eby ha scritto: > > > At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote: > > >> Phillip J. 
Eby wrote:
> > >> > It's always the case that a WSGI application can be paused after it
> > >> > yields data, even in WSGI 1.0.
> > >>
> > >> I was not aware of this.
> > >> It may cause some problems to a unaware WSGI application the fact that a
> > >> new "handler" is started "interleaved" with the previous ones.
> > >
> > > It may... but the only applications that should be yielding anything are
> > > ones that are sending large files, doing server push, or explicitly
> > > *desire* to be interleaved in such fashion.
> > >
> >
> >But they have no way to know if the server supports this,
>
> If it's a WSGI-compliant server, it supports this by
> definition. It's just that synchronous servers don't pause before
> requesting the next iteration.
>
>
> > and existing
> >WSGI implementations does not interleave the iteration, as far as I know.
>
> Nothing in the spec stops them from doing so - indeed, they're
> *encouraged* to do so:
>
> http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries
>
> """This requirement ensures that asynchronous applications and
> servers can conspire to reduce the number of threads that are
> required to run a given number of application instances simultaneously."""
>
> Notice that the only way this sentence works is if you are
> interleaving applications.
>
> That being said, the PEP really needs an explicit discussion of the
> execution model.

Is there a means to support a non-blocking read on wsgi.input?

E.g.,

    for data in environ['wsgi.input']:
        if not data:
            if nothing_else_to_do:
                # wake me when there is more data
                yield environ['wsgi.input']
            else:
                do_something()
                # wake me next time around, irrespective of
                # whether there is data
                yield ''

From pje at telecommunity.com Fri Oct 5 19:02:04 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri, 05 Oct 2007 13:02:04 -0400 Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush) In-Reply-To: <470650E7.4050809@dealmeida.net> References: <4703ADE1.5040507@libero.it> <20071003192817.3014C3A407A@sparrow.telecommunity.com> <4703F2E1.9050402@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <470613B0.8000101@libero.it> <20071005143100.07AD63A407C@sparrow.telecommunity.com> <470650E7.4050809@dealmeida.net> Message-ID: <20071005165925.0E21A3A407B@sparrow.telecommunity.com> At 11:57 AM 10/5/2007 -0300, Rob De Almeida wrote: >Phillip J. Eby wrote: >>I mean that you can't write a WSGI 2.0 application using a single >>generator function, because it has to return a tuple, not an >>iterator. This will discourage people from thinking "yield" is a >>good way to build up their output, instead of using a StringIO or >>''.join() on a list of strings. > >Could you explain why using 'yield' is not recommended? Just >curious, because I use it all the time. Because you're slowing down your application's throughput. The only reasons to yield multiple strings is when you are either: 1. Sending a file that's larger than you want to load into memory, or 2. You're doing "server push" and need to do some processing between payloads. From pje at telecommunity.com Fri Oct 5 19:31:18 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 05 Oct 2007 13:31:18 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470654BA.9050100@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> Message-ID: <20071005172839.7C4853A407B@sparrow.telecommunity.com> At 05:14 PM 10/5/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote: > >> Phillip J. Eby ha scritto: > >> > In other words, those flags were to support legacy frameworks detecting > >> > that they were in an incompatible hosting environment. However, IIUC, > >> > there is no such existing framework that could meaningfully use the > >> flag > >> > you're proposing, that has any real chance of being portable to > >> > different WSGI environments. > >> > >> This is true, but I continue to think that it is worth adding that flag. > >> Asynchronous support is available in Nginx mod_wsgi, and in the future > >> someone can implement a WSGI gateway for lighttpd. > > > > Right now, the definition of the flag is not sufficiently defined for my > > taste. You have only proposed that it be set to indicate that > > interleaved execution is possible -- but it is *always* possible to have > > interleaved execution in WSGI 1.0, so the only reason to add the flag to > > WSGI 2.0 would be so a server could promise NOT to interleave > > execution. And what good is that? 
> > > >Ok, here is more useful definition. > >If wsgi.asynchronous evaluates to true, then the WSGI application *will* >be executed into the server main process cycle and thus the application >execution *will* be interleaved (since this is the only way to support >multiple concurrent requests). I still don't see how this is *useful*. What will the application *do* with this information? From pje at telecommunity.com Fri Oct 5 19:35:33 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri, 05 Oct 2007 13:35:33 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> Message-ID: <20071005173253.86D293A407B@sparrow.telecommunity.com> At 05:34 PM 10/5/2007 +0100, Robin Bryce wrote: >Is there a means to support a non blocking read on wsgi.input ? No. Some ideas have been proposed, but nobody has shown a practical scenario where it is useful. For it to be useful, you would have to have an asynchronous server that is interleaving in its main thread, and therefore requires applications to be non-blocking. However, to run "normal" WSGI applications, such a server has to *allow* them to block, so it is going to have to run them in a different thread anyway. This is why the whole idea of creating an async *variant* of WSGI is moot - an async WSGI protocol is essentially 100% incompatible with synchronous WSGI, since any async WSGI components can't use synchronous WSGI components, unless they spawn another thread or process. The whole thing is an exercise in futility, until/unless there is more than one such server and application, at which point they could get together and create AWSGI or WSGI-A or something of that sort. 
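A minimal sketch of the point made in the message above: a server whose main loop must not block still has to hand an ordinary, blocking WSGI application to another thread. This is an illustration under stated assumptions, not any real server's code. `blocking_app`, `call_app` and the thread-pool size are hypothetical names and values chosen for the example; only the `(environ, start_response)` calling convention comes from PEP 333.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_app(environ, start_response):
    # Hypothetical application: the sleep stands in for a blocking
    # read on wsgi.input or a slow database call.
    time.sleep(0.05)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done']


def call_app(app, environ):
    """Run one WSGI call to completion and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers
        # The legacy write() callable is ignored in this sketch.
        return lambda data: None

    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # PEP 333 asks the server to call close() on the iterable if present.
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], body


# The "main loop" only submits the call; the blocking work happens on a
# worker thread, so the loop itself would stay free to serve other sockets.
pool = ThreadPoolExecutor(max_workers=4)
future = pool.submit(call_app, blocking_app, {})
status, headers, body = future.result()
pool.shutdown()
```

Here `future.result()` is called straight away only to keep the example short; an event-driven server would presumably attach a completion callback instead of waiting.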
From manlio_perillo at libero.it Fri Oct 5 19:38:00 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 05 Oct 2007 19:38:00 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005172839.7C4853A407B@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> <20071005172839.7C4853A407B@sparrow.telecommunity.com> Message-ID: <47067678.1040809@libero.it> Phillip J. Eby ha scritto: > At 05:14 PM 10/5/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote: >> >> Phillip J. Eby ha scritto: >> >> > In other words, those flags were to support legacy frameworks >> detecting >> >> > that they were in an incompatible hosting environment. However, >> IIUC, >> >> > there is no such existing framework that could meaningfully use the >> >> flag >> >> > you're proposing, that has any real chance of being portable to >> >> > different WSGI environments. >> >> >> >> This is true, but I continue to think that it is worth adding that >> flag. >> >> Asynchronous support is available in Nginx mod_wsgi, and in the future >> >> someone can implement a WSGI gateway for lighttpd. >> > >> > Right now, the definition of the flag is not sufficiently defined >> for my >> > taste. 
You have only proposed that it be set to indicate that >> > interleaved execution is possible -- but it is *always* possible to >> have >> > interleaved execution in WSGI 1.0, so the only reason to add the >> flag to >> > WSGI 2.0 would be so a server could promise NOT to interleave >> > execution. And what good is that? >> > >> >> Ok, here is more useful definition. >> >> If wsgi.asynchronous evaluates to true, then the WSGI application *will* >> be executed into the server main process cycle and thus the application >> execution *will* be interleaved (since this is the only way to support >> multiple concurrent requests). > > I still don't see how this is *useful*. What will the application *do* > with this information? > As an example (not tested) SQLAlchemy can implements a RequestSingletonPool, that is the equivalend of ThreadSingetonPool. In this case the pool will checkout a connection using the environ['wsgi.request_id'] identifier (unique for each request), instead of thread.get_ident. So, a WSGI application *needs* to know if the application is multithreaded or asynchronous to select the right connection pool. Regards Manlio Perillo From pje at telecommunity.com Fri Oct 5 21:11:56 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 05 Oct 2007 15:11:56 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47067678.1040809@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <470516B0.9010605@libero.it> <20071004165333.4D5353A407A@sparrow.telecommunity.com> <47051BCA.7090709@libero.it> <20071004174513.A4F0F3A407A@sparrow.telecommunity.com> <470528A9.3050108@libero.it> <20071005002423.320413A407A@sparrow.telecommunity.com> <470614CA.8000300@libero.it> <20071005143356.B8B7D3A407C@sparrow.telecommunity.com> <470654BA.9050100@libero.it> <20071005172839.7C4853A407B@sparrow.telecommunity.com> <47067678.1040809@libero.it> Message-ID: <20071005190917.7758D3A407B@sparrow.telecommunity.com> At 07:38 PM 10/5/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > I still don't see how this is *useful*. What will the application *do* > > with this information? > >As an example (not tested) SQLAlchemy can implements a >RequestSingletonPool, that is the equivalend of ThreadSingetonPool. > >In this case the pool will checkout a connection using the >environ['wsgi.request_id'] identifier (unique for each request), instead >of thread.get_ident. I still don't see the point of this. Why can't the application just keep a reference to the connection object it's using? That doesn't require any new code and already works now in every existing WSGI server. Why write code that is more complex to do something that you don't even need? Not only that, but the ONLY reasons for the application to yield are if it's sending something too big to fit in memory, or it's doing server push (or otherwise wants to stream the content). Such applications are extremely rare to begin with, or should be. 
If you are seeing applications that yield multiple strings and *aren't* one of these use cases, it indicates that the application author doesn't understand the WSGI spec, and doesn't realize they're slowing down their application by doing it. Yields are for streaming, and most web applications shouldn't be streaming. That means that 99.9% of all WSGI applications should never produce more than one output string -- which means that the "interleaving" question never even comes up. The applications that produce multiple output strings have to deal with the complexity of the situation anyway. From robinbryce at gmail.com Fri Oct 5 23:13:29 2007 From: robinbryce at gmail.com (Robin Bryce) Date: Fri, 5 Oct 2007 22:13:29 +0100 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005173253.86D293A407B@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> Message-ID: On 05/10/2007, Phillip J. Eby wrote: > At 05:34 PM 10/5/2007 +0100, Robin Bryce wrote: > >Is there a means to support a non blocking read on wsgi.input ? > > No. Some ideas have been proposed, but nobody has shown a practical > scenario where it is useful. > > For it to be useful, you would have to have an asynchronous server > that is interleaving in its main thread, and therefore requires > applications to be non-blocking. It requires asynchronous parts of the wsgi stack to co-operate with the server in order to deal with requests which end up being processed (or part processed) by synchronous components. 
A requirement to be able to process *some* requests synchronously - for a particular connection - should not prevent a server from supporting both async & synchronous models of processing > > However, to run "normal" WSGI applications, such a server has to > *allow* them to block, so it is going to have to run them in a > different thread anyway. Yes. > > This is why the whole idea of creating an async *variant* of WSGI is > moot - an async WSGI protocol is essentially 100% incompatible with > synchronous WSGI, since any async WSGI components can't use > synchronous WSGI components, unless they spawn another thread or process. This does not have to be the case. All synchronous wsgi components require the presence of wsgi.input which behaves as specified in pep-333. No wsgi async *aware* components exist, because pep-333 does not allow it. async *aware* components, like async servers in general, should be willing to accept greater complexity in the interface. With some additional complexity, exposed WSG 2.0 async aware components, I can't see any reason wsgi 2.0 can't allow for both - provided that async aware components always live at the top of the wsgi stack. Here is my stab at it: Let the async server provide environ['wsgi.async_input'] Some to be agreed non-blocking, iterative, interface to the *content* of a single request. It is legal for an async aware component to call environ['wsgi.async_input'].next(), at most, once for each value of response data it yields. Note that it need not call async_input.next() every time it is resumed. And substitute wsgi.async_input for wsgi.input in my previous message. environ['wsgi.input_factory'] A callable. Which MUST be called by an application which wishes to switch to synchronous processing for the remainder of the current requests content. The application must yield the return value of this factory as the next value it produces. 
The next time the application is resumed, the environ will contain a pep-333 compatible wsgi.input environ key. Applications which call this function MUST accommodate the possibility that they will be resumed in a different thread from that in which they called wsgi.input_factory.

Let the server define its own interface for thread / process interaction and expose it through server-specific environ keys. Require, as MUST, that the server implementation provides a middleware component which uses that server-specific api to support wsgi.input_factory. Perhaps *disallow* all but the topmost wsgi application in the stack from interacting with the server-specific threading api.

Perhaps define a wsgi.resume_with_result callable such that it can be leveraged *only* by async aware wsgi components - it lets async aware components delegate a callable for execution in a different thread.

With respect to wsgi.input it helps (me at any rate) to remember that even an async server can not possibly proceed with the next request until it knows it has read (up to or past) the end of the current request's content boundary. Since WSGI is defined at the per-request level, there is no need for the async/sync middleware bridge to 'push back' data. The server sees both Connection: close, Content-Length etc, and so can arrange for wsgi.async_input to respect the boundaries.

I believe this would be enough to support an asynchronous implementation of Comet. http://en.wikipedia.org/wiki/Comet_%28programming%29 and http://rphd.sourceforge.net/

This sketch is not completely shot from the hip. I have an async server implementation (hey who hasn't these days) which I used mainly as a means to explore *how* a server could possibly interact with an async aware wsgi stack.
See http://svn.wiretooth.com/svn/open/asycamore/trunk/asycamore/ and in particular in httpconnectioncontext.py WSGIServiceContext.start_request HTTPServiceContext.continue_reading It does not implement the above sketch but *could* easily do so. > The whole thing is an exercise in futility, until/unless there is > more than one such server and application, at which point they could > get together and create AWSGI or WSGI-A or something of that sort. > > That's to much chicken/egg for my tastes. All you are really saying is that the CGI model covers the majority of 'common' use cases. I don't know of anyone who would disagree with this. But as things stand all wsgi-ish implementations which aim to support async/sync are consigned to the dust bin of 'non conformant'. This acts as a strong disincentive to experiment and innovate. If, for clear technical reasons, nothing can be done so support mixing async aware and synchronous applications in WSGI 2.0, then so it goes. If it can't be done without imposing significant complexity on applications that are perfectly happy with the highly successful wsgi 1.0 model, then fair enough - WSGI-A is a non starter. Or are you against introducing features to support async servers and composition of mixed async/sync stacks on principle ? If a collective decision is made that WSGI will only ever support half async (blocking read, asynchronous response) then both the pep and the new spec should state this very clearly indeed. Best, Robin From pje at telecommunity.com Sat Oct 6 00:07:35 2007 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri, 05 Oct 2007 18:07:35 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> Message-ID: <20071005220455.1ABB23A407B@sparrow.telecommunity.com> At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote: >That's to much chicken/egg for my tastes. All you are really saying is >that the CGI model covers the majority of 'common' use cases. I don't >know of anyone who would disagree with this. But as things stand all >wsgi-ish implementations which aim to support async/sync are consigned >to the dust bin of 'non conformant'. This acts as a strong >disincentive to experiment and innovate. > >If, for clear technical reasons, nothing can be done so support mixing >async aware and synchronous applications in WSGI 2.0, then so it goes. > >If it can't be done without imposing significant complexity on >applications that are perfectly happy with the highly successful wsgi >1.0 model, then fair enough - WSGI-A is a non starter. > >Or are you against introducing features to support async servers and >composition of mixed async/sync stacks on principle ? Not in *principle*, only in practice. :) If you read the archives of a few years back, I was rather enthusiastic until I realized that there really wasn't any way to make it of practical benefit. See, in order for a server to take advantage of an application's "asynchronous" nature, the server has to *know* the application won't "block". That is, the app has to *promise* not to block. (Because without this promise, the server is forced to run the app in a separate thread or process, so as not to block the server.) 
But in order for the app to make this promise, it can only use components that either make the same promise, unless it runs *them* in other threads or processes... which means giving up on easily composing applications from multiple WSGI components. So far, discussion on this matter has hinged on the claim that it's *possible* to make such mixed stacks, and I don't disagree. What nobody has shown is that it's 1. practical, and 2. produces some actual benefit, compared to the synchronous model now in use. As a practical matter, the vast majority of Python web applications and frameworks are synchronous by nature, and those that aren't are already tied to a specific async API. If we were going to try to implement an asynchronous WSGI, what we would *really* need to do is discard the app_iter and make write() the standard way of sending the body. This would let us implement a CPS (continuation-passing style) API. We would also have to change the input stream so that instead of reading from it, we instead passed it functions to be called when input was available, and so on. We would also need a way to tell write() that we were finished writing, and some way to manage connection timeouts. Unfortunately, this programming style is verbose and more difficult to learn for people versed in less "twisted" ways of programming. To write middleware in this style, you also need to write deeply nested functions. And synchronous servers would need to figure out what to do when an application returns without having called start_response() yet or figured out how to close the stream. Anyway, my point here is that I see how we could either cater to synchronous apps or async apps in a given API. But throwing a half-baked async API on top of a synchronous one is just making a mess and helping no-one. 
To sketch a WSGI-A application:

    def wsgi_a_app(environ, start_response):
        write = start_response('200 Cool', [('content-type','text/plain')])
        write('Hello world!')
        write(None)   # close

And a WSGI 1->WSGI A converter:

    class ReadCallbackWrapper:
        def __init__(self, stream):
            self.stream = stream
        def on_read(self, size, callback):
            # synchronous stream: the data is available right away
            callback(self.stream.read(size))

    def wsgi_1_app(environ, start_response):
        running = [1]   # emptied once the app writes None
        def sr(*args):
            write = start_response(*args)
            def w(arg):
                if running:
                    if arg is None:
                        running.pop()
                    else:
                        write(arg)
                else:
                    raise RuntimeError("Already closed!")
            return w
        environ['wsgi.input'] = ReadCallbackWrapper(environ['wsgi.input'])
        wsgi_a_app(environ, sr)
        while running:
            pass   # really should have a timeout check here
        return []

This highlights the essential difference between a sync and async API: the sync API either finishes right away or returns something the server calls until it's exhausted. An async API offers no guarantee that anything has been done when the app is called. Anything could happen at any time later.

My gut feel is that it's harder to write middleware for a WSGI-A style of API, because you have to do at least doubly nested functions if you're dealing with the output at all (as this example shows). And if we mix modes, then we have this sort of messy back-and-forth adaptation in between.

And as best I can tell, the proposal for a mixed-mode API that you gave would actually make it even *harder* than this to write WSGI middleware, as there would be similar boundary issues for the input stream.
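A hedged variation on the converter sketched above, showing why the adapter has to wait: here the callback-style application returns immediately and only writes later, from a timer thread. The busy loop is swapped for a `threading.Event` with a timeout, which is my substitution and not part of the original sketch. `deferred_app`, `adapt`, `fake_start_response` and the timeout values are hypothetical, and calling a server's `write()` from another thread is assumed safe only for this toy harness.

```python
import threading


def deferred_app(environ, start_response):
    """Callback-style app: returns at once, writes from a timer thread later."""
    write = start_response('200 OK', [('Content-Type', 'text/plain')])

    def finish():
        write(b'Hello world!')
        write(None)  # None means "close", as in the sketch above

    threading.Timer(0.05, finish).start()


def adapt(async_app, timeout=5.0):
    """Wrap a callback-style app so a synchronous WSGI 1.0 server can call it."""
    def wsgi_1_app(environ, start_response):
        done = threading.Event()

        def sr(status, headers, exc_info=None):
            write = start_response(status, headers)

            def w(data):
                if done.is_set():
                    raise RuntimeError("Already closed!")
                if data is None:
                    done.set()
                else:
                    write(data)
            return w

        async_app(environ, sr)
        # Nothing may have been written yet; block until the app closes
        # the response or the timeout expires.
        if not done.wait(timeout):
            raise RuntimeError("application did not close the response in time")
        return []
    return wsgi_1_app


# Minimal stand-in for a synchronous server: record the status and
# every chunk passed to write().
chunks = []


def fake_start_response(status, headers, exc_info=None):
    chunks.append(status)
    return chunks.append


result = adapt(deferred_app)({}, fake_start_response)
```

The waiting thread is exactly the cost discussed in the thread: the synchronous side gains nothing from the application being callback-driven.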
From renesd at gmail.com Sat Oct 6 05:07:08 2007 From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=) Date: Sat, 6 Oct 2007 13:07:08 +1000 Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush) In-Reply-To: <20071005165925.0E21A3A407B@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <470613B0.8000101@libero.it> <20071005143100.07AD63A407C@sparrow.telecommunity.com> <470650E7.4050809@dealmeida.net> <20071005165925.0E21A3A407B@sparrow.telecommunity.com> Message-ID: <64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com> I think 'streaming' is good for speeding up web pages when processing takes a while. I'll explain why... Say your page takes 0.2 seconds to process. If you wait until 0.2 seconds is up, then the first bytes that will come to the browser will arrive in at least 0.2 seconds. Whereas if you send data as soon as its ready, then the user will be able to see some of that data more quickly - and possibly make more requests sooner. However if your application can not send data until it is all ready anyway - which is the way with most python templating languages - then you might as well send it all in one go. Sending it all in one go is faster, unless you can send data as a stream. Sending the header of a html page right away is often very quick for dynamic pages. Since often that part is static - and it contains links to other files - like css, js, and image files. So yielding the header part, then doing your database connection, and page construction which takes longer will almost always be faster for the user - than waiting for the entire page to be ready. On 10/6/07, Phillip J. 
Eby wrote:
> At 11:57 AM 10/5/2007 -0300, Rob De Almeida wrote:
> >Phillip J. Eby wrote:
> >>I mean that you can't write a WSGI 2.0 application using a single
> >>generator function, because it has to return a tuple, not an
> >>iterator. This will discourage people from thinking "yield" is a
> >>good way to build up their output, instead of using a StringIO or
> >>''.join() on a list of strings.
> >
> >Could you explain why using 'yield' is not recommended? Just
> >curious, because I use it all the time.
>
> Because you're slowing down your application's throughput. The only
> reasons to yield multiple strings is when you are either:
>
> 1. Sending a file that's larger than you want to load into memory, or
>
> 2. You're doing "server push" and need to do some processing between payloads.
>
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com
>

From pje at telecommunity.com Sat Oct 6 08:03:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 06 Oct 2007 02:03:02 -0400
Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush)
In-Reply-To: <64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com>
References: <4703ADE1.5040507@libero.it> <20071003230812.7A7F63A407A@sparrow.telecommunity.com> <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com> <20071004114441.C7B103A407A@sparrow.telecommunity.com> <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com> <20071004130818.BFCE83A407A@sparrow.telecommunity.com> <470613B0.8000101@libero.it> <20071005143100.07AD63A407C@sparrow.telecommunity.com> <470650E7.4050809@dealmeida.net> <20071005165925.0E21A3A407B@sparrow.telecommunity.com> <64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com>
Message-ID: <20071006060023.CEBB53A407B@sparrow.telecommunity.com>

At 01:07 PM 10/6/2007 +1000, René
Dudfield wrote: >I think 'streaming' is good for speeding up web pages when processing >takes a while. > >I'll explain why... > >Say your page takes 0.2 seconds to process. > >If you wait until 0.2 seconds is up, then the first bytes that will >come to the browser will arrive in at least 0.2 seconds. Whereas if >you send data as soon as its ready, then the user will be able to see >some of that data more quickly - and possibly make more requests >sooner. It's faster for the user, but not necessarily for the server. The server will do more system calls, and the CPU will do more context switches. So, if you're going to stream for purposes of responsiveness, you're going to be trading off against overall server throughput. Nonetheless, the pages where you even have the choice of streaming are infrequent. Most of the examples I see of people doing streaming are completely worthless, because there isn't any non-trivial computation taking place between the yields. From manlio_perillo at libero.it Sat Oct 6 11:04:23 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sat, 06 Oct 2007 11:04:23 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005220455.1ABB23A407B@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> Message-ID: <47074F97.8040604@libero.it> Phillip J. Eby ha scritto: > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote: >> That's to much chicken/egg for my tastes. All you are really saying is >> that the CGI model covers the majority of 'common' use cases. I don't >> know of anyone who would disagree with this. 
But as things stand all >> wsgi-ish implementations which aim to support async/sync are consigned >> to the dust bin of 'non conformant'. This acts as a strong >> disincentive to experiment and innovate. >> >> If, for clear technical reasons, nothing can be done to support mixing >> async aware and synchronous applications in WSGI 2.0, then so it goes. >> I don't see the reason to mix async and sync applications, in the same way that it is not possible to mix a thread-unsafe application with a threaded server. WSGI should just *allow* asynchronous applications and middleware to do their job. As an example, the WSGI write callable cannot be implemented in a conforming way in Nginx. However, if we can allow the write callable to raise an EAGAIN exception when the buffer cannot be written to the socket, with the requirement that the WSGI application, in this case, MUST return control to the server (for example, by yielding an empty string), then the write callable can be implemented. > [...] > > If we were going to try to implement an asynchronous WSGI, what we would > *really* need to do is discard the app_iter and make write() the > standard way of sending the body. This would let us implement a CPS > (continuation-passing style) API. But isn't this possible just using a generator? > We would also have to change the > input stream so that instead of reading from it, we instead passed it > functions to be called when input was available, Another possible solution is that reading from input is allowed to raise an EAGAIN exception, like in the previous example. > [...] 
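A minimal sketch of the EAGAIN idea described above, under loud assumptions: `ToyServer`, its `capacity` argument, and `serve` are invented for illustration and are not part of WSGI or of nginx mod_wsgi. The write callable raises an EAGAIN error instead of blocking when its buffer is full, and the application yields an empty string to hand control back so the server can flush before resuming it.

```python
import errno


class ToyServer:
    """Hypothetical stand-in for an event-driven server; not a real API."""

    def __init__(self, capacity):
        self.capacity = capacity  # bytes the pretend socket buffer can hold
        self.buffer = b""
        self.sent = b""

    def write(self, data):
        # Refuse the write instead of blocking when the buffer is full.
        if len(self.buffer) + len(data) > self.capacity:
            raise OSError(errno.EAGAIN, "socket buffer full")
        self.buffer += data

    def flush(self):
        # Pretend the socket became writable and the buffer drained.
        self.sent += self.buffer
        self.buffer = b""

    def serve(self, app_iter):
        for chunk in app_iter:
            # Every yield returns control here; flush, then resume the app.
            self.flush()
            self.sent += chunk  # an empty string adds nothing
        self.flush()


def app(write, parts):
    # Assumes each part fits within the server's capacity on its own.
    for part in parts:
        while True:
            try:
                write(part)
                break
            except OSError as exc:
                if exc.errno != errno.EAGAIN:
                    raise
                yield b""  # give control back to the server, then retry
```

With a capacity of 4 bytes, the second and third writes each hit EAGAIN once, and the data still arrives in order. Whether a retry loop like this is acceptable inside every application is exactly what the thread is debating.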
Regards Manlio Perillo From robinbryce at gmail.com Sat Oct 6 14:23:49 2007 From: robinbryce at gmail.com (Robin Bryce) Date: Sat, 6 Oct 2007 13:23:49 +0100 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47074F97.8040604@libero.it> References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> Message-ID: On 06/10/2007, Manlio Perillo wrote: > Phillip J. Eby wrote: > > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote: > >> That's too much chicken/egg for my tastes. All you are really saying is > >> that the CGI model covers the majority of 'common' use cases. I don't > >> know of anyone who would disagree with this. But as things stand all > >> wsgi-ish implementations which aim to support async/sync are consigned > >> to the dust bin of 'non conformant'. This acts as a strong > >> disincentive to experiment and innovate. > >> > >> If, for clear technical reasons, nothing can be done to support mixing > >> async aware and synchronous applications in WSGI 2.0, then so it goes. > >> > > I don't see the reason to mix async and sync applications, in the same > way that it is not possible to mix a thread-unsafe application with a > threaded server. > > WSGI should just *allow* asynchronous applications and middleware to do > their job. > > As an example, the WSGI write callable cannot be implemented in a > conforming way in Nginx. > > However, if we can allow the write callable to raise an EAGAIN exception > when the buffer cannot be written to the socket, with the requirement > that the WSGI application, in this case, MUST return control to the > server (for example, by yielding an empty string), then the write callable > can be implemented. > > > [...] 
> > > > If we were going to try to implement an asynchronous WSGI, what we would > > *really* need to do is discard the app_iter and make write() the > > standard way of sending the body. This would let us implement a CPS > > (continuation-passing style) API. > > But isn't this possible just using a generator? > > > > We would also have to change the > > input stream so that instead of reading from it, we instead passed it > > functions to be called when input was available, > > Another possible solution is that reading from input is allowed to raise > an EAGAIN exception, like in the previous example. > > > [...] > > > > Regards Manlio Perillo > From robinbryce at gmail.com Sat Oct 6 14:34:10 2007 From: robinbryce at gmail.com (Robin Bryce) Date: Sat, 6 Oct 2007 13:34:10 +0100 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47074F97.8040604@libero.it> References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> Message-ID: Ignore last, over sensitive laptop touch pad :) On 06/10/2007, Manlio Perillo wrote: > Phillip J. Eby wrote: > > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote: > >> That's too much chicken/egg for my tastes. All you are really saying is > >> that the CGI model covers the majority of 'common' use cases. I don't > >> know of anyone who would disagree with this. But as things stand all > >> wsgi-ish implementations which aim to support async/sync are consigned > >> to the dust bin of 'non conformant'. 
This acts as a strong > >> disincentive to experiment and innovate. > >> > >> If, for clear technical reasons, nothing can be done so support mixing > >> async aware and synchronous applications in WSGI 2.0, then so it goes. > >> > > I don't see the reason to mix async and sync applications, in the same > way that it is not possible to mix a thread unsafe application with a > threaded server. > There are plenty of stateless synchronous wsgi components out there that I would like the option of serving as is. As the person choosing the components in my wsgi stack I'm perfectly capable of deciding whether such a synchronous app is safe in the context of an asynch server. From robinbryce at gmail.com Sat Oct 6 16:33:01 2007 From: robinbryce at gmail.com (Robin Bryce) Date: Sat, 6 Oct 2007 15:33:01 +0100 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071005220455.1ABB23A407B@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> Message-ID: On 05/10/2007, Phillip J. Eby wrote: > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote: > >That's to much chicken/egg for my tastes. All you are really saying is > >that the CGI model covers the majority of 'common' use cases. I don't > >know of anyone who would disagree with this. But as things stand all > >wsgi-ish implementations which aim to support async/sync are consigned > >to the dust bin of 'non conformant'. This acts as a strong > >disincentive to experiment and innovate. > > > >If, for clear technical reasons, nothing can be done so support mixing > >async aware and synchronous applications in WSGI 2.0, then so it goes. 
> > > >If it can't be done without imposing significant complexity on > >applications that are perfectly happy with the highly successful wsgi > >1.0 model, then fair enough - WSGI-A is a non starter. > > > >Or are you against introducing features to support async servers and > >composition of mixed async/sync stacks on principle? > > Not in *principle*, only in practice. :) If you read the archives > of a few years back, I was rather enthusiastic until I realized that > there really wasn't any way to make it of practical benefit. I have tried to follow the history of "we want more async support in wsgi" but I don't think I've kept up with you on this. > See, in order for a server to take advantage of an application's > "asynchronous" nature, the server has to *know* the application won't > "block". That is, the app has to *promise* not to block. (Because > without this promise, the server is forced to run the app in a > separate thread or process, so as not to block the server.) > > But in order for the app to make this promise, it can only use > components that make the same promise, unless it runs *them* > in other threads or processes... which means giving up on easily > composing applications from multiple WSGI components. > Which is why I drew a distinction between async *aware* components and others and advocated a composition model in which the composer of the wsgi stack must guarantee that async aware components live at the top. That is, a synchronous component cannot sensibly be provided with a means to drive an async aware component. This places the burden of the composition problem firmly on the server and those components written specifically to be async aware, and yet allows those components to take advantage of synchronous components from time to time. > So far, discussion on this matter has hinged on the claim that it's > *possible* to make such mixed stacks, and I don't disagree. What > nobody has shown is that it's 1. practical, and 2. 
produces some > actual benefit, compared to the synchronous model now in use. As a > practical matter, the vast majority of Python web applications and > frameworks are synchronous by nature, and those that aren't are > already tied to a specific async API. > > If we were going to try to implement an asynchronous WSGI, what we > would *really* need to do is discard the app_iter and make write() > the standard way of sending the body. This would let us implement a > CPS (continuation-passing style) API. We would also have to change > the input stream so that instead of reading from it, we instead > passed it functions to be called when input was available, and so > on. We would also need a way to tell write() that we were finished > writing, and some way to manage connection timeouts. > I don't understand why you think this is necessary. I especially don't like the thought that there is an argument that useful and performant wsgi-a support is impossible without requiring use of CPS. I *like* the app_iter model and believe it is perfectly workable for an async component - provided that: 1. There is a non-blocking variant of wsgi.input, say wsgi.async_input. 2. There is a means for an async aware component to signal the server that it should process the remainder of the current request in a synchronous manner. 3. The server and async aware components are allowed to use an extended set of yield values which provide the co-operative communication necessary for performant async components. 3a. A yield that means "don't resume me until there is more data available on wsgi.async_input". 3b. A yield that means "I ran out of data reading from wsgi.async_input but please continue resuming me anyway as I have useful work to do". And a yield of the empty string means the same as it does for WSGI 1.0. 3a and 3b allow the component to pass "up" the information that the server needs to determine that the underlying socket has encountered EAGAIN on recv. 
The async aware component *knows* what its last yield was and so can reliably interpret a resume after 3a as meaning "more data available". After 3b it does no harm to the performance of the server if the component speculatively attempts to read from wsgi.async_input. Absence of wsgi.input in the environ until the 'switch' takes place will cause any accidentally included synchronous application to break if it attempts to perform a blocking read on the input. An async server should have no problem with synchronous applications that *don't* use wsgi.input, yes? > Unfortunately, this programming style is verbose and more difficult > to learn for people versed in less "twisted" ways of programming. To > write middleware in this style, you also need to write deeply nested > functions. And synchronous servers would need to figure out what to > do when an application returns without having called start_response() > yet or figured out how to close the stream. Agreed. I have always assumed that async aware components would be incompatible with synchronous servers. > Anyway, my point here is that I see how we could either cater to > synchronous apps or async apps in a given API. But throwing a > half-baked async API on top of a synchronous one is just making a > mess and helping no-one. ... > My gut feel is that it's harder to write middleware for WSGI-A style > of API, because you have to do at least doubly nested functions if > you're dealing with the output at all (as this example shows). > > And if we mix modes, then we have this sort of messy back-and-forth > adaptation in between. And as best I can tell, the proposal for a > mixed-mode API that you gave would actually make it even *harder* > than this to write WSGI middleware, as there would be similar > boundary issues for the input stream. No, I'm definitely not advocating mixed modes. 
I'm saying that I want a means to allow an async aware component to switch the current request to synchronous processing for the remainder of the request. And I explicitly _don't_ think it's sensible to attempt to support synchronous -> asynchronous. The only reason for supporting the switch at all is to enable async aware components to leverage synchronous components "from time to time". Async aware components would be harder to write than synchronous ones, but synchronous components would remain as they are. And, by avoiding CPS, asynchronous servers could freely leverage wsgi 1.0 style components which don't consume wsgi.input. Perhaps I should attempt an asyncwsgiref, which by my definition should be able to host apps in wsgiref but not the converse. More to say but out of time for today. Cheers, Robin From pje at telecommunity.com Sat Oct 6 16:36:17 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 06 Oct 2007 10:36:17 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47074F97.8040604@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> Message-ID: <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >As an example, the WSGI write callable cannot be implemented in a >conforming way in Nginx. ...unless you use a separate thread or process. > > If we were going to try to implement an asynchronous WSGI, what we would > > *really* need to do is discard the app_iter and make write() the > > standard way of sending the body. This would let us implement a CPS > > (continuation-passing style) API. 
> >But isn't this possible just using a generator? No, because using a generator means there needs to be a separate callback to force the generator to be reiterated. Hence the complexity of adding an async API to the existing WSGI model. > > We would also have to change the > > input stream so that instead of reading from it, we instead passed it > > functions to be called when input was available, > >Another possible solution is that reading from input is allowed to raise >an EAGAIN exception, like in the previous example. Which is *way* more complex than the CPS approach. If we're going to make it *harder* to write applications, there's no point to having a WSGI 2.0, since 1.0 is already hard enough to implement. :) From pje at telecommunity.com Sat Oct 6 16:40:00 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 06 Oct 2007 10:40:00 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> Message-ID: <20071006143721.7EC1F3A407A@sparrow.telecommunity.com> At 01:34 PM 10/6/2007 +0100, Robin Bryce wrote: >There are plenty of stateless synchronous wsgi components out there >that I would like the option of serving as is. As the person choosing >the components in my wsgi stack I'm perfectly capable of deciding >whether such a synchronous app is safe in the context of an asynch >server. Only if you break encapsulation, composability, and scalability of construction by choosing to know how each and every component works. The whole idea of a component is that you shouldn't HAVE TO know what components are being used inside of it. Otherwise, it's not really a "component". 
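The earlier point in this exchange, that a paused generator needs a separate callback to get itself iterated again, can be sketched roughly as follows. Everything here is hypothetical: `PAUSE`, `ToyLoop`, and `make_body` are invented names for illustration, not part of WSGI or any server.

```python
import collections

PAUSE = object()  # hypothetical marker: "do not resume me until told to"


class ToyLoop:
    """Hypothetical event loop; names are illustrative only."""

    def __init__(self):
        self.ready = collections.deque()
        self.sent = []

    def resume(self, gen):
        # The separate callback: someone has to put the generator back
        # on the ready queue, or it is never iterated again.
        self.ready.append(gen)

    def run(self):
        while self.ready:
            gen = self.ready.popleft()
            for chunk in gen:
                if chunk is PAUSE:
                    break  # parked until resume(gen) is called
                self.sent.append(chunk)


def make_body(loop, on_complete):
    def body():
        yield b"<head>"
        # Pretend an async operation was started here; its completion
        # handler is what eventually calls loop.resume().
        on_complete.append(lambda: loop.resume(gen))
        yield PAUSE
        yield b"<body>"

    gen = body()
    return gen
```

After the first `run()` only the head has been sent and the loop is idle. Nothing further happens until the stored completion handler calls `resume`, which is the extra moving part a plain iterator interface does not provide.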
From manlio_perillo at libero.it Sat Oct 6 17:48:40 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sat, 06 Oct 2007 17:48:40 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> Message-ID: <4707AE58.3040003@libero.it> Phillip J. Eby wrote: > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >> As an example, the WSGI write callable cannot be implemented in a >> conforming way in Nginx. > > ...unless you use a separate thread or process. > I'm insisting on asynchronous support in WSGI because Nginx *does not support threads*. It has some thread support but it is *broken*. Even if the problems are solved in the future, the threading model of Nginx *will break* many existing WSGI applications, since the WSGI iteration can be resumed in different threads. Of course, a WSGI application can use threads, but I think that it *needs* a wsgi.pause_output extension, for synchronization. > [...] >> Another possible solution is that reading from input is allowed to raise >> an EAGAIN exception, like in the previous example. > > Which is *way* more complex than the CPS approach. If we're going to > make it *harder* to write applications, there's no point to having a > WSGI 2.0, since 1.0 is already hard enough to implement. :) > It is a known fact that asynchronous programming is hard. Multithreaded programming is even harder, but nobody seems to care. 
Regards Manlio Perillo From graham.dumpleton at gmail.com Sun Oct 7 00:47:36 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Sun, 7 Oct 2007 08:47:36 +1000 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4707AE58.3040003@libero.it> References: <4704222D.30208@colorstudy.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> Message-ID: <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com> On 07/10/2007, Manlio Perillo wrote: > Phillip J. Eby wrote: > > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: > >> As an example, the WSGI write callable cannot be implemented in a > >> conforming way in Nginx. > > > > ...unless you use a separate thread or process. > > > > I'm insisting on asynchronous support in WSGI because > Nginx *does not support threads*. > > It has some thread support but it is *broken*. > Even if the problems are solved in the future, the threading model of Nginx > *will break* many existing WSGI applications, since the WSGI iteration > can be resumed in different threads. > > Of course, a WSGI application can use threads, but I think that it > *needs* a wsgi.pause_output extension, for synchronization. I appreciate that you can't use the thread support in nginx, but what I don't understand is why you can't utilise the Python threading API (or even POSIX threads) at the boundary between nginx and the interface into the WSGI application, i.e., in the WSGI adapter layer, so as to emulate a synchronous style WSGI interface on top of the nginx event driven layer. In other words you hide all the complexity of any queues or other synchronisation mechanisms for communicating any data between the two within the adapter. 
This way you do not need to expose an asynchronous API to the WSGI application itself and existing WSGI code can be used as is. Graham From pje at telecommunity.com Sun Oct 7 05:42:11 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat, 06 Oct 2007 23:42:11 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4707AE58.3040003@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> Message-ID: <20071007033932.2A44F3A407B@sparrow.telecommunity.com> At 05:48 PM 10/6/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby ha scritto: > > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: > >> As an example, the WSGI write callable cannot be implemented in a > >> conforming way in Nginx. > > > > ...unless you use a separate thread or process. > > > >I'm insisting about asynchronous support in WSGI because >Nginx *does not supports threads*. Please note that this means you can't run WSGI applications in the same process, then, since WSGI applications can and do block - meaning that the server will stop serving requests. 
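A minimal sketch of the adapter-layer approach suggested earlier in this thread, in which a blocking WSGI application runs in a worker thread and hands its output to the event-driven side through a queue. This is an assumption-laden illustration: `start_request`, `poll`, `DONE`, and `blocking_app` are hypothetical names, and a real adapter would also need error reporting, request input, and a way to wake the event loop.

```python
import queue
import threading
import time

DONE = object()  # hypothetical end-of-response marker


def start_request(app, environ):
    """Run a blocking WSGI app in a worker thread; return (queue, thread)."""
    out = queue.Queue()

    def start_response(status, headers, exc_info=None):
        out.put(("start", status, headers))
        # WSGI 1.0 requires start_response to return a write callable.
        return lambda data: out.put(("data", data))

    def worker():
        result = None
        try:
            result = app(environ, start_response)
            for chunk in result:
                if chunk:
                    out.put(("data", chunk))
        finally:
            if hasattr(result, "close"):
                result.close()
            out.put(DONE)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return out, thread


def poll(out):
    """Event-loop side: collect whatever is ready without blocking."""
    items = []
    while True:
        try:
            items.append(out.get_nowait())
        except queue.Empty:
            return items


def blocking_app(environ, start_response):
    time.sleep(0.05)  # stands in for a blocking database call
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

The application code is ordinary synchronous WSGI and never sees the queue; only the adapter knows the server underneath is event driven. The cost is one thread per in-flight request, which is the trade-off being argued over in the surrounding messages.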
From foom at fuhm.net Sun Oct 7 08:45:46 2007 From: foom at fuhm.net (James Y Knight) Date: Sun, 7 Oct 2007 02:45:46 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: References: <4704222D.30208@colorstudy.com> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> Message-ID: <56E0E17C-6BC0-440E-A980-49BC880EEFBA@fuhm.net> On Oct 6, 2007, at 10:33 AM, Robin Bryce wrote: > An async > server should have no problem with synchronous applications that > *dont* use wsgi.input yes ? That's certainly not the case. One of the more popular things to do in a webapp is talk to a database. Most such accesses are done in a blocking fashion. Doing blocking database access in an asynchronous server's event loop is a pretty poor idea. I mean, sure, it'd probably "work", but your performance would be terrible... James From manlio_perillo at libero.it Sun Oct 7 12:16:06 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sun, 07 Oct 2007 12:16:06 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com> References: <4704222D.30208@colorstudy.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com> Message-ID: <4708B1E6.4090401@libero.it> Graham Dumpleton ha scritto: > On 07/10/2007, Manlio Perillo wrote: >> Phillip J. 
Eby wrote: >>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >>>> As an example, the WSGI write callable cannot be implemented in a >>>> conforming way in Nginx. >>> ...unless you use a separate thread or process. >>> >> I'm insisting on asynchronous support in WSGI because >> Nginx *does not support threads*. >> >> It has some thread support but it is *broken*. >> Even if the problems are solved in the future, the threading model of Nginx >> *will break* many existing WSGI applications, since the WSGI iteration >> can be resumed in different threads. >> >> Of course, a WSGI application can use threads, but I think that it >> *needs* a wsgi.pause_output extension, for synchronization. > > I appreciate that you can't use the thread support in nginx, but what > I don't understand is why you can't utilise the Python threading API (or > even POSIX threads) at the boundary between nginx and the interface > into the WSGI application, i.e., in the WSGI adapter layer, so as to > emulate a synchronous style WSGI interface on top of the nginx event > driven layer. This is possible, but I think that it is better to offer basic asynchronous support in WSGI, since in this way it is possible to build threading support in pure Python *and*, more importantly, this support is reusable by other implementations. > In other words you hide all the complexity of any queues > or other synchronisation mechanisms for communicating any data between > the two within the adapter. This way you do not need to expose an > asynchronous API to the WSGI application itself and existing WSGI code > can be used as is. > The Python threading support can be implemented as a "middleware", so it is transparent to the WSGI application. Not sure if it can be called "middleware", however. 
Regards Manlio Perillo From manlio_perillo at libero.it Sun Oct 7 12:17:29 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Sun, 07 Oct 2007 12:17:29 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071007033932.2A44F3A407B@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <20071007033932.2A44F3A407B@sparrow.telecommunity.com> Message-ID: <4708B239.6080008@libero.it> Phillip J. Eby ha scritto: > At 05:48 PM 10/6/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >> >> As an example, the WSGI write callable cannot be implemented in a >> >> conforming way in Nginx. >> > >> > ...unless you use a separate thread or process. >> > >> >> I'm insisting about asynchronous support in WSGI because >> Nginx *does not supports threads*. > > Please note that this means you can't run WSGI applications in the same > process, then, since WSGI applications can and do block - meaning that > the server will stop serving requests. > http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/README, in the Notes. 
Regards Manlio Perillo From graham.dumpleton at gmail.com Sun Oct 7 13:04:09 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Sun, 7 Oct 2007 21:04:09 +1000 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4708B1E6.4090401@libero.it> References: <4704222D.30208@colorstudy.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com> <4708B1E6.4090401@libero.it> Message-ID: <88e286470710070404i63e47c99wa20ad135a2e364af@mail.gmail.com> On 07/10/2007, Manlio Perillo wrote: > Graham Dumpleton ha scritto: > > On 07/10/2007, Manlio Perillo wrote: > >> Phillip J. Eby ha scritto: > >>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: > >>>> As an example, the WSGI write callable cannot be implemented in a > >>>> conforming way in Nginx. > >>> ...unless you use a separate thread or process. > >>> > >> I'm insisting about asynchronous support in WSGI because > >> Nginx *does not supports threads*. > >> > >> It has some thread support but it is *broken*. > >> Even if in future the problems are solved, the threading model of Nginx > >> *will break* many existing WSGI applications, since the WSGI iteration > >> can be resumed in different threads. > >> > >> Of course, a WSGI application can use threads, but i think that it > >> *needs* a wsgi.pause_output extension, for synchronization. > > > > I appreciate that you can't use the thread support in nginx, but what > > I don't understand is why you can't utililise Python threading API (or > > even POSIX threads) at the boundary between nginx and the interface > > into the WSGI application, ie., in the WSGI adapter layer, so as to > > emulate a synchronous style WSGI interface on top of the nginx event > > driven layer. 
> > This is possible, but I think that it is better to offer basic > asynchronous support in WSGI, since in this way it is possible to build > threading support in pure Python *and*, more importantly, this support is > reusable by other implementations. > > > In other words you hide all the complexity of any queues > > or other synchronisation mechanisms for communicating any data between > > the two within the adapter. This way you do not need to expose an > > asynchronous API to the WSGI application itself and existing WSGI code > > can be used as is. > > > > The Python threading support can be implemented as a "middleware", so it > is transparent to the WSGI application. > > Not sure if it can be called "middleware", however. If providing support for synchronous WSGI by using an adapter is how you would support that, then I think all your problems would be solved very easily by not pushing for asynchronous support to be added to WSGI itself. Instead, come up with your own independent asynchronous Python API for nginx, call it something completely different, and do not try to get it labeled as being WSGI in some way. In other words, don't call your nginx module mod_wsgi but mod_pynginx for example. Having done that, then offer as a separate package a synchronous WSGI adapter for your mod_pynginx and clearly state that although your module doesn't support WSGI directly, it does via the separate WSGI adapter. The reason you are getting so much push back here on this list is because you are trying to turn WSGI into something it isn't when there isn't a need to, as you could still provide support for the current WSGI specification as is by taking the adapter approach instead. What you would end up with is not much different to how Apache mod_python has a number of WSGI adapters available for it. 
In some respects it would probably be more attractive to people for you to provide a Python API for using nginx which better matches how nginx works and allows the most performance to be gotten out of nginx for Python applications, without binding yourself to WSGI. That way, if people choose to work with your lower level API then they could and write applications specifically for nginx in much the same way that people write applications specifically for Apache using mod_python. So, don't try and force your API to be WSGI, and at the same time don't try and force the WSGI specification to change so you can call what you are developing WSGI. Doing either is possibly only going to limit the extent to which you could develop your nginx specific Python API. You would be much better doing your API however you want, call it something different, but then provide a WSGI adapter for those want to run WSGI applications on top of it. Graham From ianb at colorstudy.com Mon Oct 8 02:37:05 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Sun, 07 Oct 2007 19:37:05 -0500 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <4707AE58.3040003@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> Message-ID: <47097BB1.502@colorstudy.com> Manlio Perillo wrote: > Phillip J. Eby ha scritto: >> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >>> As an example, the WSGI write callable cannot be implemented in a >>> conforming way in Nginx. >> ...unless you use a separate thread or process. 
>> > > I'm insisting about asynchronous support in WSGI because > Nginx *does not supports threads*. > > It has some thread support but it is *broken*. > Even if in future the problems are solved, the threading model of Nginx > *will break* many existing WSGI applications, since the WSGI iteration > can be resumed in different threads. Just so you are aware -- almost all current WSGI applications block, and can't be run in asynchronous environments. So if you are writing WSGI support that doesn't support applications that block, well, it won't really be able to do much with current WSGI code. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org : Write code, do good : http://topp.openplans.org/careers From manlio_perillo at libero.it Mon Oct 8 13:02:27 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 08 Oct 2007 13:02:27 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <47097BB1.502@colorstudy.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com> Message-ID: <470A0E43.3040806@libero.it> Ian Bicking ha scritto: > Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote: >>>> As an example, the WSGI write callable cannot be implemented in a >>>> conforming way in Nginx. >>> ...unless you use a separate thread or process. >>> >> >> I'm insisting about asynchronous support in WSGI because >> Nginx *does not supports threads*. >> >> It has some thread support but it is *broken*. 
>> Even if in future the problems are solved, the threading model of >> Nginx *will break* many existing WSGI applications, since the WSGI >> iteration can be resumed in different threads. > > Just so you are aware -- almost all current WSGI applications block, and > can't be run in asynchronous environments. Not every WSGI application "blocks" the request processing for a "significant" amount of time. A streaming application, as an example, can "block" without problems, since nginx mod_wsgi will pause the execution as soon as the application output cannot be written at once to the client. Moreover, as I have already written, using the wsgi.pause_output extension, it should be possible to write a WSGI "component" that runs the entire WSGI application in a separate thread (but, in this case, it MUST buffer the entire output, since nginx is not thread safe). Nginx can also use several worker processes, so it can still (somehow) serve "blocking" applications. > So if you are writing WSGI > support that doesn't support applications that block, well, it won't > really be able to do much with current WSGI code. > Supporting "legacy" and "huge" WSGI applications is not really a priority for me. I want some support for adding extensions that can be used by other WSGI implementations that want to support asynchronous applications in an asynchronous server. I can add "proprietary" extensions, but Python is already full of non-interoperable web solutions. P.S. Since, as I can see, many people on this mailing list are not interested in asynchronous support for WSGI, we can stop this thread (and further discussions) here. Regards Manlio Perillo From pje at telecommunity.com Mon Oct 8 13:17:11 2007 From: pje at telecommunity.com (Phillip J.
Eby) Date: Mon, 08 Oct 2007 07:17:11 -0400 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470A0E43.3040806@libero.it> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com> <470A0E43.3040806@libero.it> Message-ID: <20071008111827.32CD83A407C@sparrow.telecommunity.com> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote: >Supporting "legacy" and "huge" WSGI applications is not really a >priority for me. Then you should really make it clear to your users that your Nginx module does not support WSGI. The entire point of WSGI is to allow "legacy" (i.e. already-written applications) to be portable across servers. Something that doesn't run existing WSGI apps is clearly not WSGI. 
From manlio_perillo at libero.it Mon Oct 8 13:48:55 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 08 Oct 2007 13:48:55 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <20071008111827.32CD83A407C@sparrow.telecommunity.com> References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it> <20071004142648.D2AFB3A407A@sparrow.telecommunity.com> <4704FD32.9020604@libero.it> <20071004153734.1DFA33A407A@sparrow.telecommunity.com> <47050CC7.9030500@libero.it> <20071004161810.060183A407A@sparrow.telecommunity.com> <20071005173253.86D293A407B@sparrow.telecommunity.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com> <470A0E43.3040806@libero.it> <20071008111827.32CD83A407C@sparrow.telecommunity.com> Message-ID: <470A1927.5080403@libero.it> Phillip J. Eby ha scritto: > At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote: >> Supporting "legacy" and "huge" WSGI applications is not really a >> priority for me. > > Then you should really make it clear to your users that your Nginx > module does not support WSGI. The entire point of WSGI is to allow > "legacy" (i.e. already-written applications) to be portable across > servers. Something that doesn't run existing WSGI apps is clearly not > WSGI. > [Here I respond to the latest post of Graham, too.] Right, but actually nginx mod_wsgi *can* execute every WSGI application in a *conforming* way (I'm completing full support for WSGI 2.0, and after this I will implement WSGI 1.0). Of course some classes of WSGI applications runs *better* if they don't block the nginx process loop too much, so that nginx can serve multiple requests at the same time. It is simply a matter of optimized execution. 
Regards Manlio Perillo From graham.dumpleton at gmail.com Mon Oct 8 13:53:44 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Mon, 8 Oct 2007 21:53:44 +1000 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <470A1927.5080403@libero.it> References: <4704222D.30208@colorstudy.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com> <470A0E43.3040806@libero.it> <20071008111827.32CD83A407C@sparrow.telecommunity.com> <470A1927.5080403@libero.it> Message-ID: <88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com> On 08/10/2007, Manlio Perillo wrote: > Phillip J. Eby ha scritto: > > At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote: > >> Supporting "legacy" and "huge" WSGI applications is not really a > >> priority for me. > > > > Then you should really make it clear to your users that your Nginx > > module does not support WSGI. The entire point of WSGI is to allow > > "legacy" (i.e. already-written applications) to be portable across > > servers. Something that doesn't run existing WSGI apps is clearly not > > WSGI. > > > > [Here I respond to the latest post of Graham, too.] > > Right, but actually nginx mod_wsgi *can* execute every WSGI application > in a *conforming* way (I'm completing full support for WSGI 2.0, and > after this I will implement WSGI 1.0). > > Of course some classes of WSGI applications runs *better* if they don't > block the nginx process loop too much, so that nginx can serve multiple > requests at the same time. > > It is simply a matter of optimized execution. Do note that there only exists WSGI 1.0. There is no such thing as WSGI 2.0 as yet and you shouldn't really assume that the list of proposed ideas for discussion will actually end up producing anything that looks like what is described. 
All you can really do at present is implement WSGI 1.0, anything else is not WSGI and certainly not WSGI 2.0. Graham From manlio_perillo at libero.it Mon Oct 8 13:57:59 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 08 Oct 2007 13:57:59 +0200 Subject: [Web-SIG] WSGI 2.0 In-Reply-To: <88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com> References: <4704222D.30208@colorstudy.com> <20071005220455.1ABB23A407B@sparrow.telecommunity.com> <47074F97.8040604@libero.it> <20071006143337.D0BFF3A407A@sparrow.telecommunity.com> <4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com> <470A0E43.3040806@libero.it> <20071008111827.32CD83A407C@sparrow.telecommunity.com> <470A1927.5080403@libero.it> <88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com> Message-ID: <470A1B47.7090908@libero.it> Graham Dumpleton ha scritto: > On 08/10/2007, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >>> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote: >>>> Supporting "legacy" and "huge" WSGI applications is not really a >>>> priority for me. >>> Then you should really make it clear to your users that your Nginx >>> module does not support WSGI. The entire point of WSGI is to allow >>> "legacy" (i.e. already-written applications) to be portable across >>> servers. Something that doesn't run existing WSGI apps is clearly not >>> WSGI. >>> >> [Here I respond to the latest post of Graham, too.] >> >> Right, but actually nginx mod_wsgi *can* execute every WSGI application >> in a *conforming* way (I'm completing full support for WSGI 2.0, and >> after this I will implement WSGI 1.0). >> >> Of course some classes of WSGI applications runs *better* if they don't >> block the nginx process loop too much, so that nginx can serve multiple >> requests at the same time. >> >> It is simply a matter of optimized execution. > > Do note that there only exists WSGI 1.0. 
There is no such thing as > WSGI 2.0 as yet and you shouldn't really assume that the list of > proposed ideas for discussion will actually end up producing anything > that looks like what is described. All you can really do at present is > implement WSGI 1.0, anything else is not WSGI and certainly not WSGI > 2.0. > Right, and in the nginx mod_wsgi README I explicitly write that the current version is implementing the WSGI *draft*. The reason I'm implementing the WSGI 2.0 draft is that it allows a more simple code flow. > Graham > Regards Manlio Perillo From manlio_perillo at libero.it Mon Oct 8 18:25:00 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 08 Oct 2007 18:25:00 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> Message-ID: <470A59DC.1060905@libero.it> Phillip J. Eby ha scritto: > [...] > > I don't think there's any point to having a WSGI extension for If-* > header support. I have just found that the WSGI spec says: """...it should be clear that a server may handle cache validation via the If-None-Match and If-Modified-Since request headers and the Last-Modified and ETag response headers.""" So a WSGI implementation is *allowed* to perform cache validation, but it is not clear *how* this should be done. As an example, without the need of an extension, the start_response callable may check if Last-Modified or ETag is in the headers. In this case, it may perform a cache validation, and if the client representation is fresh, it may omit to send the body. 
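A sketch of the check that start_response might perform, under stated assumptions: the function name client_copy_is_fresh is hypothetical, entity tags are compared with the weak comparison and a naive comma split, and If-Range and If-Match are not handled. It only decides whether the client copy is fresh; it does not send anything.

```python
from email.utils import parsedate_to_datetime


def _opaque(tag):
    """Strip whitespace and a weak-validator prefix from an entity tag."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def client_copy_is_fresh(environ, status, headers):
    """Return True if a 200 response could be replaced by 304 Not Modified."""
    if not status.startswith("200"):
        return False
    response = {name.lower(): value for name, value in headers}

    if_none_match = environ.get("HTTP_IF_NONE_MATCH")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since.
        etag = response.get("etag")
        if etag is None:
            return False
        wanted = [_opaque(t) for t in if_none_match.split(",")]
        return "*" in wanted or _opaque(etag) in wanted

    if_modified_since = environ.get("HTTP_IF_MODIFIED_SINCE")
    last_modified = response.get("last-modified")
    if if_modified_since is not None and last_modified is not None:
        try:
            return (parsedate_to_datetime(last_modified)
                    <= parsedate_to_datetime(if_modified_since))
        except (TypeError, ValueError):
            # Unparseable or incomparable dates: treat the copy as stale.
            return False
    return False
```

A start_response implementation could call this with the status and headers it receives and remember the result, so that the body can be dropped later.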
However there are two problems here: 1) It is not clear if WSGI explicitly allows an implementation to skip the iteration over the app_iter object, for optimization purpose 2) For a WSGI implementation embedded in an existing webserver, the most convenient method to perform cache validation is to let the server do it; however this requires to send the headers as soon as start_response is called, and this is not allowed. Regards Manlio Perillo From t.broyer at gmail.com Mon Oct 8 19:49:06 2007 From: t.broyer at gmail.com (Thomas Broyer) Date: Mon, 8 Oct 2007 19:49:06 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <470A59DC.1060905@libero.it> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> Message-ID: 2007/10/8, Manlio Perillo: > Phillip J. Eby ha scritto: > > [...] > > > > I don't think there's any point to having a WSGI extension for If-* > > header support. > > I have just found that the WSGI spec says: > """...it should be clear that a server may handle cache validation via > the If-None-Match and If-Modified-Since request headers and the > Last-Modified and ETag response headers.""" > > > So a WSGI implementation is *allowed* to perform cache validation, but > it is not clear *how* this should be done. > > As an example, without the need of an extension, the start_response > callable may check if Last-Modified or ETag is in the headers. > In this case, it may perform a cache validation, and if the client > representation is fresh, it may omit to send the body. 
> > However there are two problems here: > 1) It is not clear if WSGI explicitly allows an implementation to skip > the iteration over the app_iter object, for optimization purpose > 2) For a WSGI implementation embedded in an existing webserver, the > most convenient method to perform cache validation is to let the > server do it; however this requires to send the headers as soon as > start_response is called, and this is not allowed. How about (not tested, and simplified to require the app to return an iterable, and without support for If-Range):

    def has_preconditions(environ):
        # True if the request carries any conditional (If-*) header.
        return ("HTTP_IF_MATCH" in environ
                or "HTTP_IF_NONE_MATCH" in environ
                or "HTTP_IF_MODIFIED_SINCE" in environ
                or "HTTP_IF_UNMODIFIED_SINCE" in environ)

    def matches_preconditions(environ, headers):
        # TODO: compare the If-* request headers with the ETag and
        # Last-Modified response headers.
        raise NotImplementedError

    def notmodified_middleware(application):
        def middleware(environ, start_response):
            notmodified = [False]

            def write(data):
                raise NotImplementedError("The write callback is deprecated")

            def sr(status, headers, exc_info=None):
                if status[0] == "2" and matches_preconditions(environ, headers):
                    start_response("304 Not Modified", headers, exc_info)
                    notmodified[0] = True
                    return write
                else:
                    notmodified[0] = False
                    return start_response(status, headers, exc_info)

            # Only intercept start_response for conditional GET requests.
            if environ["REQUEST_METHOD"] == "GET" and has_preconditions(environ):
                app_iter = application(environ, sr)
            else:
                app_iter = application(environ, start_response)
            if notmodified[0]:
                return ("", )
            else:
                return app_iter
        return middleware

We're still waiting for the app to complete (and return its app_iter) before sending anything to the client, but this doesn't prevent us from checking preconditions and, in that case, replacing the status with a 304 Not Modified and an empty body (ignoring the app_iter altogether; but maybe we should iterate it to allow the wrapped application to *really* complete its execution) -- Thomas Broyer From t.broyer at gmail.com Mon Oct 8 19:51:14 2007 From: t.broyer at gmail.com (Thomas Broyer) Date: Mon, 8 Oct 2007 19:51:14 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <470A59DC.1060905@libero.it> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> Message-ID: 2007/10/8, Manlio Perillo: > However there are two problems here: > 1) It is not clear if WSGI explicitly allows an implementation to skip > the iteration over the app_iter object, for optimization purpose > 2) For a WSGI implementation embedded in an existing webserver, the > most convenient method to perform cache validation is to let the > server do it; however this requires to send the headers as soon as > start_response is called, and this is not allowed. Oops, sorry, hadn't correctly understood what you were saying. Of course you're right here. -- Thomas Broyer From manlio_perillo at libero.it Mon Oct 8 20:19:57 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 08 Oct 2007 20:19:57 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> Message-ID: <470A74CD.3090602@libero.it> Thomas Broyer wrote: > 2007/10/8, Manlio Perillo: >> However there are two problems here: >> 1) It is not clear if WSGI explicitly allows an implementation to skip >> the iteration over the app_iter object, for optimization purpose >> 2) For a WSGI implementation embedded in an existing webserver, the >> most convenient method to perform cache validation is to let the >> server do it; however this requires to send the headers as soon as >> start_response is called, and this is not allowed. > > Oops, sorry, hadn't correctly understood what you were saying. Of > course you're right here. > A clarification: this is only an optimization.
Nginx will always do the cache validation (if the appropriate header filter is enabled) and will discard the body if the client has a fresh copy. The same applies to If-Range, but in this case it is not possible to optimize the WSGI application execution. Regards Manlio Perillo From pje at telecommunity.com Mon Oct 8 21:32:48 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Oct 2007 15:32:48 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <470A59DC.1060905@libero.it> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> Message-ID: <20071008193012.4213D3A407A@sparrow.telecommunity.com> At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote: >Phillip J. Eby wrote: > > [...] > > > > I don't think there's any point to having a WSGI extension for If-* > > header support. > >I have just found that the WSGI spec says: >"""...it should be clear that a server may handle cache validation via >the If-None-Match and If-Modified-Since request headers and the >Last-Modified and ETag response headers.""" > > >So a WSGI implementation is *allowed* to perform cache validation, but >it is not clear *how* this should be done. > >As an example, without the need of an extension, the start_response >callable may check if Last-Modified or ETag is in the headers. >In this case, it may perform a cache validation, and if the client >representation is fresh, it may omit to send the body.
> >However there are two problems here: >1) It is not clear if WSGI explicitly allows an implementation to skip > the iteration over the app_iter object, for optimization purpose >2) For a WSGI implementation embedded in an existing webserver, the > most convenient method to perform cache validation is to let the > server do it; however this requires to send the headers as soon as > start_response is called, and this is not allowed. The only time that the headers can be changed is if there is an error during the generation of the body content. So, the fact that send_headers() is called with a matching ETag or Last-Modified, is sufficient to determine that the request may be handled using a cache. You are correct that the PEP does not explicitly allow the iteration to be skipped. My thought is that it should indeed allow it, as long as the close() method (if any) is still called, and so long as the request method was a GET. With that clarification added to the existing spec, I think it should be possible to implement cache validation in a server. Hopefully, if anybody knows of a reason why this clarification should *not* be added to the spec, they will speak up now. :) From graham.dumpleton at gmail.com Tue Oct 9 00:23:53 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 9 Oct 2007 08:23:53 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071008193012.4213D3A407A@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> <20071008193012.4213D3A407A@sparrow.telecommunity.com> Message-ID: <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com> On 09/10/2007, Phillip J. Eby wrote: > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote: > >Phillip J. Eby ha scritto: > > > [...] 
> > > > > > I don't think there's any point to having a WSGI extension for If-* > > > header support. > > > >I have just found that the WSGI spec says: > >"""...it should be clear that a server may handle cache validation via > >the If-None-Match and If-Modified-Since request headers and the > >Last-Modified and ETag response headers.""" > > > > > >So a WSGI implementation is *allowed* to perform cache validation, but > >it is not clear *how* this should be done. > > > >As an example, without the need of an extension, the start_response > >callable may check if Last-Modified or ETag is in the headers. > >In this case, it may perform a cache validation, and if the client > >representation is fresh, it may omit to send the body. > > > >However there are two problems here: > >1) It is not clear if WSGI explicitly allows an implementation to skip > > the iteration over the app_iter object, for optimization purpose > >2) For a WSGI implementation embedded in an existing webserver, the > > most convenient method to perform cache validation is to let the > > server do it; however this requires to send the headers as soon as > > start_response is called, and this is not allowed. > > The only time that the headers can be changed is if there is an error > during the generation of the body content. So, the fact that > send_headers() is called with a matching ETag or Last-Modified, is > sufficient to determine that the request may be handled using a cache. > > You are correct that the PEP does not explicitly allow the iteration > to be skipped. My thought is that it should indeed allow it, as long > as the close() method (if any) is still called, and so long as the > request method was a GET. Why only a GET? Just showing my ignorance here and would like it explained. :-) Graham > With that clarification added to the existing spec, I think it should > be possible to implement cache validation in a server. 
> > Hopefully, if anybody knows of a reason why this clarification should > *not* be added to the spec, they will speak up now. :) From pje at telecommunity.com Tue Oct 9 03:10:50 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 08 Oct 2007 21:10:50 -0400 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.co m> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> <20071008193012.4213D3A407A@sparrow.telecommunity.com> <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com> Message-ID: <20071009011039.249713A40BF@sparrow.telecommunity.com> At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote: >On 09/10/2007, Phillip J. Eby wrote: > > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote: > > >Phillip J. Eby ha scritto: > > > > [...] > > > > > > > > I don't think there's any point to having a WSGI extension for If-* > > > > header support. > > > > > >I have just found that the WSGI spec says: > > >"""...it should be clear that a server may handle cache validation via > > >the If-None-Match and If-Modified-Since request headers and the > > >Last-Modified and ETag response headers.""" > > > > > > > > >So a WSGI implementation is *allowed* to perform cache validation, but > > >it is not clear *how* this should be done. > > > > > >As an example, without the need of an extension, the start_response > > >callable may check if Last-Modified or ETag is in the headers. > > >In this case, it may perform a cache validation, and if the client > > >representation is fresh, it may omit to send the body. 
> > > > > >However there are two problems here: > > >1) It is not clear if WSGI explicitly allows an implementation to skip > > > the iteration over the app_iter object, for optimization purpose > > >2) For a WSGI implementation embedded in an existing webserver, the > > > most convenient method to perform cache validation is to let the > > > server do it; however this requires to send the headers as soon as > > > start_response is called, and this is not allowed. > > > > The only time that the headers can be changed is if there is an error > > during the generation of the body content. So, the fact that > > send_headers() is called with a matching ETag or Last-Modified, is > > sufficient to determine that the request may be handled using a cache. > > > > You are correct that the PEP does not explicitly allow the iteration > > to be skipped. My thought is that it should indeed allow it, as long > > as the close() method (if any) is still called, and so long as the > > request method was a GET. > >Why only a GET? > >Just showing my ignorance here and would like it explained. :-) Since GET is supposed to be side effect-free, skipping the calculation of the response body (by not iterating over it) is less likely to cause a problem than with another request method. I guess HEAD would be safe, too. If we were just now defining WSGI 1.0, I would let it be any method and explicitly document that servers doing cache validation or processing a HEAD method can skip iteration of the body, so that you can get better performance. However, if we put this language into WSGI 1.0, I'm wary of breaking stuff that exists in the field; indeed it might be better just to say that it's up to the user to add middleware to do this, rather than trying to get a common standard for how servers should handle it. 
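To make the proposed clarification concrete, here is a minimal sketch of how a server might finish a response while skipping iteration, assuming it has already decided that the client copy is fresh. The names finish_response and TrackedBody are hypothetical and not taken from the PEP or any server. The body is skipped only for HEAD, or for GET with a fresh client copy, and close() is called in every case.

```python
def finish_response(app_iter, method, client_is_fresh, send):
    """Send the body unless it can be skipped; always call close()."""
    try:
        skip = method == "HEAD" or (method == "GET" and client_is_fresh)
        if not skip:
            for chunk in app_iter:
                if chunk:
                    send(chunk)
    finally:
        # close() must run whether or not the iterable was consumed.
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


class TrackedBody:
    """Test helper that records whether it was iterated and closed."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.iterated = False
        self.closed = False

    def __iter__(self):
        self.iterated = True
        return iter(self.chunks)

    def close(self):
        self.closed = True


sent = []
cached = TrackedBody([b"never computed"])
finish_response(cached, "GET", True, sent.append)
```

After this call the body has been closed without being iterated, and nothing has been sent. A POST with the same arguments would iterate as usual, which matches the caution above about request methods that may have side effects.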
From graham.dumpleton at gmail.com Tue Oct 9 03:19:42 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Tue, 9 Oct 2007 11:19:42 +1000 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071009011039.249713A40BF@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> <20071008193012.4213D3A407A@sparrow.telecommunity.com> <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com> <20071009011039.249713A40BF@sparrow.telecommunity.com> Message-ID: <88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com> On 09/10/2007, Phillip J. Eby wrote: > At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote: > >On 09/10/2007, Phillip J. Eby wrote: > > > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote: > > > >Phillip J. Eby ha scritto: > > > > > [...] > > > > > > > > > > I don't think there's any point to having a WSGI extension for If-* > > > > > header support. > > > > > > > >I have just found that the WSGI spec says: > > > >"""...it should be clear that a server may handle cache validation via > > > >the If-None-Match and If-Modified-Since request headers and the > > > >Last-Modified and ETag response headers.""" > > > > > > > > > > > >So a WSGI implementation is *allowed* to perform cache validation, but > > > >it is not clear *how* this should be done. > > > > > > > >As an example, without the need of an extension, the start_response > > > >callable may check if Last-Modified or ETag is in the headers. > > > >In this case, it may perform a cache validation, and if the client > > > >representation is fresh, it may omit to send the body. 
> > > > > > > >However there are two problems here: > > > >1) It is not clear if WSGI explicitly allows an implementation to skip > > > > the iteration over the app_iter object, for optimization purpose > > > >2) For a WSGI implementation embedded in an existing webserver, the > > > > most convenient method to perform cache validation is to let the > > > > server do it; however this requires to send the headers as soon as > > > > start_response is called, and this is not allowed. > > > > > > The only time that the headers can be changed is if there is an error > > > during the generation of the body content. So, the fact that > > > send_headers() is called with a matching ETag or Last-Modified, is > > > sufficient to determine that the request may be handled using a cache. > > > > > > You are correct that the PEP does not explicitly allow the iteration > > > to be skipped. My thought is that it should indeed allow it, as long > > > as the close() method (if any) is still called, and so long as the > > > request method was a GET. > > > >Why only a GET? > > > >Just showing my ignorance here and would like it explained. :-) > > Since GET is supposed to be side effect-free, skipping the > calculation of the response body (by not iterating over it) is less > likely to cause a problem than with another request method. I guess > HEAD would be safe, too. Except that with the way that people use query strings to a GET instead of a POST with form data in the body, that GET can technically also have a content body, and how people in general abuse the method type, that probably often isn't the case. This is why I was querying the distinction, as not sure one can really say it is different to other methods unless HTTP specifications do indicate as much at least in relation to caching. Caching is an area I have never really looked, so I don't really know what the specifications say so this could all be irrelevant. 
:-) Graham > If we were just now defining WSGI 1.0, I would let it be any method > and explicitly document that servers doing cache validation or > processing a HEAD method can skip iteration of the body, so that you > can get better performance. > > However, if we put this language into WSGI 1.0, I'm wary of breaking > stuff that exists in the field; indeed it might be better just to say > that it's up to the user to add middleware to do this, rather than > trying to get a common standard for how servers should handle it. > > From t.broyer at gmail.com Tue Oct 9 09:05:01 2007 From: t.broyer at gmail.com (Thomas Broyer) Date: Tue, 9 Oct 2007 09:05:01 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> <20071008193012.4213D3A407A@sparrow.telecommunity.com> <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com> <20071009011039.249713A40BF@sparrow.telecommunity.com> <88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com> Message-ID: 2007/10/9, Graham Dumpleton : > On 09/10/2007, Phillip J. Eby wrote: > > > > Since GET is supposed to be side effect-free, skipping the > > calculation of the response body (by not iterating over it) is less > > likely to cause a problem than with another request method. I guess > > HEAD would be safe, too. > > Except that with the way that people use query strings to a GET > instead of a POST with form data in the body, that GET can technically > also have a content body, and how people in general abuse the method > type, that probably often isn't the case. 
This is why I was querying > the distinction, as not sure one can really say it is different to > other methods unless HTTP specifications do indicate as much at least > in relation to caching. Caching is an area I have never really looked, > so I don't really know what the specifications say so this could all > be irrelevant. :-) Except that in this case, they probably don't send Last-Modified or ETag headers, or if they do, their value is probably (almost) unique to the request. People abusing GET probably don't care about caching, so they won't plug or enable such middlewares. And even if they'd do, well, it's HTTP: such a middleware isn't much different from a caching proxy/relay. Note also that there are less abuses of GET each day (thanks to Google Web Accelerator pre-fetching which highlighted the problem; and Web 2.0, AJAX and ReST becoming widespread and "educating" web developers) -- Thomas Broyer From manlio_perillo at libero.it Tue Oct 9 10:43:30 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 09 Oct 2007 10:43:30 +0200 Subject: [Web-SIG] [extension] x-wsgiorg.flush In-Reply-To: <20071009011039.249713A40BF@sparrow.telecommunity.com> References: <4703ADE1.5040507@libero.it> <20071003165020.23FAA3A407A@sparrow.telecommunity.com> <4703CB72.6080308@libero.it> <20071003175813.7DCEA3A407A@sparrow.telecommunity.com> <470A59DC.1060905@libero.it> <20071008193012.4213D3A407A@sparrow.telecommunity.com> <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com> <20071009011039.249713A40BF@sparrow.telecommunity.com> Message-ID: <470B3F32.9020005@libero.it> Phillip J. Eby ha scritto: > [...] > If we were just now defining WSGI 1.0, I would let it be any method and > explicitly document that servers doing cache validation or processing a > HEAD method can skip iteration of the body, so that you can get better > performance. 
> > However, if we put this language into WSGI 1.0, I'm wary of breaking > stuff that exists in the field; indeed it might be better just to say > that it's up to the user to add middleware to do this, rather than > trying to get a common standard for how servers should handle it. > You can always publish an addendum or errata to WSGI 1.0, or just WSGI 1.1 Regards Manlio Perillo From manlio_perillo at libero.it Mon Oct 15 17:52:58 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 17:52:58 +0200 Subject: [Web-SIG] some questions about start_response implementation Message-ID: <47138CDA.80808@libero.it> Hi. I'm implementing the start_response callable for Nginx mod_wsgi and I have a few questions. 1) From the WSGI PEP it seems that an implementation is allowed to *always* raise an exception when start_response is called with a not null exc_info. Is this true? 2) What happens if an application call start_response with an incorrect status line or headers? Should an implementation consider the function "called", so that an application can call it a second time, *without* the exc_info parameter? 3) How many applications/frameworks use the exc_info parameter for start_response? Thanks Manlio Perillo From manlio_perillo at libero.it Mon Oct 15 17:55:45 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 17:55:45 +0200 Subject: [Web-SIG] some questions about start_response implementation In-Reply-To: <47138CDA.80808@libero.it> References: <47138CDA.80808@libero.it> Message-ID: <47138D81.1090505@libero.it> Manlio Perillo ha scritto: > Hi. > > I'm implementing the start_response callable for Nginx mod_wsgi and I > have a few questions. > > [...] > > 2) What happens if an application call start_response with an incorrect > status line or headers? > > Should an implementation consider the function "called", so that an ^^^^^^ not called > application can call it a second time, *without* the exc_info > parameter? 
> Manlio Perillo From pje at telecommunity.com Mon Oct 15 18:11:39 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 15 Oct 2007 12:11:39 -0400 Subject: [Web-SIG] some questions about start_response implementation In-Reply-To: <47138CDA.80808@libero.it> References: <47138CDA.80808@libero.it> Message-ID: <20071015160857.219913A40AF@sparrow.telecommunity.com> At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote: >Hi. > >I'm implementing the start_response callable for Nginx mod_wsgi and I >have a few questions. > >1) From the WSGI PEP it seems that an implementation is allowed to > *always* raise an exception when start_response is called with a not > null exc_info. > > Is this true? Yes - as long as it's the exc_info passed in, i.e.:

    try:
        raise exc_info[0], exc_info[1], exc_info[2]
    finally:
        del exc_info

(this pattern of raising prevents the possibility of a reference cycle passing through the current stack location, keeping lots of objects around longer than necessary) >2) What happens if an application call start_response with an incorrect > status line or headers? > > Should an implementation consider the function "called", so that an > application can call it a second time, *without* the exc_info > parameter? Interesting point. I think it would be compliant either way, though. (I'm skipping your third question because it doesn't matter how many frameworks use exc_info; if you're implementing WSGI 1.0 you have to support it.) From manlio_perillo at libero.it Mon Oct 15 18:21:01 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 18:21:01 +0200 Subject: [Web-SIG] some questions about start_response implementation In-Reply-To: <20071015160857.219913A40AF@sparrow.telecommunity.com> References: <47138CDA.80808@libero.it> <20071015160857.219913A40AF@sparrow.telecommunity.com> Message-ID: <4713936D.2030001@libero.it> Phillip J. Eby ha scritto: > At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote: >> Hi.
>> >> I'm implementing the start_response callable for Nginx mod_wsgi and I >> have a few questions. >> >> 1) From the WSGI PEP it seems that an implementation is allowed to >> *always* raise an exception when start_response is called with a not >> null exc_info. >> >> Is this true? > > Yes - as long as it's the exc_info passed in, i.e.: It seems that WSGI *does not* requires the application to raise the exc_info passed. > > try: > raise exc_info[0], exc_info[1], exc_info[2] > finally: > del exc_info > > (this pattern of raising prevents the possibility of a reference cycle > passing through the current stack location, keeping lots of objects > around longer than necessary) Is this a concern for an implementation in C, too? > > > >> 2) What happens if an application call start_response with an incorrect >> status line or headers? >> >> Should an implementation consider the function "called", so that an >> application can call it a second time, *without* the exc_info >> parameter? > > Interesting point. I think it would be compliant either way, though. > > (I'm skipping your third question because it doesn't matter how many > frameworks use exc_info; if you're implementing WSGI 1.0 you have to > support it.) > Well, I'm asking because in the current implementation I always raise an exception, thus not allowing an application to "change its mind". Its not a big problem to improve the code, but I can delay it if not really required. Thanks and regards Manlio Perillo From pje at telecommunity.com Mon Oct 15 18:45:45 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 15 Oct 2007 12:45:45 -0400 Subject: [Web-SIG] some questions about start_response implementation In-Reply-To: <4713936D.2030001@libero.it> References: <47138CDA.80808@libero.it> <20071015160857.219913A40AF@sparrow.telecommunity.com> <4713936D.2030001@libero.it> Message-ID: <20071015164302.117163A408F@sparrow.telecommunity.com> At 06:21 PM 10/15/2007 +0200, Manlio Perillo wrote: >Phillip J. 
Eby ha scritto: > > At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote: > >> Hi. > >> > >> I'm implementing the start_response callable for Nginx mod_wsgi and I > >> have a few questions. > >> > >> 1) From the WSGI PEP it seems that an implementation is allowed to > >> *always* raise an exception when start_response is called with a not > >> null exc_info. > >> > >> Is this true? > > > > Yes - as long as it's the exc_info passed in, i.e.: > >It seems that WSGI *does not* requires the application to raise the >exc_info passed. We're talking about the *server*, not the application: "if exc_info is provided, and the HTTP headers have already been sent, start_response MUST raise an error, and SHOULD raise the exc_info tuple." So, it's a "should" for the server, with the intent being that you should have some special reason for not doing so. This is later clarified in the PEP as meaning that exception-handling middleware may have reasons to raise an alternative error or not raise an error. However, there aren't any anticipated use cases for server gateways to do anything but raise the passed-in errors. > > > > try: > > raise exc_info[0], exc_info[1], exc_info[2] > > finally: > > del exc_info > > > > (this pattern of raising prevents the possibility of a reference cycle > > passing through the current stack location, keeping lots of objects > > around longer than necessary) > >Is this a concern for an implementation in C, too? No, because local variables in C don't get stored in a Python frame or traceback. The above is only relevant if start_response() is written in Python. >Well, I'm asking because in the current implementation I always raise an >exception, thus not allowing an application to "change its mind". Yeah, it's not required for an application to change its mind and send different non-error headers. I don't think that such an application would be WSGI compliant if it did. 
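[Editorial sketch, not part of the original thread.] The server-side rule discussed above (re-raise the passed-in exc_info once headers have been sent, otherwise let the application replace the headers) might look roughly like this in present-day Python 3. The names make_start_response, state and write are hypothetical and exist only for illustration; this is a minimal sketch under those assumptions, not the nginx mod_wsgi implementation.

```python
def make_start_response():
    """Build a (start_response, state) pair for one request.

    `state` is a hypothetical per-request record used only in this sketch.
    """
    state = {"status": None, "headers": None, "headers_sent": False, "body": []}

    def write(data):
        # In this sketch the first write commits the headers.
        state["headers_sent"] = True
        state["body"].append(data)

    def start_response(status, response_headers, exc_info=None):
        if exc_info is not None:
            try:
                if state["headers_sent"]:
                    # Too late to change the response: re-raise the
                    # error that the application passed in.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                # Drop the local reference so the traceback cannot keep
                # this frame alive through a reference cycle.
                exc_info = None
        elif state["status"] is not None:
            raise AssertionError("start_response called twice without exc_info")
        state["status"] = status
        state["headers"] = list(response_headers)
        return write

    return start_response, state
```

Before any body data is written, a second call carrying exc_info simply replaces the stored status and headers; after the first write() it re-raises the application's own exception, which is the "SHOULD raise the exc_info tuple" behaviour quoted from the PEP.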
From manlio_perillo at libero.it Mon Oct 15 19:04:09 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 19:04:09 +0200 Subject: [Web-SIG] some questions about start_response implementation In-Reply-To: <20071015164302.117163A408F@sparrow.telecommunity.com> References: <47138CDA.80808@libero.it> <20071015160857.219913A40AF@sparrow.telecommunity.com> <4713936D.2030001@libero.it> <20071015164302.117163A408F@sparrow.telecommunity.com> Message-ID: <47139D89.1050804@libero.it> Phillip J. Eby ha scritto: > At 06:21 PM 10/15/2007 +0200, Manlio Perillo wrote: >> Phillip J. Eby ha scritto: >> > At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote: >> >> Hi. >> >> >> >> I'm implementing the start_response callable for Nginx mod_wsgi and I >> >> have a few questions. >> >> >> >> 1) From the WSGI PEP it seems that an implementation is allowed to >> >> *always* raise an exception when start_response is called with >> a not >> >> null exc_info. >> >> >> >> Is this true? >> > >> > Yes - as long as it's the exc_info passed in, i.e.: >> >> It seems that WSGI *does not* requires the application to raise the >> exc_info passed. > > We're talking about the *server*, not the application: > Sorry, I have written application, but I meant server :-). > "if exc_info is provided, and the HTTP headers have already been sent, > start_response MUST raise an error, and SHOULD raise the exc_info tuple." > > So, it's a "should" for the server, with the intent being that you > should have some special reason for not doing so. This is later > clarified in the PEP as meaning that exception-handling middleware may > have reasons to raise an alternative error or not raise an error. > However, there aren't any anticipated use cases for server gateways to > do anything but raise the passed-in errors. > Ok, thanks for the clarification. > [...] 
Regards Manlio Perillo From manlio_perillo at libero.it Mon Oct 15 22:06:08 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 22:06:08 +0200 Subject: [Web-SIG] some questions about the write callable Message-ID: <4713C830.2010709@libero.it> Hi. The only feature that remains to implement for nginx mod_wsgi is the write callable. The WSGI spec says: """In other words, before write() returns, it must guarantee that the passed-in string was either completely sent to the client, or that it is buffered for transmission while the application proceeds onward.""" With Nginx it can happen that the passed-in string cannot be completely sent to the client, since the socket can returns an EAGAIN. In this case Nginx will buffer the data and it will send the buffer to the client when the socket is ready. This is fully supported by nginx mod_wsgi, when the application returns a generator, since nginx mod_wsgi will suspend the execution of the application until the previous buffer has been entirely written to the client. Unfortunately, this is not possible with the write callable. This means that Nginx will try to send the data to the client, *only* when the write function is called. In other words, the transmission may become stalled if the application blocks and a previous passed-in string is in a nginx buffer. I don't understand why WSGI explicitly says '*must not* delay', instead of a 'should not delay'. There is another, more interesting, problem, however. As far as I can understand, WSGI does not explicitly forbids an application to call the write callable from a separate thread. This means that, in theory, this is allowed. Is this true? How many applications, if any, do this? Since Nginx is not thread safe, this *cannot* be supported, really. If a new WSGI 1.1 spec is going to be released, I hope that it will be more friendly with asynchronous servers without threads support. 
Thanks Manlio Perillo From manlio_perillo at libero.it Mon Oct 15 23:25:06 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 15 Oct 2007 23:25:06 +0200 Subject: [Web-SIG] some questions about the write callable In-Reply-To: <4713C830.2010709@libero.it> References: <4713C830.2010709@libero.it> Message-ID: <4713DAB2.3050500@libero.it> Manlio Perillo ha scritto: > Hi. > > The only feature that remains to implement for nginx mod_wsgi is the > write callable. > > The WSGI spec says: > """In other words, before write() returns, it must guarantee that the > passed-in string was either completely sent to the client, or that it is > buffered for transmission while the application proceeds onward.""" > > > With Nginx it can happen that the passed-in string cannot be completely > sent to the client, since the socket can returns an EAGAIN. > > In this case Nginx will buffer the data and it will send the buffer to > the client when the socket is ready. > A correction: Nginx will not buffer the data; it will ignore successive write requests. The buffering must be done by the application. For the moment I will raise an exception when the data cannot be completely written to the client (IMHO the WSGI spec does not forbid this, but, of course, it is not very useful). Regards Manlio Perillo From pje at telecommunity.com Tue Oct 16 00:27:49 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon, 15 Oct 2007 18:27:49 -0400 Subject: [Web-SIG] some questions about the write callable In-Reply-To: <4713C830.2010709@libero.it> References: <4713C830.2010709@libero.it> Message-ID: <20071015222512.EC0B73A408F@sparrow.telecommunity.com> At 10:06 PM 10/15/2007 +0200, Manlio Perillo wrote: >Hi. > >The only feature that remains to implement for nginx mod_wsgi is the >write callable.
> >The WSGI spec says: >"""In other words, before write() returns, it must guarantee that the >passed-in string was either completely sent to the client, or that it is >buffered for transmission while the application proceeds onward.""" > > >With Nginx it can happen that the passed-in string cannot be completely >sent to the client, since the socket can returns an EAGAIN. In which case, your write() implementation will need to loop until all the data hits the OS-level buffers. >In this case Nginx will buffer the data and it will send the buffer to >the client when the socket is ready. Note that the two choices are: 1. data is completely sent to the client 2. data is held in a buffer *such that transmission will continue while the app runs* Buffering the data but not sending it while the application continues executing, is not a conformant option. >I don't understand why WSGI explicitly says '*must not* delay', instead >of a 'should not delay'. Because the only reason for having write() or iteration blocks (vs sending a single giant string) is to support interleaving the client communication and some other computation, communication, or I/O. Delay would negate the point of having the ability to stream in the first place. >As far as I can understand, WSGI does not explicitly forbids an >application to call the write callable from a separate thread. >This means that, in theory, this is allowed. In theory, yes. In practice, we intended to document some thread-affinity restrictions, and I do not believe that anybody is trying to call write() from another thread. >If a new WSGI 1.1 spec is going to be released, I hope that it will be >more friendly with asynchronous servers without threads support. Well, I hope that the *documentation* will be more friendly for implementing gateways for such servers. It's doubtful that the actual execution model would change much. 
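[Editorial sketch, not part of the original thread.] The suggestion above, that write() should loop until all the data has reached the OS-level buffers, could be sketched as follows for a plain non-blocking socket. The function name blocking_write is hypothetical, and the sketch assumes the gateway may block the calling thread while waiting; as the follow-up message notes, that assumption may not hold inside an event-driven server such as Nginx.

```python
import select
import socket


def blocking_write(sock, data):
    """Return only after every byte of `data` was accepted by the OS.

    `sock` is assumed to be a connected, possibly non-blocking socket.
    """
    view = memoryview(data)
    while len(view):
        try:
            sent = sock.send(view)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: the send buffer is full, so wait
            # until the socket is writable and then try again.
            select.select([], [sock], [])
            continue
        # Keep only the part that has not been sent yet.
        view = view[sent:]
```

This covers the first of the two conforming choices (data completely handed over before write() returns). The second choice, buffering with transmission continuing while the application runs, would need cooperation from the server's event loop and is not shown.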
From manlio_perillo at libero.it Tue Oct 16 12:15:16 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 16 Oct 2007 12:15:16 +0200 Subject: [Web-SIG] some questions about the write callable In-Reply-To: <20071015222512.EC0B73A408F@sparrow.telecommunity.com> References: <4713C830.2010709@libero.it> <20071015222512.EC0B73A408F@sparrow.telecommunity.com> Message-ID: <47148F34.1020509@libero.it> Phillip J. Eby ha scritto: > At 10:06 PM 10/15/2007 +0200, Manlio Perillo wrote: >> Hi. >> >> The only feature that remains to implement for nginx mod_wsgi is the >> write callable. >> >> The WSGI spec says: >> """In other words, before write() returns, it must guarantee that the >> passed-in string was either completely sent to the client, or that it is >> buffered for transmission while the application proceeds onward.""" >> >> >> With Nginx it can happen that the passed-in string cannot be completely >> sent to the client, since the socket can returns an EAGAIN. > > In which case, your write() implementation will need to loop until all > the data hits the OS-level buffers. > It seems that this is not possible with Nginx, but I will investigate this problem better, since it is the best solution. > >> In this case Nginx will buffer the data and it will send the buffer to >> the client when the socket is ready. > > Note that the two choices are: > > 1. data is completely sent to the client > 2. data is held in a buffer *such that transmission will continue while > the app runs* > > Buffering the data but not sending it while the application continues > executing, is not a conformant option. > > >> I don't understand why WSGI explicitly says '*must not* delay', instead >> of a 'should not delay'. > > Because the only reason for having write() or iteration blocks (vs > sending a single giant string) is to support interleaving the client > communication and some other computation, communication, or I/O. 
> > Delay would negate the point of having the ability to stream in the > first place. > You are right, but this is only required by a "real" streaming application (one that does not have an "end"). Even if an application need to serve, as an example, a file of about 100 MB, buffering should not be a problem (and the Nginx buffering model is efficient). I'm not even sure if HTTP 1.1 allows an "infinite" stream. > >> As far as I can understand, WSGI does not explicitly forbids an >> application to call the write callable from a separate thread. >> This means that, in theory, this is allowed. > > In theory, yes. In practice, we intended to document some > thread-affinity restrictions, and I do not believe that anybody is > trying to call write() from another thread. > > >> If a new WSGI 1.1 spec is going to be released, I hope that it will be >> more friendly with asynchronous servers without threads support. > > Well, I hope that the *documentation* will be more friendly for > implementing gateways for such servers. It's doubtful that the actual > execution model would change much. > Ok, thanks Manlio Perillo From manlio_perillo at libero.it Tue Oct 16 17:42:59 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 16 Oct 2007 17:42:59 +0200 Subject: [Web-SIG] [extension] wsgi.info Message-ID: <4714DC03.7060003@libero.it> Hi. I find it strange that the WSGI environ dictionary contains no information about some "details" of the implementation. 
I think it would be useful to have a wsgi.info variable that returns a tuple with two strings: - the name of the implementation - the version of the implementation Example: wsgi.info = ('nginx mod_wsgi', '0.0.4') Manlio Perillo From ianb at colorstudy.com Tue Oct 16 18:10:23 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 16 Oct 2007 11:10:23 -0500 Subject: [Web-SIG] [extension] wsgi.info In-Reply-To: <4714DC03.7060003@libero.it> References: <4714DC03.7060003@libero.it> Message-ID: <4714E26F.8080606@colorstudy.com> Manlio Perillo wrote: > Hi. > > I find it strange that the WSGI environ dictionary contains no > information about some "details" of the implementation. > > I think it would be useful to have a wsgi.info variable that returns a > tuple with two strings: > - the name of the implementation > - the version of the implementation > > Example: > wsgi.info = ('nginx mod_wsgi', '0.0.4') The details of what implementation? The server? The thing that called the app? The thing that called the app and the thing that called it? OTOH, there's a SERVER_SOFTWARE CGI variable, I believe. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org : Write code, do good : http://topp.openplans.org/careers From manlio_perillo at libero.it Tue Oct 16 18:52:58 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Tue, 16 Oct 2007 18:52:58 +0200 Subject: [Web-SIG] [extension] wsgi.info In-Reply-To: <4714E26F.8080606@colorstudy.com> References: <4714DC03.7060003@libero.it> <4714E26F.8080606@colorstudy.com> Message-ID: <4714EC6A.1040903@libero.it> Ian Bicking ha scritto: > Manlio Perillo wrote: >> Hi. >> >> I find it strange that the WSGI environ dictionary contains no >> information about some "details" of the implementation. 
>> >> I think it would be useful to have a wsgi.info variable that returns a >> tuple with two strings: >> - the name of the implementation >> - the version of the implementation >> >> Example: >> wsgi.info = ('nginx mod_wsgi', '0.0.4') > > The details of what implementation? The server? The thing that called > the app? The WSGI gateway. > The thing that called the app and the thing that called it? > The former. > OTOH, there's a SERVER_SOFTWARE CGI variable, I believe. > But this refers to the HTTP server. Regards Manlio Perillo From MDiPierro at cti.depaul.edu Wed Oct 17 06:24:35 2007 From: MDiPierro at cti.depaul.edu (Massimo Di Pierro) Date: Tue, 16 Oct 2007 23:24:35 -0500 Subject: [Web-SIG] Gluon 1.6 Message-ID: I have a new version of Gluon out (known bugs fixed) and a video http://www.youtube.com/watch?v=VBjja6N6IYk Thank you to those who expressed interest. I would like to stress that this is a open source project released under GPL2 and I could really use community input to make it better (for example I did not have time to test it with mod_wsgi, I use paste httpserver). I say this since the project wikipedia page has been shut down, claiming this is a commercial product, which is not. Massimo From graham.dumpleton at gmail.com Wed Oct 17 07:01:57 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Wed, 17 Oct 2007 15:01:57 +1000 Subject: [Web-SIG] Gluon 1.6 In-Reply-To: References: Message-ID: <88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com> Helps if you send a URL for the Gluon web site rather than a YouTube video. BTW, if it is under the GPL why don't you clearly mention that on the web site front page. I can't see a reference to GPL or even a link to a page describing licence used on the front page. Can't seem to see anything in the FAQ either about the licence used. 
Graham On 17/10/2007, Massimo Di Pierro wrote: > I have a new version of Gluon out (known bugs fixed) and a video > > http://www.youtube.com/watch?v=VBjja6N6IYk > > Thank you to those who expressed interest. > I would like to stress that this is a open source project released > under GPL2 and I could really use community input to make it better > (for example I did not have time to test it with mod_wsgi, I use > paste httpserver). I say this since the project wikipedia page has > been shut down, claiming this is a commercial product, which is not. > > Massimo > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: http://mail.python.org/mailman/options/web-sig/graham.dumpleton%40gmail.com > From MDiPierro at cti.depaul.edu Wed Oct 17 16:29:35 2007 From: MDiPierro at cti.depaul.edu (Massimo Di Pierro) Date: Wed, 17 Oct 2007 09:29:35 -0500 Subject: [Web-SIG] Gluon 1.6 In-Reply-To: <88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com> References: <88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com> Message-ID: <62BCE0FA-F3CC-4638-8B68-1DD53AF189AA@cti.depaul.edu> Good point. Just did that (the license is in the code anyway). The url is http://mdp.cti.depaul.edu/examples Thank you Graham. Massimo On Oct 17, 2007, at 12:01 AM, Graham Dumpleton wrote: > Helps if you send a URL for the Gluon web site rather than a > YouTube video. > > BTW, if it is under the GPL why don't you clearly mention that on the > web site front page. I can't see a reference to GPL or even a link to > a page describing licence used on the front page. Can't seem to see > anything in the FAQ either about the licence used. > > Graham > > On 17/10/2007, Massimo Di Pierro wrote: >> I have a new version of Gluon out (known bugs fixed) and a video >> >> http://www.youtube.com/watch?v=VBjja6N6IYk >> >> Thank you to those who expressed interest. 
>> I would like to stress that this is a open source project released >> under GPL2 and I could really use community input to make it better >> (for example I did not have time to test it with mod_wsgi, I use >> paste httpserver). I say this since the project wikipedia page has >> been shut down, claiming this is a commercial product, which is not. >> >> Massimo From manlio_perillo at libero.it Fri Oct 19 15:14:35 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 19 Oct 2007 15:14:35 +0200 Subject: [Web-SIG] about the status line in WSGI Message-ID: <4718ADBB.6030804@libero.it> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the status line returned by the WSGI application, and to use a server defined phrase? Thanks and regards Manlio Perillo From fumanchu at aminus.org Fri Oct 19 17:14:21 2007 From: fumanchu at aminus.org (Robert Brewer) Date: Fri, 19 Oct 2007 08:14:21 -0700 Subject: [Web-SIG] about the status line in WSGI In-Reply-To: <4718ADBB.6030804@libero.it> References: <4718ADBB.6030804@libero.it> Message-ID: Manlio Perillo wrote: > Is a WSGI gateway allowed to ignore the Reason-Phrase part of the > status line returned by the WSGI application, and to use a server > defined phrase? I would be sad if a WSGI gateway did that to me. Why deny a web application developer the right to control that part of the output? Robert Brewer fumanchu at aminus.org
From manlio_perillo at libero.it Fri Oct 19 20:42:05 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 19 Oct 2007 20:42:05 +0200 Subject: [Web-SIG] about the status line in WSGI In-Reply-To: References: <4718ADBB.6030804@libero.it> Message-ID: <4718FA7D.1030202@libero.it> Robert Brewer ha scritto: > Manlio Perillo wrote: >> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the >> status line returned by the WSGI application, and to use a server >> defined phrase? > > I would be sad if a WSGI gateway did that to me. > Why deny a web > application developer the right to control that part of the output? > The WSGI spec requires a full status line as a simplification for the WSGI Gateway and not to give more control to WSGI applications. Regards Manlio Perillo From manlio_perillo at libero.it Fri Oct 19 20:55:32 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 19 Oct 2007 20:55:32 +0200 Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation Message-ID: <4718FDA4.9080809@libero.it> The WSGI spec requires the response headers and sequence items to be, respectively, List of Tuples and Strings. However only for the response headers it explicitly requires them to be a Python List, i.e type(response_headers) is ListType. What about the other objects? In the current implementation of WSGI for Nginx I always use Py[Type]_Check, and not Py[Type]_CheckExact.
Thanks and regards Manlio Perillo From ianb at colorstudy.com Fri Oct 19 21:02:31 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Fri, 19 Oct 2007 14:02:31 -0500 Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation In-Reply-To: <4718FDA4.9080809@libero.it> References: <4718FDA4.9080809@libero.it> Message-ID: <4718FF47.40504@colorstudy.com> Manlio Perillo wrote: > The WSGI spec requires the response headers and sequence items to be, > respectively, List of Tuples and Strings. > > However only for the response headers it explicitly requires them to be > a Python List, i.e type(response_headers) is ListType. > > What about the other objects? > > In the current implementation of WSGI for Nginx I always use > Py[Type]_Check, and not Py[Type]_CheckExact. All of the types are required to be exactly as defined, not subclasses or None. But servers are not required to actually test this. wsgiref.validate does test for exactly these types, but it's acceptable for Nginx to just access the data without checking its exact type. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org From manlio_perillo at libero.it Fri Oct 19 21:43:39 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Fri, 19 Oct 2007 21:43:39 +0200 Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation In-Reply-To: <4718FF47.40504@colorstudy.com> References: <4718FDA4.9080809@libero.it> <4718FF47.40504@colorstudy.com> Message-ID: <471908EB.6050509@libero.it> Ian Bicking ha scritto: > Manlio Perillo wrote: >> The WSGI spec requires the response headers and sequence items to be, >> respectively, List of Tuples and Strings. >> >> However only for the response headers it explicitly requires them to >> be a Python List, i.e type(response_headers) is ListType. >> >> What about the other objects? >> >> In the current implementation of WSGI for Nginx I always use >> Py[Type]_Check, and not Py[Type]_CheckExact. 
> > All of the types are required to be exactly as defined, not subclasses > or None. But servers are not required to actually test this. > wsgiref.validate does test for exactly these types, but it's acceptable > for Nginx to just access the data without checking its exact type. > Ok, thanks. However it is not a problem to use Py[Type]_Check instead of Py[Type]_CheckExact (and it should not be slower), so if the types are required to be exactly as defined I think it is better to do the exact check. In mod_wsgi for Nginx I'm doing a lot of checks (as an example I even check if the write callable is called from within application iterable) Manlio Perillo From graham.dumpleton at gmail.com Sat Oct 20 11:24:56 2007 From: graham.dumpleton at gmail.com (Graham Dumpleton) Date: Sat, 20 Oct 2007 19:24:56 +1000 Subject: [Web-SIG] about the status line in WSGI In-Reply-To: <4718FA7D.1030202@libero.it> References: <4718ADBB.6030804@libero.it> <4718FA7D.1030202@libero.it> Message-ID: <88e286470710200224m6e799d73jc6f72d6c93e072ef@mail.gmail.com> FWIW, I have seen people want to use (mod_python didn't support it though), the description associated with a status so they could use different values for a 200 response as part of some strange web application testing framework. Graham On 20/10/2007, Manlio Perillo wrote: > Robert Brewer ha scritto: > > Manlio Perillo wrote: > >> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the > >> status line returned by the WSGI application, and to use a server > >> defined phrase? > > > > I would be sad if a WSGI gateway did that to me. > > Why deny a web > > application developer the right to control that part of the output? > > > > The WSGI spec requires a full status line as a simplification for the > WSGI Gateway and not to give more control to WSGI applications. 
> > > Regards Manlio Perillo From wilk at flibuste.net Mon Oct 22 18:09:55 2007 From: wilk at flibuste.net (William Dode) Date: Mon, 22 Oct 2007 16:09:55 +0000 (UTC) Subject: [Web-SIG] WebOb References: <46C3BCB9.6010708@colorstudy.com> Message-ID: Hi, Since Ian's announcement of WebOb, I have done two things with it. First, I included it in my personal web framework; it was very easy, I just had to remove all my crappy equivalent functions. It makes my framework a little cleaner, and I inherit new features. Second, and most important, I wanted to start a little project without any framework to minimize the dependencies. So I started from scratch with only WebOb, the wsgiref server and part of the example in the routing_args specification. I did it very quickly, and the result should be compatible with any WSGI-compliant pieces. So, don't you think web-sig should officially support such a library? Include it in the standard library or in a wsgiorg library? Waiting for your views... -- William Dodé - http://flibuste.net Informaticien indépendant I find it hard to write in English... please don't hesitate to give me some advice in private! From manlio_perillo at libero.it Mon Oct 22 18:47:52 2007 From: manlio_perillo at libero.it (Manlio Perillo) Date: Mon, 22 Oct 2007 18:47:52 +0200 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <471CD438.6080208@libero.it> William Dode wrote: > Hi, > > Since the announce of ian about webob, i did two things with it. > > First i include it in my personal web framework, it was very easy, i had > just to remove all my crappy equivalent functions. It make my framework > a little bit more clean and i can inherit new features.
> > Second, most important, i wanted to start a little project without any > framework to minimize the dependencies. So i started from scratch only > with WebOb, the wsgiref server and a part of the example in routing_args > specifications. It did it very quickly and the result should be > compatible with any wsgi compliant pieces. > > So, don't you think web-sig should officialy support such library ? > Include it in the lib stantard or in a wsgiorg library ? > > Waiting for your view... > I think that, first of all, we should standardize the utility functions for headers handling (parsing and serializing). Regards Manlio Perillo From fdrake at gmail.com Mon Oct 22 18:58:44 2007 From: fdrake at gmail.com (Fred Drake) Date: Mon, 22 Oct 2007 12:58:44 -0400 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <9cee7ab80710220958k79e0d77do34c55e8218a40889@mail.gmail.com> On 10/22/07, William Dode wrote: > So, don't you think web-sig should officialy support such library ? > Include it in the lib stantard or in a wsgiorg library ? I'm strongly against adding more non-Python-runtime batteries to the standard library. The plethora of packages already there makes updating individual libraries to get bug fixes or features quite painful. This has nothing to do with WebOb in particular; I've not had a chance to look at that yet. -Fred -- Fred L. Drake, Jr. "Chaos is the score upon which reality is written." --Henry Miller From guido at python.org Mon Oct 22 19:01:52 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 22 Oct 2007 10:01:52 -0700 Subject: [Web-SIG] WebOb In-Reply-To: <46C3BCB9.6010708@colorstudy.com> References: <46C3BCB9.6010708@colorstudy.com> Message-ID: 2007/8/15, Ian Bicking : > Lately I got on a kick and extracted/refined/reimplemented a bunch of > stuff from Paste. 
The result is the not-quite-released WebOb (I don't > want to do a release until I think people should use it instead of > Paste, to the degree the two overlap -- and it's not *quite* ready for > that). Cool. I already heard in the grapevibe about webob.py. > Anyway, I'd be interested in feedback. We've talked a little about a > shared request object -- only a little, and I don't know if it is really > a realistic goal to even try. But I think this request object is a > considerably higher quality than any other request objects out there. > The response object provides a nice symmetry, as well as facilitating > testing. And it's also a very nice response object. I may be totally behind the times here, but I've always found it odd to have separate request and response objects -- the functionalities or APIs don't really overlap, so why not have a single object? I'm really asking to be educated; I severely hope there's a better reason than "Java did it this way". :-) > They are both fairly reasonable to subclass, if there are minor naming > issues (if there's really missing features, I'd like to add them > directly -- though for the response object in particular it's likely > you'll want to subclass to give application defaults, like a default > content type). > > It's based strictly on WSGI, with the request object an almost-stateless > wrapper around a WSGI environment, and the response object a WSGI > application that contains mutable status/headers/app_iter. > > Almost all the defined HTTP headers are available as attributes on the > request and/or response. I try to parse these in as sensible a way as > possible, e.g., req.if_modified_since is a datetime object (of course > unparsed access is also available). Several objects like > response.cache_control are a bit more complex, since there's no data > structure that exactly represents them. I've tried to make them as easy > to use as possible for realistic web tasks. 
I'm interested in something that's as lightweight as possible. Are there things that take a reasonable time to parse that could be put off until first use? Perhaps using properties to keep the simplest possible API (or perhaps not, to emphasize the cost of first use)? > I'm very interested to get any feedback, especially right now when there > are no backward compatibility concerns. Right now no critique is too > large or small. > > It's in svn at: > http://svn.pythonpaste.org/Paste/WebOb/trunk > > And there are fairly complete docs at: > http://pythonpaste.org/webob/ I briefly looked at the tutorial and was put off a little by the interactive prompt style of the examples; that seems so unrealistic that I wonder if it wouldn't be better to just say "put this in a file and run it like this"? > A quick summary of differences in the API and some other > request/response objects out there: > http://pythonpaste.org/webob/differences.html > I'd include more frameworks, if you can point me to their > request/response API documentation (e.g., I looked but couldn't find any > for Zope 3). I'm not too familiar with other frameworks (having always hacked my own, as it's so easy :-). Any chance of a summary that's neither a tutorial nor a reference? > WebOb has a lot more methods and attributes than other libraries, but > this document points out only things where there are differing names or > things not in WebOb. Most other such objects also don't have the same > WSGI-oriented scope (with the exception of Yaro and paste.wsgiwrappers).
> > The Request and Response API (extracted docs): > http://pythonpaste.org/webob/class-webob.Request.html > http://pythonpaste.org/webob/class-webob.Response.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From adam at atlas.st Mon Oct 22 19:54:51 2007 From: adam at atlas.st (Adam Atlas) Date: Mon, 22 Oct 2007 13:54:51 -0400 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st> On 22 Oct 2007, at 12:09, William Dode wrote: > So, don't you think web-sig should officialy support such library ? > Include it in the lib stantard or in a wsgiorg library ? > I don't really like the idea of having something like this be part of the standard library; it's sort of neither here nor there between low- level WSGI and framework territory. I don't see people using something like WebOb to write their applications directly (nor does that seem to be the intention); just like Paste, it seems more like something that full frameworks would incorporate and provide access to. Given the principle of "there should be one, and preferably only one, obvious way to do it", it seems like putting this in the standard library would be an endorsement of it as the obvious/best way, and although I like the WebOb approach, I don't think there's enough of a consensus to bless it thus. For now, the multitude of web frameworks and their various philosophies is a good thing. From tseaver at palladion.com Mon Oct 22 19:29:17 2007 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 22 Oct 2007 13:29:17 -0400 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <471CDDED.9010208@palladion.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Guido van Rossum wrote: > Cool. I already heard in the grapevibe about webob.py. > >> Anyway, I'd be interested in feedback. 
We've talked a little about a >> shared request object -- only a little, and I don't know if it is really >> a realistic goal to even try. But I think this request object is a >> considerably higher quality than any other request objects out there. >> The response object provides a nice symmetry, as well as facilitating >> testing. And it's also a very nice response object. > > I may be totally behind the times here, but I've always found it odd > to have separate request and response objects -- the functionalities > or APIs don't really overlap, so why not have a single object? I'm > really asking to be educated; I severely hope there's a better reason > than "Java did it this way". :-) HTTP has both headers and payload supplied by the client and returned by the server: not mixing them up is probably the driving reason for keeping separate objects. Of course, you could make one object with 'request' and 'response' attributes, but that wouldn't really be a simplification. >> They are both fairly reasonable to subclass, if there are minor naming >> issues (if there's really missing features, I'd like to add them >> directly -- though for the response object in particular it's likely >> you'll want to subclass to give application defaults, like a default >> content type). >> >> It's based strictly on WSGI, with the request object an almost-stateless >> wrapper around a WSGI environment, and the response object a WSGI >> application that contains mutable status/headers/app_iter. >> >> Almost all the defined HTTP headers are available as attributes on the >> request and/or response. I try to parse these in as sensible a way as >> possible, e.g., req.if_modified_since is a datetime object (of course >> unparsed access is also available). Several objects like >> response.cache_control are a bit more complex, since there's no data >> structure that exactly represents them. I've tried to make them as easy >> to use as possible for realistic web tasks. 
> > I'm interesting in something that's as lightweight as possible. Are > there things that take a reasonable time to parse that could be put > off until first use? Perhaps using properties to keep the simplest > possible API (or perhaps not to emphasize the cost of first use)? The only big parsing load is going to be the request payload; processing top-level request headers is normally trivial, performance-wise. I read Ian's concern as being about an API for setting / updating cache-control response headers[1], because he found no natural mapping for them as Python primitives. [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9 Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHHN3s+gerLs4ltQ4RAjytAKCNejjJahOz2Q3seKpE4pcRiZ4TCQCgu+J2 FFeSFhO84s9n25M2p3d0VWQ= =szPr -----END PGP SIGNATURE----- From guido at python.org Mon Oct 22 21:10:54 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 22 Oct 2007 12:10:54 -0700 Subject: [Web-SIG] WebOb In-Reply-To: <20071022190527.GA15050@smullyan.org> References: <46C3BCB9.6010708@colorstudy.com> <20071022190527.GA15050@smullyan.org> Message-ID: Thanks! I stand educated. 2007/10/22, Jacob Smullyan : > On Mon, Oct 22, 2007 at 10:01:52AM -0700, Guido van Rossum wrote: > > 2007/8/15, Ian Bicking : > > I may be totally behind the times here, but I've always found it odd > > to have separate request and response objects -- the functionalities > > or APIs don't really overlap, so why not have a single object? I'm > > really asking to be educated; I severely hope there's a better reason > > than "Java did it this way". :-) > > I'm hardly in a position to educate you, but here are my two cents. 
> > The aging but pleasant framework I've used for years, SkunkWeb (which you > are free to think of as the amiable old drunk of the Python web development > world) has always had a single Connection object for that reason. However, > in skunkweb 4, I tossed it away and switched to using WebOb, because, > although I somewhat prefer the aesthetic elegance of having one object > rather than two, that preference is very slight, whereas Webob has many > other advantages -- to my mind it is superbly done and it would be pointless > to rewrite it -- and in fact I made request and response attributes of a > single context object, which I suspect many framework authors would do, so > instead of > > CONNECTION.requestHeaders # SkunkWeb 3 > > I now have > > Context.request.headers # SkunkWeb 4 > > which is fine by me. > > And there are cases when you might want a request or response without really > needing the other. For instance, what would be the point of having WebOb's > HTTPException classes, which are response subclasses, also be requests? And > middleware might not be interested at all in the response -- so why should > they deal with an object larded with response-specific attributes, and > possibly requiring those attributes to undergo initialization? (Well, there > isn't much initialization necessary, I suppose.) Not having to refer to > things at times you you don't care about them is an architectural good which > offsets to some degree the clumsiness of having two closely related things > rather than one when you care about them both. 
> > > Cheers, > > js > > -- > Jacob Smullyan > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Mon Oct 22 21:26:53 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 22 Oct 2007 14:26:53 -0500 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <471CF97D.6070808@colorstudy.com> Guido van Rossum wrote: >> Anyway, I'd be interested in feedback. We've talked a little about a >> shared request object -- only a little, and I don't know if it is really >> a realistic goal to even try. But I think this request object is a >> considerably higher quality than any other request objects out there. >> The response object provides a nice symmetry, as well as facilitating >> testing. And it's also a very nice response object. > > I may be totally behind the times here, but I've always found it odd > to have separate request and response objects -- the functionalities > or APIs don't really overlap, so why not have a single object? I'm > really asking to be educated; I severely hope there's a better reason > than "Java did it this way". :-) There are several headers that exist in both the request and the response. For instance, Content-Type, Content-Length, and Cache-Control. Additionally, a lot of headers aren't immediately obvious -- is Location a request or response header? Well, response, but if all the headers are mixed together it takes a bit of thought to realize that. The WebOb request and response are mostly representations of the HTTP messages, and there's two distinct messages which look very similar, which makes them hard to mix into one object. >> They are both fairly reasonable to subclass, if there are minor naming >> issues (if there's really missing features, I'd like to add them >> directly -- though for the response object in particular it's likely >> you'll want to subclass to give application defaults, like a default >> content type). 
>> >> It's based strictly on WSGI, with the request object an almost-stateless >> wrapper around a WSGI environment, and the response object a WSGI >> application that contains mutable status/headers/app_iter. >> >> Almost all the defined HTTP headers are available as attributes on the >> request and/or response. I try to parse these in as sensible a way as >> possible, e.g., req.if_modified_since is a datetime object (of course >> unparsed access is also available). Several objects like >> response.cache_control are a bit more complex, since there's no data >> structure that exactly represents them. I've tried to make them as easy >> to use as possible for realistic web tasks. > > I'm interesting in something that's as lightweight as possible. Are > there things that take a reasonable time to parse that could be put > off until first use? Perhaps using properties to keep the simplest > possible API (or perhaps not to emphasize the cost of first use)? Almost everything is a property. This is in part because state is kept in the native WSGI forms (environ, status, headers, app_iter), so everything is calculated off of these. It also makes instantiation relatively light. Even the request body is left alone until request.POST is accessed. >> I'm very interested to get any feedback, especially right now when there >> are no backward compatibility concerns. Right now no critique is too >> large or small. >> >> It's in svn at: >> http://svn.pythonpaste.org/Paste/WebOb/trunk >> >> And there are fairly complete docs at: >> http://pythonpaste.org/webob/ > > I briefly looked at the tutorial and was put off a little by the > interactive prompt style of the examples; that seems so unrealistic > that I wonder if it wouldn't be better to just say "put this in a file > and run it like this"? The side effect of doctesting is that docs sometimes look weird :-/ I'm not sure what form the docs should take. I'm open to suggestions. 
The extracted docs are actually reasonable as a reference, I think: http://pythonpaste.org/webob/class-webob.Request.html http://pythonpaste.org/webob/class-webob.Response.html For realistic use cases, some kind of infrastructure is necessary. I suppose a simple example using the wsgiref server and a plain WSGI app would suffice. Even a very small framework (e.g., http://svn.pythonpaste.org/Paste/apps/FlatAtomPub/trunk/flatatompub/dec.py) improves that considerably, but probably isn't worth introducing. >> A quick summary of differences in the API and some other >> request/response objects out there: >> http://pythonpaste.org/webob/differences.html >> I'd include more frameworks, if you can point me to their >> request/response API documentation (e.g., I looked but couldn't find any >> for Zope 3). > > I'm not too familiar with other frameworks (having always hacked my > own, as it's so easy :-). Any chance of a summary that's not a > tutorial nor a reference? Did you look at the file serving example? http://pythonpaste.org/webob/file-example.html I suppose a quick summary would also be possible, covering just the most important attributes and with a quick listing of others (like all the properties for the individual HTTP headers). -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org : Write code, do good : http://topp.openplans.org/careers From smulloni at smullyan.org Mon Oct 22 21:05:27 2007 From: smulloni at smullyan.org (Jacob Smullyan) Date: Mon, 22 Oct 2007 15:05:27 -0400 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> Message-ID: <20071022190527.GA15050@smullyan.org> On Mon, Oct 22, 2007 at 10:01:52AM -0700, Guido van Rossum wrote: > 2007/8/15, Ian Bicking : > I may be totally behind the times here, but I've always found it odd > to have separate request and response objects -- the functionalities > or APIs don't really overlap, so why not have a single object? 
I'm > really asking to be educated; I severely hope there's a better reason > than "Java did it this way". :-) I'm hardly in a position to educate you, but here are my two cents. The aging but pleasant framework I've used for years, SkunkWeb (which you are free to think of as the amiable old drunk of the Python web development world) has always had a single Connection object for that reason. However, in skunkweb 4, I tossed it away and switched to using WebOb, because, although I somewhat prefer the aesthetic elegance of having one object rather than two, that preference is very slight, whereas Webob has many other advantages -- to my mind it is superbly done and it would be pointless to rewrite it -- and in fact I made request and response attributes of a single context object, which I suspect many framework authors would do, so instead of CONNECTION.requestHeaders # SkunkWeb 3 I now have Context.request.headers # SkunkWeb 4 which is fine by me. And there are cases when you might want a request or response without really needing the other. For instance, what would be the point of having WebOb's HTTPException classes, which are response subclasses, also be requests? And middleware might not be interested at all in the response -- so why should they deal with an object larded with response-specific attributes, and possibly requiring those attributes to undergo initialization? (Well, there isn't much initialization necessary, I suppose.) Not having to refer to things at times you you don't care about them is an architectural good which offsets to some degree the clumsiness of having two closely related things rather than one when you care about them both. 
Cheers, js -- Jacob Smullyan From guido at python.org Mon Oct 22 21:40:18 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 22 Oct 2007 12:40:18 -0700 Subject: [Web-SIG] WebOb In-Reply-To: <471CF97D.6070808@colorstudy.com> References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> Message-ID: 2007/10/22, Ian Bicking : > > I briefly looked at the tutorial and was put off a little by the > > interactive prompt style of the examples; that seems so unrealistic > > that I wonder if it wouldn't be better to just say "put this in a file > > and run it like this"? > > The side effect of doctesting is that docs sometimes look weird :-/ Personally, I find doctest a great tool for writing tests in certain situations; not so great for writing docs though. > I'm not sure what form the docs should take. I'm open to suggestions. > The extracted docs are actually reasonable as a reference, I think: > > http://pythonpaste.org/webob/class-webob.Request.html > http://pythonpaste.org/webob/class-webob.Response.html Hm, these are mostly alphabetical listings of individual methods and properties. I'm still hoping for something that I can read from top to bottom in 10 minutes and get an idea of what this is and how to use it. > For realistic use cases, some kind of infrastructure is necessary. How realistic are we talking? I'm thinking of something that I can test by pointing my browser to localhost:8080 or similar. For CGI scripts, the standard library's CGIHTTPServer would suffice. How hard is it to create something similar for WSGI or for webob? > I suppose a simple example using the wsgiref server and a plain WSGI app > would suffice. Even a very small framework (e.g., > http://svn.pythonpaste.org/Paste/apps/FlatAtomPub/trunk/flatatompub/dec.py) > improves that considerably, but probably isn't worth introducing. It's hard to judge that code since it has zero documentation. 
I was looking more for something that has a main() which is called when invoked as a script. > >> A quick summary of differences in the API and some other > >> request/response objects out there: > >> http://pythonpaste.org/webob/differences.html > >> I'd include more frameworks, if you can point me to their > >> request/response API documentation (e.g., I looked but couldn't find any > >> for Zope 3). > > > > I'm not too familiar with other frameworks (having always hacked my > > own, as it's so easy :-). Any chance of a summary that's not a > > tutorial nor a reference? > > Did you look at the file serving example? > http://pythonpaste.org/webob/file-example.html That's the first thing I looked at, and that prompted my comments above. :-) > I suppose a quick summary would also be possible, covering just the most > important attributes and with a quick listing of others (like all the > properties for the individual HTTP headers). Yes please. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ianb at colorstudy.com Mon Oct 22 23:39:54 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Mon, 22 Oct 2007 16:39:54 -0500 Subject: [Web-SIG] WebOb In-Reply-To: <83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st> References: <46C3BCB9.6010708@colorstudy.com> <83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st> Message-ID: <471D18AA.8030702@colorstudy.com> Adam Atlas wrote: > On 22 Oct 2007, at 12:09, William Dode wrote: > >> So, don't you think web-sig should officialy support such library ? >> Include it in the lib stantard or in a wsgiorg library ? >> > > I don't really like the idea of having something like this be part of > the standard library; it's sort of neither here nor there between > low-level WSGI and framework territory.
I don't see people using > something like WebOb to write their applications directly (nor does > that seem to be the intention); just like Paste, it seems more like > something that full frameworks would incorporate and provide access to. I am certainly not representative of a normal developer, but I have been using it quite successfully without any framework. It also provides most of the functionality of WebTest, a framework-neutral functional testing tool, as another example. > Given the principle of "there should be one, and preferably only one, > obvious way to do it", it seems like putting this in the standard > library would be an endorsement of it as the obvious/best way, and > although I like the WebOb approach, I don't think there's enough of a > consensus to bless it thus. For now, the multitude of web frameworks > and their various philosophies is a good thing. After actually reading the APIs of the different request objects and summarizing the differences, I feel much less like this. All the major frameworks (and almost all the minor frameworks) have request and response objects with a subset of the same properties, and some slightly different names. The only really substantial exceptions are Zope and CherryPy that have a bunch of traversal-related properties and methods; but even these have some parallels in WebOb. I've also tried to avoid gratuitous incompatibilities with other frameworks, and to allow backward compatibility through subclassing when there are API differences. There's still some tricky details -- for instance, Django uses a different multi-value dictionary API than WebOb uses. Which is the kind of thing that makes me wish *some* multi-value dictionary API existed in the standard library that could serve as a reasonable model. But so it goes. Even there I switched around WebOb some to be closer to Django (to prefer the last value over the first value, when getting a single value when multiple values are available). 
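The "last value wins" behaviour mentioned above can be sketched with a tiny multi-value mapping. This is an illustrative assumption only: `LastValueMultiDict` and its method names are hypothetical and do not reproduce the actual WebOb or Django classes.

```python
class LastValueMultiDict:
    """Hypothetical multi-value mapping: plain lookup returns the last value."""

    def __init__(self, pairs=()):
        # Keep every (key, value) pair in arrival order, duplicates included.
        self._items = list(pairs)

    def add(self, key, value):
        # Append rather than overwrite, so earlier values stay available.
        self._items.append((key, value))

    def getall(self, key):
        # Return every value stored under the key, in arrival order.
        return [v for k, v in self._items if k == key]

    def __getitem__(self, key):
        values = self.getall(key)
        if not values:
            raise KeyError(key)
        # Single-value access prefers the last value over the first.
        return values[-1]


# A query string such as ?tag=wsgi&page=2&tag=webob would yield these pairs.
params = LastValueMultiDict([("tag", "wsgi"), ("page", "2"), ("tag", "webob")])
last_tag = params["tag"]
all_tags = params.getall("tag")
```

Here `params["tag"]` yields the most recently supplied value, while `getall` still exposes every value for callers that need them.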
As for actual consensus, Pylons is committed to using it and TurboGears by association. Jacob Kaplan-Moss and Simon Willison have expressed specific interest in the idea for Django, though I don't think they've had the time to analyze what that would mean specifically. Jacob Smullyan is also using it as we've heard, and I've heard of some other smaller/internal frameworks using it. That's not consensus, but I think it points to the possibility of consensus. As to the standard library, I don't know, there's a lot of issues with its development model. WebOb, unlike a framework, actually *could* match the kind of slow and steady progress that the standard library has. But the stdlib might be a bad target even so. -- Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org : Write code, do good : http://topp.openplans.org/careers From MDiPierro at cti.depaul.edu Tue Oct 23 06:42:48 2007 From: MDiPierro at cti.depaul.edu (Massimo Di Pierro) Date: Mon, 22 Oct 2007 23:42:48 -0500 Subject: [Web-SIG] Gluon again Message-ID: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu> I posted a Gluon tutorial here: http://mdp.cti.depaul.edu/examples/static/cookbook.pdf It shows step by step how to build a web app to store recipes and group them by category. It is a first draft, so there may be some English mistakes and some typos. Sorry. Massimo P.S. I can't stress it enough. Gluon is GPL2, it is not a commercial product. The reason I am emailing you about this is that I know I can find experts here, and I hope you can help me find bugs so that I can fix them and improve Gluon. If there is functionality that you need and you think is not there, just let me know and I will see what I can do. I would also love to see an Ajax enthusiast take the challenge to write the first Ajax app using Gluon, scriptaculous and JSON. I do provide some free email support if you sign up on the Gluon Google group.
From std3rr at gmail.com Tue Oct 23 06:57:24 2007 From: std3rr at gmail.com (Joshua Simpson) Date: Mon, 22 Oct 2007 21:57:24 -0700 Subject: [Web-SIG] Gluon again In-Reply-To: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu> References: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu> Message-ID: <3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com> On 10/22/07, Massimo Di Pierro wrote: > it shows step by step how to build a web app to store recipes and > group them by category. > It is a first draft so there are may be some english some typos. Sorry. I'm going to check this out. Are you from a primarily C background? Your builtin functions look, at least in naming convention, suspiciously like macros. Your controller design seems to borrow heavily from Django, but I suppose that's a good thing. Cheers, I always like to look at new frameworks. Josh -- - http://stderr.ws/ From ianb at colorstudy.com Tue Oct 23 07:45:35 2007 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 23 Oct 2007 00:45:35 -0500 Subject: [Web-SIG] WebOb In-Reply-To: References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> Message-ID: <471D8A7F.2050405@colorstudy.com> Guido van Rossum wrote: > 2007/10/22, Ian Bicking : >>> I briefly looked at the tutorial and was put off a little by the >>> interactive prompt style of the examples; that seems so unrealistic >>> that I wonder if it wouldn't be better to just say "put this in a file >>> and run it like this"? >> The side effect of doctesting is that docs sometimes look weird :-/ > > Personally, I find doctest a great tool for writing tests in certain > situations; not so great for writing docs though. Yeah... I really like it in a lot of ways, but I'm not quite sure what the right balance is.
Untested documentation is also very unfortunate; too much potential for drift.

>> I'm not sure what form the docs should take. I'm open to suggestions.
>> The extracted docs are actually reasonable as a reference, I think:
>>
>> http://pythonpaste.org/webob/class-webob.Request.html
>> http://pythonpaste.org/webob/class-webob.Response.html
>
> Hm, these are mostly alphabetical listings of individual methods and
> properties. I'm still hoping for something that I can read from top to
> bottom in 10 minutes and get an idea of what this is and how to use
> it.

I redid the front page to make it more brief: http://pythonpaste.org/webob/

I stopped with the example, because I couldn't think of a good example. Maybe a different evening. Suggestions of course welcome.

>> For realistic use cases, some kind of infrastructure is necessary.
>
> How realistic are we talking? I'm thinking of something that I can
> test by pointing my browser to localhost:8080 or similar. For CGI
> scripts, the standard library's CGIHTTPServer would suffice. How hard
> is it to create something similar for WSGI or for webob?

Well, some kind of WSGI adapter; the wsgiref one is fine. The file example I guess is boring, because without some kind of dispatch you can only serve up one file. A most boring server.

Wiki is a common example, but a little too common at this point. WebOb doesn't offer anything for HTML either, so it would be a somewhat unsatisfying example anyway I suspect.
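
To make the "some kind of dispatch" point concrete, here is a minimal sketch that uses only the standard library's wsgiref rather than WebOb. The PAGES table and the call() and serve() helpers are hypothetical names invented for this illustration, and the "files" are in-memory byte strings so the example stays self-contained.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults

# Hypothetical in-memory "files"; a real server would read them from disk.
PAGES = {
    "/": b"index page\n",
    "/about": b"about page\n",
}


def app(environ, start_response):
    """Dispatch on PATH_INFO so that more than one resource can be served."""
    body = PAGES.get(environ.get("PATH_INFO") or "/")
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]


def call(path):
    """Invoke the app directly, without opening a network socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    seen = []
    body = b"".join(app(environ, lambda status, headers: seen.append(status)))
    return seen[0], body


def serve(port=8080):
    """Run the app on localhost; blocks until interrupted."""
    make_server("", port, app).serve_forever()
```

Calling serve() and pointing a browser at localhost:8080 gives the kind of quick local test asked about above; call("/about") exercises the same code without a server.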
--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

From jim at zope.com Tue Oct 23 13:11:53 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 23 Oct 2007 07:11:53 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <471D8A7F.2050405@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> <471D8A7F.2050405@colorstudy.com>
Message-ID:

> I redid the front page to make it more brief:
> http://pythonpaste.org/webob/

I suggest a paragraph saying what WebOb is, including what problem it is trying to solve. I'd find this interesting as it is not at all clear to me.

Jim

--
Jim Fulton
Zope Corporation

From MDiPierro at cti.depaul.edu Tue Oct 23 15:32:56 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Tue, 23 Oct 2007 08:32:56 -0500
Subject: [Web-SIG] Gluon again
In-Reply-To: <3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com>
References: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu> <3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com>
Message-ID: <62AB0C0A-64D0-4E5A-BD69-A689774E9B85@cti.depaul.edu>

You are probably referring to the fact that validators and helpers are upper case. That is because they are not functions but objects. In fact, validators have an internal state (the parameters for the validation, the translated error messages, etc.) and helpers have an internal state (because they are aware of the form they may contain, their variables and their errors). Example:

a=FORM(TABLE(TR(TD(INPUT(_name='field',requires=IS_NOT_EMPTY()))),
             TR(TD(INPUT(_type='submit')))))
if a.accepts(request.vars,session): ....
if a.errors: ...

At its fundamental level I tried to make Gluon similar to Django, for two reasons. I know Django (I taught a class on Django here at DePaul) and I liked it, but I found it has too many functions and too many modules to remember. So I decided to follow a "convention over configuration" approach a la RoR. In Gluon you do not need to import Gluon's modules in your code, nor do you need to explicitly call the template renderer, for example. Same logic as Django but simpler to use, I believe.

Massimo

To answer your first question: I teach computer science, mostly numerical applications to science and finance, occasionally networking stuff and security. You can say I came from a C++ background. My most important work is fermiqcd, a C++ library for parallel lattice quantum chromodynamics.

On Oct 22, 2007, at 11:57 PM, Joshua Simpson wrote:
> On 10/22/07, Massimo Di Pierro wrote:
>
> it shows step by step how to build a web app to store recipes and
> group them by category.
> It is a first draft, so there may be some English errors and typos.
> Sorry.
>
> I'm going to check this out. Are you from a primarily C
> background? Your builtin functions look, at least in naming
> convention, suspiciously like macros. Your controller design seems
> to borrow heavily from Django, but I suppose that's a good thing.
>
> Cheers, I always like to look at new frameworks.
>
> Josh
>
> --
> -
> http://stderr.ws/

From guido at python.org Tue Oct 23 16:01:46 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 23 Oct 2007 07:01:46 -0700
Subject: [Web-SIG] WebOb
In-Reply-To: <471D8A7F.2050405@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> <471D8A7F.2050405@colorstudy.com>
Message-ID:

2007/10/22, Ian Bicking:
> I redid the front page to make it more brief: http://pythonpaste.org/webob/

Much better; I'll try to review it in more detail later. Right now a few details jump off the page to me: GET and POST are verbs and IMO poor names for what they represent; params is usually called query (isn't it?); and what's the advantage of using Request.blank() instead of simply Request()?

> I stopped with the example, because I couldn't think of a good example.
> Maybe a different evening. Suggestions of course welcome.
>
> >> For realistic use cases, some kind of infrastructure is necessary.
> >
> > How realistic are we talking? I'm thinking of something that I can
> > test by pointing my browser to localhost:8080 or similar. For CGI
> > scripts, the standard library's CGIHTTPServer would suffice. How hard
> > is it to create something similar for WSGI or for webob?
>
> Well, some kind of WSGI adapter; the wsgiref one is fine. The file
> example I guess is boring, because without some kind of dispatch you can
> only serve up one file. A most boring server.
>
> Wiki is a common example, but a little too common at this point. WebOb
> doesn't offer anything for HTML either, so it would be a somewhat
> unsatisfying example anyway I suspect.

The file-serving example has several shortcomings: the presentation order seems odd, some things are introduced without explanation of what or why. (Why is UTC imported? Why is mimetypes imported twice? Why bother with calculating the mime-type at all in the first example?) Towards the end it seems to go into too many details of serving up conditional responses and file ranges, which seem better suited for an advanced manual.

I suggest the wiki-in-one-page would be a better example, even if you consider it too common (serving static files isn't common? :-).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tseaver at palladion.com Tue Oct 23 16:14:47 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 23 Oct 2007 10:14:47 -0400
Subject: [Web-SIG] WebOb
In-Reply-To:
References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> <471D8A7F.2050405@colorstudy.com>
Message-ID: <471E01D7.2050605@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Guido van Rossum wrote:
> 2007/10/22, Ian Bicking:
>> I redid the front page to make it more brief: http://pythonpaste.org/webob/
>
> Much better; I'll try to review it in more detail later. Right now a
> few details jump off the page to me: GET and POST are verbs and IMO
> poor names for what they represent;

Just MHO: I don't find them that confusing. Would names like 'GET_data' and 'POST_data' be clearer? Coming from Zope land, I'm not used to caring about the distinction between GET and POST (for purposes of examining the parameters passed in the request), so I'd probably use 'params' instead.

> params is usually called query (isn't it?);

Depends on what you mean by "usually": in Zope, this is called 'form', and it represents either the parsed query string (for GET requests) or the parsed form data from the payload (for POST requests).

> and what's the advantage of using Request.blank() instead
> of simply Request()?

'blank' represents an unusual case: fabricating a request object without having a WSGI-compliant environment dict already in hand. I kind of like simplifying the "mainline" case (__init__ doesn't have to sniff whether you passed an environment or not: you get a TypeError if you try).
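
The distinction under discussion (query-string variables, body variables, and a merged view) can be sketched with the standard library alone. This is an illustration under stated assumptions, not WebOb's or Zope's implementation: the function names query_vars, post_vars, params and fake_post are hypothetical, and only urlencoded bodies are handled.

```python
import io
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def query_vars(environ):
    """Variables from the URL query string; these can accompany any method."""
    return parse_qs(environ.get("QUERY_STRING", ""))


def post_vars(environ):
    """Variables from a urlencoded request body (only read for POST here)."""
    if environ.get("REQUEST_METHOD") != "POST":
        return {}
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    return parse_qs(body.decode("utf-8"))


def params(environ):
    """Merged view: query-string values first, then body values."""
    merged = {}
    for source in (query_vars(environ), post_vars(environ)):
        for name, values in source.items():
            merged.setdefault(name, []).extend(values)
    return merged


def fake_post(query, body):
    """Build a WSGI environ from scratch for a POST, for testing only."""
    environ = {}
    setup_testing_defaults(environ)
    environ["REQUEST_METHOD"] = "POST"
    environ["QUERY_STRING"] = query
    environ["CONTENT_TYPE"] = "application/x-www-form-urlencoded"
    environ["CONTENT_LENGTH"] = str(len(body))
    environ["wsgi.input"] = io.BytesIO(body)
    return environ
```

Note that the body stream can only be read once in this sketch, so each check needs a fresh environ; a request object that caches the parsed values avoids that limitation.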

- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHHgHX+gerLs4ltQ4RAgJtAKCR4s2LFi/Nb4aYgF/aLilwa+PvnwCaAxpI
BsTZMtcoY+NJpI3EQ/RkBKg=
=RQSZ
-----END PGP SIGNATURE-----

From wilk at flibuste.net Tue Oct 23 16:45:55 2007
From: wilk at flibuste.net (William Dode)
Date: Tue, 23 Oct 2007 14:45:55 +0000 (UTC)
Subject: [Web-SIG] WebOb
References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> <471D8A7F.2050405@colorstudy.com>
Message-ID:

On 23-10-2007, Ian Bicking wrote:
> I redid the front page to make it more brief:
> http://pythonpaste.org/webob/

Fine. I had to use it to understand what the benefit of webob is; the examples were not very clear on first read. The yaro page was clearer to me, for example.

>
> I stopped with the example, because I couldn't think of a good example.
> Maybe a different evening. Suggestions of course welcome.

The problem will be to be practical without looking like 'yet another framework'!

I liked your do-it-yourself-framework. Maybe a webob-only version?

Each example should run on its own with copy-paste and wsgiref as the server.

Without webob:
--------------

import wsgiref.simple_server

def app(environ, start_response):
    start_response('200 OK', [('content-type', 'text/html')])
    return ['Hello world!']

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

With webob:
-----------

import wsgiref.simple_server
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    res.body = 'Hello world!'
    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

With form :
-----------

import wsgiref.simple_server
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    you = req.params.get('you')
    if you:
        res.body_file.write('Hello %s' % you)
    # The archive stripped the original HTML; this minimal form is a
    # reconstruction that posts a field named 'you'.
    res.body_file.write('''
<form method="post">
Who are you ? <input type="text" name="you">
<input type="submit">
</form>
''')
    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

with form and cookies :
-----------------------

import wsgiref.simple_server
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    you_cookie = req.cookies.get('you')
    if you_cookie:
        # The <br> is a reconstruction; the archive left only a line break.
        res.body_file.write('I recognize you %s<br>' % you_cookie)
    you = req.params.get('you', you_cookie)
    if you:
        res.body_file.write('Hello %s' % you)
        res.set_cookie('you', you)
    # Form markup reconstructed, as in the previous example.
    res.body_file.write('''
<form method="post">
Who are you ? <input type="text" name="you">
<input type="submit">
</form>
''')
    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

--
William Dodé - http://flibuste.net
Independent IT consultant

From ianb at colorstudy.com Tue Oct 23 19:33:26 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Oct 2007 12:33:26 -0500
Subject: [Web-SIG] WebOb
In-Reply-To:
References: <46C3BCB9.6010708@colorstudy.com> <471CF97D.6070808@colorstudy.com> <471D8A7F.2050405@colorstudy.com>
Message-ID: <471E3066.6050200@colorstudy.com>

Guido van Rossum wrote:
> 2007/10/22, Ian Bicking:
>> I redid the front page to make it more brief: http://pythonpaste.org/webob/
>
> Much better; I'll try to review it in more detail later. Right now a
> few details jump off the page to me: GET and POST are verbs and IMO
> poor names for what they represent;

I generally agree, and initially they were named queryvars and postvars. But I provided GET and POST aliases for compatibility with both Pylons and Django, and then I kind of decided that though they are technically incorrect (e.g., GET variables are really query string variables, and can be present in POST requests) it wasn't worth the ambiguity of aliases, and I didn't want to just change the names.

> params is usually called query (isn't it?);

I'm not aware of any particular convention for this. In Django it's request.REQUEST, in Werkzeug it is req.values, in Webware it was accessed with request.value(name), and I believe CherryPy uses request.params. So there isn't any convention that I know of.

> and what's the advantage of using Request.blank() instead
> of simply Request()?

As Tres said, it creates a request from scratch, building the WSGI dictionary. I use it for testing and potentially for artificial requests or subrequests (though subrequests usually work better with request.copy_get()). When you are serving an application the WSGI environment will always come from the WSGI server.

>> I stopped with the example, because I couldn't think of a good example.
>> Maybe a different evening. Suggestions of course welcome.
>>
>>>> For realistic use cases, some kind of infrastructure is necessary.
>>> How realistic are we talking? I'm thinking of something that I can
>>> test by pointing my browser to localhost:8080 or similar. For CGI
>>> scripts, the standard library's CGIHTTPServer would suffice. How hard
>>> is it to create something similar for WSGI or for webob?
>> Well, some kind of WSGI adapter; the wsgiref one is fine. The file
>> example I guess is boring, because without some kind of dispatch you can
>> only serve up one file. A most boring server.
>>
>> Wiki is a common example, but a little too common at this point. WebOb
>> doesn't offer anything for HTML either, so it would be a somewhat
>> unsatisfying example anyway I suspect.
>
> The file-serving example has several shortcomings: the presentation
> order seems odd, some things are introduced without explanation of
> what or why. (Why is UTC imported? Why is mimetypes imported twice?
> Why bother with calculating the mime-type at all in the first
> example?) Towards the end it seems to go into too many details of
> serving up conditional responses and file ranges, which seem better
> suited for an advanced manual.
>
> I suggest the wiki-in-one-page would be a better example, even if you
> consider it too common (serving static files isn't common? :-).

But I love static files!

I wonder if there's an interesting piece of middleware I could do -- WebOb makes middleware much easier IMHO. Of course, it's only interesting if you have something on the other end of your middleware. Maybe a backend app that serves files and knows GET and PUT, and then middleware that turns it into a wiki? Or is that too clever? Authentication middleware with a login page? Maybe too meta.
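
For readers unfamiliar with the term, WSGI middleware is a callable that wraps another WSGI application. The following is a toy sketch in plain WSGI, without WebOb and without a login page; the RequireUser class, the X-User header it checks, and the call() helper are all hypothetical and chosen only to show the wrapping pattern. Trusting a client-supplied header is not real authentication.

```python
from wsgiref.util import setup_testing_defaults


def inner_app(environ, start_response):
    """Stand-in for the application being wrapped."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    user = environ.get("REMOTE_USER", "nobody")
    return [("hello %s\n" % user).encode("utf-8")]


class RequireUser:
    """Middleware: refuse requests that lack the (hypothetical) X-User header."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        user = environ.get("HTTP_X_USER")
        if not user:
            start_response("401 Unauthorized", [("Content-Type", "text/plain")])
            return [b"login required\n"]
        # Pass the identity on to the wrapped application.
        environ["REMOTE_USER"] = user
        return self.app(environ, start_response)


def call(app, extra=None):
    """Invoke a WSGI app directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(extra or {})
    seen = []
    body = b"".join(app(environ, lambda status, headers: seen.append(status)))
    return seen[0], body
```

The wrapped object RequireUser(inner_app) is itself a WSGI application, so it can be handed to any WSGI server or wrapped again by further middleware.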

--
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

From MDiPierro at cti.depaul.edu Tue Oct 30 06:26:00 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Tue, 30 Oct 2007 00:26:00 -0500
Subject: [Web-SIG] wsgi?
Message-ID:

I am trying to use Gluon with Apache and mod_wsgi. This is how Gluon starts now, using Paste httpserver (serve):

from paste.httpserver import serve

def main(ip='127.0.0.1',port=8000):
    serve(wsgibase,server_version="Something", host=ip, port=str(port))

I am not looking for an explanation; I can figure it out myself, it is the time that is lacking. I am looking for a WSGI expert who is interested in Gluon and is willing to try setting it up with mod_wsgi and submit one page of documentation on how to do it, in exchange for a lousy acknowledgment on the Gluon web site.

Massimo
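
As a rough starting point for the question above: mod_wsgi does not call a main() function; it imports a script file and, by default, looks for a module-level callable named "application". A minimal sketch of such a script follows. The file name gluon_wsgi.py is hypothetical, and the wsgibase defined here is only a stand-in so the example is self-contained; in practice it would be imported from Gluon.

```python
# Hypothetical gluon_wsgi.py, the script file Apache would be pointed at.


def wsgibase(environ, start_response):
    """Stand-in for Gluon's real WSGI callable of the same name."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Gluon would answer here\n"]


# mod_wsgi looks for a module-level callable named "application" by default.
application = wsgibase
```

Apache is then told about the script with mod_wsgi's WSGIScriptAlias directive, which maps a URL prefix to the path of the script file. Details such as working directory and process options depend on the deployment and are not covered by this sketch.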