From manlio_perillo at libero.it  Mon Oct  1 17:47:49 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 01 Oct 2007 17:47:49 +0200
Subject: [Web-SIG] hop-by-hop headers handling
Message-ID: <470116A5.7010807@libero.it>

Hi, I have another question about error handling.

The WSGI spec only says that applications *must* not generate hop-by-hop
headers, but says nothing on how a WSGI server should handle them.

In the previous version of nginx mod_wsgi I just ignored these headers,
but in the latest revisions, I raise an exception.


Thanks   Manlio Perillo

From manlio_perillo at libero.it  Tue Oct  2 21:30:46 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 21:30:46 +0200
Subject: [Web-SIG] Multiple message-header fields handling
Message-ID: <47029C66.5090408@libero.it>

The HTTP 1.1 protocol (section 4.2) says that:
"""Multiple message-header fields with the same field-name MAY be 
present in a message if and only if the entire field-value for that 
header field is defined as a comma-separated list [i.e., #(values)]."""

This can happen, as an example, with the Cookie header.

My question is: how should this be handled in WSGI?

As an example, Nginx stores all the headers in an associative array, 
where, of course, only the "last seen" header appears.

However, common multiple message-headers are stored in the request struct.

Since the WSGI environment is a dictionary with keys and values of type 
str, should an implementation:
"""combine the multiple header fields into one "field-name: field-value" 
pair, without changing the semantics of the message, by appending each 
subsequent field-value to the first, each separated by a comma."""
?

Nginx does not do this (and I don't know what Apache does).


Another question: when a header has an empty field value, what should 
be set in the environment: an empty string or None?


Thanks  Manlio Perillo

From pje at telecommunity.com  Tue Oct  2 21:44:05 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Oct 2007 15:44:05 -0400
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <47029C66.5090408@libero.it>
References: <47029C66.5090408@libero.it>
Message-ID: <20071002194130.A118D3A407A@sparrow.telecommunity.com>

At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote:
>The HTTP 1.1 protocol (section 4.2) says that:
>"""Multiple message-header fields with the same field-name MAY be
>present in a message if and only if the entire field-value for that
>header field is defined as a comma-separated list [i.e., #(values)]."""
>
>This can happen, as an example, with the Cookie header.
>
>My question is: how should this be handled in WSGI?
>
>As an example Nginx stores all the headers in a associative array,
>where, of course, only the "last seen" headers appears.
>
>However common multiple message-headers are stored in the request struct.
>
>Since the WSGI environment is a dictionary with keys and values of type
>str, should an implementation:
>"""combine the multiple header fields into one "field-name: field-value"
>pair, without changing the semantics of the message, by appending each
>subsequent field-value to the first, each separated by a comma."""
>?

If that's the only way to make the headers work, then the server may do so.


>Another question: when an header has an empty field value, what should
>be set in the environment: an empty string or None?

If a value exists in the environ, it *must* be a string -- never 
None.  And if the header exists, then a value should be in the 
environ.  Therefore, it should be an empty string.


From pje at telecommunity.com  Tue Oct  2 21:45:29 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Oct 2007 15:45:29 -0400
Subject: [Web-SIG] hop-by-hop headers handling
In-Reply-To: <470116A5.7010807@libero.it>
References: <470116A5.7010807@libero.it>
Message-ID: <20071002194251.F019C3A407C@sparrow.telecommunity.com>

At 05:47 PM 10/1/2007 +0200, Manlio Perillo wrote:
>Hi, I have another question with error handling.
>
>The WSGI spec only says that applications *must* not generate hop-by-hop
>headers, but says nothing on how a WSGI server should handle them.
>
>In the previous version of nginx mod_wsgi I just ignored these headers,
>but in the latest revisions, I raise an exception.

Raising an exception is indeed preferable.
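
A minimal sketch of what "raise an exception" could look like in a gateway, 
assuming the check runs inside the gateway's start_response. The function 
name check_response_headers and the choice of ValueError are hypothetical; 
is_hop_by_hop comes from the standard library's wsgiref.util.

```python
from wsgiref.util import is_hop_by_hop


def check_response_headers(headers):
    """Reject hop-by-hop headers supplied by the application.

    `headers` is the list of (name, value) tuples passed to start_response.
    Raising ValueError is this sketch's choice; the spec does not name one.
    """
    for name, value in headers:
        if is_hop_by_hop(name):
            raise ValueError(
                "application must not set hop-by-hop header %r" % name)
    return headers
```

A gateway would call this before storing the headers, so the error surfaces 
in the application's own call to start_response.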


From alex at puddlejumper.foxybanana.com  Tue Oct  2 21:50:21 2007
From: alex at puddlejumper.foxybanana.com (Alex Botero-Lowry)
Date: Tue, 2 Oct 2007 12:50:21 -0700
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <47029C66.5090408@libero.it>
References: <47029C66.5090408@libero.it>
Message-ID: <20071002195021.GA6658@puddlejumper.foxybanana.com>

On Tue, Oct 02, 2007 at 09:30:46PM +0200, Manlio Perillo wrote:
> The HTTP 1.1 protocol (section 4.2) says that:
> """Multiple message-header fields with the same field-name MAY be 
> present in a message if and only if the entire field-value for that 
> header field is defined as a comma-separated list [i.e., #(values)]."""
> 
> This can happen, as an example, with the Cookie header.
> 
> My question is: how should this be handled in WSGI?
> 
> As an example Nginx stores all the headers in a associative array, 
> where, of course, only the "last seen" headers appears.
> 
> However common multiple message-headers are stored in the request struct.
> 
Initially I used such a solution (cookies was a special property in the response
object), but I ended up just throwing together a custom dict that looks like:

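class ResponseHeaders(dict):
    """Dict that accumulates repeated header names instead of overwriting."""

    def __setitem__(self, item, val):
        if item in self:
            # Header already present: collect the values in a list.
            iv = self[item]
            if isinstance(iv, list):
                iv.append(val)
            else:
                iv = [iv, val]
            dict.__setitem__(self, item, iv)
        else:
            dict.__setitem__(self, item, val)

    def replace(self, key, val):
        # Overwrite any value(s) already stored for this header.
        dict.__setitem__(self, key, val)

    def items(self):
        # Flatten to one (name, value) pair per value, for start_response.
        ret = []
        for k, v in dict.items(self):
            if isinstance(v, list):
                ret.extend([(k, a) for a in v])
            else:
                ret.append((k, v))
        return ret

    def iteritems(self):
        return iter(self.items())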

It's really intended for passing the headers on to start_response, and for
getting the headers into it, rather than for reading from it, which is fine
99% of the time. I recently had to add replace since I had a situation where
I needed to overwrite a preset header, but other than that it serves me well.

Alex

From fumanchu at aminus.org  Tue Oct  2 21:47:57 2007
From: fumanchu at aminus.org (Robert Brewer)
Date: Tue, 2 Oct 2007 12:47:57 -0700
Subject: [Web-SIG] Multiple message-header fields handling
References: <47029C66.5090408@libero.it>
Message-ID: <F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>

Manlio Perillo wrote:
> The HTTP 1.1 protocol (section 4.2) says that:
> """Multiple message-header fields with the same field-name MAY be 
> present in a message if and only if the entire field-value for that 
> header field is defined as a comma-separated list [i.e., #(values)]."""
> 
> This can happen, as an example, with the Cookie header.
> 
> My question is: how should this be handled in WSGI?
> 
> As an example Nginx stores all the headers in a associative array, 
> where, of course, only the "last seen" headers appears.
> 
> However common multiple message-headers are stored in the request struct.
> 
> Since the WSGI environment is a dictionary with keys and values of type 
> str, should an implementation:
> """combine the multiple header fields into one "field-name: field-value" 
> pair, without changing the semantics of the message, by appending each 
> subsequent field-value to the first, each separated by a comma."""
> ?

Yes, it should. As you note, it's part of the HTTP spec that such headers
can be combined without changing the semantics. Here's a list of the
headers that need to be folded:

comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
    'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
    'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
    'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
    'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
    'WWW-AUTHENTICATE']

The only tricky one is Cookie, because e.g. Konqueror sends them on
multiple lines, but they're not foldable.

See http://kristol.org/cookie/errata.html

> Ngins does not do this (and I don't know what Apache does).
> 
> 
> Another question: when an header has an empty field value, what should 
> be set in the environment: an empty string or None?

An empty string, or omit them entirely:

"""The following variables must be present, unless their value would
be an empty string, in which case they may be omitted, except as
otherwise noted below...

HTTP_ Variables
""".


Robert Brewer
fumanchu at aminus.org

From manlio_perillo at libero.it  Tue Oct  2 22:03:50 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 22:03:50 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <47029C66.5090408@libero.it>
References: <47029C66.5090408@libero.it>
Message-ID: <4702A426.30109@libero.it>

Manlio Perillo ha scritto:
> [...]
> As an example Nginx stores all the headers in a associative array, 
> where, of course, only the "last seen" headers appears.
> 

A correction: Nginx stores "raw" headers in a list of key/value pairs, 
and not in an associative array.

This means that when I iterate over the headers, I see all the multiple 
message-headers, but I only store the last header in the WSGI environment.

 > [...]


Regards  Manlio Perillo

From manlio_perillo at libero.it  Tue Oct  2 22:11:40 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 22:11:40 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <20071002194130.A118D3A407A@sparrow.telecommunity.com>
References: <47029C66.5090408@libero.it>
	<20071002194130.A118D3A407A@sparrow.telecommunity.com>
Message-ID: <4702A5FC.8030000@libero.it>

Phillip J. Eby ha scritto:
> At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote:
>> The HTTP 1.1 protocol (section 4.2) says that:
>> """Multiple message-header fields with the same field-name MAY be
>> present in a message if and only if the entire field-value for that
>> header field is defined as a comma-separated list [i.e., #(values)]."""
>>
>> This can happen, as an example, with the Cookie header.
>>
>> My question is: how should this be handled in WSGI?
>>
>> As an example Nginx stores all the headers in a associative array,
>> where, of course, only the "last seen" headers appears.
>>
>> However common multiple message-headers are stored in the request struct.
>>
>> Since the WSGI environment is a dictionary with keys and values of type
>> str, should an implementation:
>> """combine the multiple header fields into one "field-name: field-value"
>> pair, without changing the semantics of the message, by appending each
>> subsequent field-value to the first, each separated by a comma."""
>> ?
> 
> If that's the only way to make the headers work, then the server may do so.
> 

Nginx does not combine headers, so I have to do it myself (and this 
will complicate the implementation)...

However, IMHO the word here should be "must" rather than "may", and 
this should be explicitly stated in the WSGI spec.


 > [...]


Thanks and regards   Manlio Perillo

From manlio_perillo at libero.it  Tue Oct  2 22:27:12 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 22:27:12 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>
References: <47029C66.5090408@libero.it>
	<F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>
Message-ID: <4702A9A0.2090005@libero.it>

Robert Brewer ha scritto:
>
 > [...]
> As you note, it's part of the HTTP spec that such headers
> can be combined without changing the semantics. Here's a list of the
> headers that need to be folded:
> 
> comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
>     'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
>     'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
>     'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
>     'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
>     'WWW-AUTHENTICATE']
> 

Note that some of these headers are response headers, and it is the 
responsibility of the WSGI application to fold them properly, and not 
of the WSGI gateway.


> The only tricky one is Cookie, because e.g. Konqueror sends them on
> multiple lines, but they're not foldable.
> 
> See http://kristol.org/cookie/errata.html
> 

This is a mess...

Note: in some tests, I have seen Firefox sending a Cookie on multiple lines.

 > [...]


Thanks and regards   Manlio Perillo

From pje at telecommunity.com  Tue Oct  2 22:36:47 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Oct 2007 16:36:47 -0400
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <4702A426.30109@libero.it>
References: <47029C66.5090408@libero.it>
 <4702A426.30109@libero.it>
Message-ID: <20071002203514.115E03A407C@sparrow.telecommunity.com>

At 10:03 PM 10/2/2007 +0200, Manlio Perillo wrote:
>Manlio Perillo ha scritto:
> > [...]
> > As an example Nginx stores all the headers in a associative array,
> > where, of course, only the "last seen" headers appears.
> >
>
>A correction: Nginx stores "raw" headers in a list of key/value pairs,
>and not in an associative array.
>
>This means that when I iterate over the headers, I see all the multiple
>message-headers, but I only store the last header in the WSGI environment.

That's definitely an error.


From manlio_perillo at libero.it  Tue Oct  2 23:01:33 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 23:01:33 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <20071002203514.115E03A407C@sparrow.telecommunity.com>
References: <47029C66.5090408@libero.it> <4702A426.30109@libero.it>
	<20071002203514.115E03A407C@sparrow.telecommunity.com>
Message-ID: <4702B1AD.3090807@libero.it>

Phillip J. Eby ha scritto:
> At 10:03 PM 10/2/2007 +0200, Manlio Perillo wrote:
>> Manlio Perillo ha scritto:
>> > [...]
>> > As an example Nginx stores all the headers in a associative array,
>> > where, of course, only the "last seen" headers appears.
>> >
>>
>> A correction: Nginx stores "raw" headers in a list of key/value pairs,
>> and not in an associative array.
>>
>> This means that when I iterate over the headers, I see all the multiple
>> message-headers, but I only store the last header in the WSGI 
>> environment.
> 
> That's definitely an error.

Right, it's an error.

A simple solution is to first check if a header is already in the 
environ. If this is the case, then I can combine the new value with the 
old one.

The problem is that I first have to check if the header can be combined 
(and the Cookie must be combined using ';' instead of ',').

Luckily some of these headers can be handled internally by Nginx.
How many browsers split a header on multiple lines?


Regards  Manlio Perillo
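
The approach described above (check whether the header is already in the 
environ, then combine) could be sketched roughly as follows. The names 
environ_key and add_request_header are hypothetical, the list is the one 
Robert Brewer posted, and keeping only the first value for headers that are 
neither in that list nor Cookie is an arbitrary choice of this sketch.

```python
# Headers whose values are comma-separated lists (list posted by Robert Brewer).
COMMA_SEPARATED = frozenset([
    'ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING', 'ACCEPT-LANGUAGE',
    'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL', 'CONNECTION',
    'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT', 'IF-MATCH',
    'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE', 'TRAILER',
    'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
    'WWW-AUTHENTICATE'])


def environ_key(name):
    """Map a request header name to its WSGI environ key."""
    key = name.upper().replace('-', '_')
    if key in ('CONTENT_TYPE', 'CONTENT_LENGTH'):
        return key            # these two carry no HTTP_ prefix
    return 'HTTP_' + key


def add_request_header(environ, name, value):
    """Store one raw header line, combining it with earlier ones if allowed."""
    key = environ_key(name)
    upper = name.upper()
    if key not in environ:
        environ[key] = value              # an empty value stays '', never None
    elif upper == 'COOKIE':
        environ[key] += '; ' + value      # cookies are joined with ';'
    elif upper in COMMA_SEPARATED:
        environ[key] += ', ' + value      # list-valued headers are joined with ','
    # Assumption: any other repeated header keeps its first value.
```

Called once per raw header line while iterating over the request headers, 
this leaves a single str value per key in the environ.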

From pje at telecommunity.com  Tue Oct  2 23:08:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 02 Oct 2007 17:08:53 -0400
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <4702A9A0.2090005@libero.it>
References: <47029C66.5090408@libero.it>
	<F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>
	<4702A9A0.2090005@libero.it>
Message-ID: <20071002210615.E678F3A407A@sparrow.telecommunity.com>

At 10:27 PM 10/2/2007 +0200, Manlio Perillo wrote:
>Robert Brewer ha scritto:
> >
>  > [...]
> > As you note, it's part of the HTTP spec that such headers
> > can be combined without changing the semantics. Here's a list of the
> > headers that need to be folded:
> >
> > comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
> >     'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
> >     'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
> >     'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
> >     'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
> >     'WWW-AUTHENTICATE']
> >
>
>Note that some of these headers are response headers, and it is
>responsibility of the WSGI application to properly folding them, and not
>of the WSGI gateway.

On the contrary.  The gateway is responsible for sending *all* the 
header lines to the client.  If you're only taking the last one, your 
gateway is non-compliant.

If nginx can't handle multiple headers, the only way you can be WSGI 
compliant is to do the folding in the gateway, because the 
application is explicitly allowed to provide multiple header values 
for a given header name.
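
For a server that can only emit one line per header name, the folding 
mentioned above might look something like this sketch. The function name 
fold_response_headers is hypothetical, and the sketch assumes that every 
repeated header other than Set-Cookie is one whose value is defined as a 
comma-separated list. Set-Cookie values cannot be joined with commas, so 
they are left as separate entries.

```python
def fold_response_headers(headers):
    """Fold repeated response headers into one comma-separated value each.

    `headers` is the list of (name, value) tuples given to start_response.
    Set-Cookie entries are passed through unfolded.
    """
    folded = []     # [name, value] pairs, in first-seen order
    index = {}      # lower-cased name -> position in `folded`
    for name, value in headers:
        key = name.lower()
        if key != 'set-cookie' and key in index:
            folded[index[key]][1] += ', ' + value
        else:
            index.setdefault(key, len(folded))
            folded.append([name, value])
    return [tuple(pair) for pair in folded]
```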


From manlio_perillo at libero.it  Tue Oct  2 23:35:51 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 02 Oct 2007 23:35:51 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <20071002210615.E678F3A407A@sparrow.telecommunity.com>
References: <47029C66.5090408@libero.it>
	<F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>
	<4702A9A0.2090005@libero.it>
	<20071002210615.E678F3A407A@sparrow.telecommunity.com>
Message-ID: <4702B9B7.7020101@libero.it>

Phillip J. Eby ha scritto:
> [...]
>> Note that some of these headers are response headers, and it is
>> responsibility of the WSGI application to properly folding them, and not
>> of the WSGI gateway.
> 
> On the contrary.  The gateway is responsible for sending *all* the 
> header lines to the client.  If you're only taking the last one, your 
> gateway is non-compliant.
> 

You are right, sorry.
I forgot that start_response takes a list of headers, and not a dict.

The current implementation of mod_wsgi is compliant here, and the 
headers are combined.

 > [...]


Regards   Manlio Perillo

From manlio_perillo at libero.it  Wed Oct  3 13:35:16 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 13:35:16 +0200
Subject: [Web-SIG] Multiple message-header fields handling
In-Reply-To: <4702B9B7.7020101@libero.it>
References: <47029C66.5090408@libero.it>	<F1962646D3B64642B7C9A06068EE1E6418B399@ex10.hostedexchange.local>	<4702A9A0.2090005@libero.it>	<20071002210615.E678F3A407A@sparrow.telecommunity.com>
	<4702B9B7.7020101@libero.it>
Message-ID: <47037E74.8050400@libero.it>

Manlio Perillo ha scritto:
> Phillip J. Eby ha scritto:
>> [...]
>>> Note that some of these headers are response headers, and it is
>>> responsibility of the WSGI application to properly folding them, and not
>>> of the WSGI gateway.
>> On the contrary.  The gateway is responsible for sending *all* the 
>> header lines to the client.  If you're only taking the last one, your 
>> gateway is non-compliant.
>>
> 
> You are right, sorry.
> I forgot that start_application returns a list, and not a dict.
> 
> The current implementation of mod_wsgi is compliant here, and the 
> headers are combined.
> 

A correction: Nginx does not "fold" the multiline headers; they were 
folded by Firefox.


Regards  Manlio Perillo

From manlio_perillo at libero.it  Wed Oct  3 16:57:37 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 16:57:37 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
Message-ID: <4703ADE1.5040507@libero.it>

Hi.

Nginx, in one of the headers filters, can do ETag and Last-Modified 
validation.

I want to be able to use this feature, so I don't have to use 
third-party solutions.

However with the current WSGI implementation this is not possible.

A possible solution can be to add an extension `x-wsgiorg.flush`, a 
callable object that notifies the WSGI gateway that it can flush the 
headers (if they have not yet been sent) or the output buffer (Nginx has 
this feature, however I have not yet understood how it works).

   start_response('200 Ok', [('Last-Modified', 'xxx')])

   ...
   environ['x-wsgiorg.flush']()

   return a-generator


The WSGI gateway can now send the headers before iterating over the 
generator, and if the client content is up-to-date, the new content is 
never generated.


The intent of this extension is to be transparent to the WSGI application.
In case of nginx mod_wsgi, the validation can be done by Nginx, but for 
generic WSGI applications this can be done by a middleware.


I don't know if this feature is feasible, since I have not yet 
implemented it, so I would like to receive some feedback.


Thanks  Manlio Perillo

From pje at telecommunity.com  Wed Oct  3 18:52:57 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 12:52:57 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703ADE1.5040507@libero.it>
References: <4703ADE1.5040507@libero.it>
Message-ID: <20071003165020.23FAA3A407A@sparrow.telecommunity.com>

Thinking about this made me realize that WSGI 2.0 isn't going to be 
able to validate *anything* about a response by raising an error in 
the application, because everything is done after the code returns.

That suggests to me that these sorts of errors should be handled by 
changing the response sent to the browser, instead.  That is, sending 
an internal error message to the browser and logging details of the problem.


At 04:57 PM 10/3/2007 +0200, Manlio Perillo wrote:
>Hi.
>
>Nginx, in one of the headers filters, can do ETag and Last-Modified
>validation.
>
>I want to be able to use this feature, so I don't have to use thirdy
>party solutions.
>
>However with the current WSGI implementation this is not possible.
>
>A possibile solution can be to add an extension `x-wsgiorg.flush`, a
>callable object that notify the WSGI gateway that it can flush the
>headers (if they are not yet be sent) or the output buffer (Nginx has
>this feature, however I have yet not understand how it works).
>
>    start_response('200 Ok', [('Last-Modified', 'xxx')])
>
>    ...
>    environ['x-wsgiorg.flush']()
>
>    return a-generator
>
>
>The WSGI gateway can now send the headers before iterating over the
>generator, and if the client content is up-to-date, the new content is
>never generated.
>
>
>
>The intent of this extension is to be transparent to the WSGI application.
>In case of nginx mod_wsgi, the validation can be done by Nginx, but for
>generic WSGI applications this can be done by a middleware.
>
>
>I don't know if this feature is feasible, since I have not yet
>implemented it, so I would like to receive some feedbacks.
>
>
>Thanks  Manlio Perillo
>_______________________________________________
>Web-SIG mailing list
>Web-SIG at python.org
>Web SIG: http://www.python.org/sigs/web-sig
>Unsubscribe: 
>http://mail.python.org/mailman/options/web-sig/pje%40telecommunity.com


From manlio_perillo at libero.it  Wed Oct  3 19:03:46 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 19:03:46 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003165020.23FAA3A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
Message-ID: <4703CB72.6080308@libero.it>

Phillip J. Eby ha scritto:
> Thinking about this made me realize that WSGI 2.0 isn't going to be able 
> to validate *anything* about a response by raising an error in the 
> application, because everything is done after the code returns.
> 

In this case, if the cache validation fails, we just have to generate 
the body content.

For which cases do you want to raise an exception?

> That suggests to me that these sorts of errors should be handled by 
> changing the response sent to the browser, instead.  

Right.
In this case Nginx, when the cache is fresh, should change the response 
code from 200 (OK) to 304 (Not Modified).

If I'm right, the current WSGI spec neither forbids nor allows this.

> That is, sending an 
> internal error message to the browser and logging details of the problem.
> 


Regards  Manlio Perillo

From pje at telecommunity.com  Wed Oct  3 20:00:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 14:00:48 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703CB72.6080308@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
Message-ID: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com>

At 07:03 PM 10/3/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > Thinking about this made me realize that WSGI 2.0 isn't going to be able
> > to validate *anything* about a response by raising an error in the
> > application, because everything is done after the code returns.
> >
>
>In this case, if the cache validation fails, we just have to generate
>the body content.
>
>For which cases do you want to raise an exception?

Sorry, I thought you were talking about validating headers for 
*errors* (e.g. WSGI compliance problems), not providing special 
support for If-* headers.

I don't think there's any point to having a WSGI extension for If-* 
header support.  All the necessary data is in the environment, so it 
can trivially be implemented as a library or middleware, especially 
if the application postpones body content generation to an iterator.

Since WSGI is intended to reduce web framework proliferation, one 
should never implement with middleware or a WSGI extension anything 
that can just be released as a library for others to use.
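
As a rough illustration of the point above, here is a sketch of conditional 
handling done outside the gateway. It is written as a wrapper only to keep 
the example short; the name conditional_get is hypothetical. The sketch 
assumes the application calls start_response before returning its iterable 
(as in the example quoted in this thread), does not use the write() 
callable, and that an exact string match between If-None-Match and ETag is 
enough.

```python
def conditional_get(app):
    """Answer 304 when the request's If-None-Match equals the response ETag."""
    def middleware(environ, start_response):
        captured = []

        def capture(status, headers, exc_info=None):
            captured[:] = [status, list(headers), exc_info]

            def write(data):
                raise NotImplementedError('write() is not supported in this sketch')
            return write

        body = app(environ, capture)
        if not captured:
            raise RuntimeError('sketch needs start_response called before return')
        status, headers, exc_info = captured

        etag = None
        for name, value in headers:
            if name.lower() == 'etag':
                etag = value

        if (status.startswith('200') and etag is not None
                and environ.get('HTTP_IF_NONE_MATCH') == etag):
            # The client copy is current: never iterate the body.
            close = getattr(body, 'close', None)
            if close is not None:
                close()
            kept = [(n, v) for n, v in headers
                    if n.lower() not in ('content-length', 'content-type')]
            start_response('304 Not Modified', kept)
            return []

        start_response(status, headers, exc_info)
        return body
    return middleware
```

Because the body is a lazy iterable, the content is only generated when the 
validation fails and the iterable is actually consumed.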


> > That suggests to me that these sorts of errors should be handled by
> > changing the response sent to the browser, instead.
>
>Right.
>In this case Nginx, when the cache is fresh, should change the response
>code from 200 (OK) to 304 (Not Modified).
>
>If I'm right, the current WSGI spec does not forbids or allows this.

Actually, I was talking about handling the case of an invalid (ie. 
non-WSGI/HTTP compliant) header, not cache handling.  Sorry for the confusion.


From manlio_perillo at libero.it  Wed Oct  3 20:24:05 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 20:24:05 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
Message-ID: <4703DE45.6010606@libero.it>

Phillip J. Eby ha scritto:
> At 07:03 PM 10/3/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > Thinking about this made me realize that WSGI 2.0 isn't going to be 
>> able
>> > to validate *anything* about a response by raising an error in the
>> > application, because everything is done after the code returns.
>> >
>>
>> In this case, if the cache validation fails, we just have to generate
>> the body content.
>>
>> For which cases do you want to raise an exception?
> 
> Sorry, I thought you were talking about validating headers for *errors* 
> (e.g. WSGI compliance problems), not providing special support for If-* 
> headers.
> 

Ok, my message was not very clear.

> I don't think there's any point to having a WSGI extension for If-* 
> header support.  All the necessary data is in the environment, so it can 
> trivially be implemented as a library or middleware, especially if the 
> application postpones body content generation to an iterator.
> 
> Since WSGI is intended to reduce web framework proliferation, one should 
> never implement with middleware or a WSGI extension anything that can 
> just be released as a library for others to use.
> 

In general this is true, however to add support for If- headers, I do 
not have to write any code, all I need is to be able to send the headers 
before the body content is generated.

A wsgiorg.flush extension can be useful for some other things.

As an example, when in Nginx we send some data, an output buffer like 
gzip can buffer data for efficiency, and with wsgiorg.flush a WSGI 
application can force the buffer to be flushed (OK, the WSGI spec already 
states that the WSGI gateway should not buffer the data).

Note that in Nginx, unlike Apache, an output buffer can process a 
partial buffer, so, for a WSGI application like:

    start_response('200 OK', [...])

    yield 'xxx'
    yield 'yyy'
    yield 'zzz'


the 'xxx' string is sent to the next output buffer, and, finally it is 
sent to the client.

Now it can happen that the socket is not ready to send further data, so 
the application must be paused until the socket is ready.

When the socket is ready, the next buffer can be sent to the next output 
buffer, and so on.

NOTE: this is not yet implemented in nginx mod_wsgi.


 > [...]


Regards  Manlio Perillo

From manlio_perillo at libero.it  Wed Oct  3 20:33:48 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 20:33:48 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703DE45.6010606@libero.it>
References: <4703ADE1.5040507@libero.it>	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>	<4703CB72.6080308@libero.it>	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<4703DE45.6010606@libero.it>
Message-ID: <4703E08C.2070704@libero.it>

Manlio Perillo ha scritto:
> [...]
> Note that in Nginx, unlike Apache, an output buffer can process a 
> partial buffer, 


Sorry, this is not correct.

The only difference from Apache, here, is that the data is written 
asynchronously.


Manlio Perillo

From pje at telecommunity.com  Wed Oct  3 21:23:32 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 15:23:32 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703DE45.6010606@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<4703DE45.6010606@libero.it>
Message-ID: <20071003192055.49B203A407A@sparrow.telecommunity.com>

At 08:24 PM 10/3/2007 +0200, Manlio Perillo wrote:
>WSGI already
>states that the WSGI gateway should not buffer the data).

It does not state that at all.  It states that a gateway *must not 
delay the transmission of any block*.  That requirement is not a 
"should" but a "must", and it does not directly state anything about 
buffering, one way or the other.

It *does*, however, imply that buffering is only acceptable if the 
buffer is being asynchronously emptied, via another thread or the OS 
emptying its own OS-level buffers. (e.g. if you're using synchronous sockets)


>Note that in Nginx, unlike Apache, an output buffer can process a
>partial buffer, so, for a WSGI application like:
>
>     start_response('200 OK', [...])
>
>     yield 'xxx'
>     yield 'yyy'
>     yield 'zzz'
>
>
>the 'xxx' string is sent to the next output buffer, and, finally it is
>sent to the client.
>
>Now can happens that the socket is not ready to send further data, so
>the application must be paused until the socket is ready.
>
>When the socket is ready, the next buffer can be sent to the next outpup
>buffer, and so on.

In the above code, when "yield 'yyy'" is invoked, one of two 
conditions must apply.  Either:

1. the 'xxx' has been sent to the OS, OR
2. it is still being sent in the background by another thread

If it is possible to execute the "yield 'yyy'" line without one of 
these conditions applying, the gateway is *not* WSGI compliant.
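
As an illustration of condition 1, here is a minimal sketch of a blocking
gateway loop. It is not taken from any real gateway; the names
RecordingStream, body and run_blocking are made up for this example, and
the body uses bytes as current Python versions require. Each block is
written and flushed before the iterator is asked for the next one.

```python
class RecordingStream:
    """Hypothetical stand-in for a socket; records the order of events."""

    def __init__(self):
        self.events = []

    def write(self, data):
        self.events.append(('write', data))

    def flush(self):
        self.events.append(('flush',))


def body(events):
    """Application body that logs each block just before yielding it."""
    for chunk in (b'xxx', b'yyy', b'zzz'):
        events.append(('yield', chunk))
        yield chunk


def run_blocking(app_iter, out):
    """Write and flush every block before resuming the application."""
    for block in app_iter:
        if block:
            out.write(block)
            out.flush()  # the block is handed over before the next yield


out = RecordingStream()
run_blocking(body(out.events), out)
```

In out.events, the write and flush of 'xxx' are recorded before the
application yields 'yyy', which is the ordering required above.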


From pje at telecommunity.com  Wed Oct  3 21:30:55 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 15:30:55 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703ADE1.5040507@libero.it>
References: <4703ADE1.5040507@libero.it>
Message-ID: <20071003192817.3014C3A407A@sparrow.telecommunity.com>

At 04:57 PM 10/3/2007 +0200, Manlio Perillo wrote:
>A possible solution can be to add an extension `x-wsgiorg.flush`, a
>callable object that notifies the WSGI gateway that it can flush the
>headers (if they have not yet been sent) or the output buffer (Nginx has
>this feature, however I have not yet understood how it works).
>
>    start_response('200 Ok', [('Last-Modified', 'xxx')])
>
>    ...
>    environ['x-wsgiorg.flush']()
>
>    return a-generator
>
>
>The WSGI gateway can now send the headers before iterating over the
>generator, and if the client content is up-to-date, the new content is
>never generated.

Now that I understand what this is for, I can explain why a WSGI 
extension is not necessary to provide this feature.  In a compliant 
WSGI gateway, yielding an empty string from 'a-generator' is 
sufficient to "flush" the WSGI pipeline.

I suggest that you read this section of the spec more carefully:

http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming


From manlio_perillo at libero.it  Wed Oct  3 21:52:01 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 21:52:01 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003192817.3014C3A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
Message-ID: <4703F2E1.9050402@libero.it>

Phillip J. Eby ha scritto:
> [...]
> 
> Now that I understand what this is for, I can explain why a WSGI 
> extension is not necessary to provide this feature.  In a compliant WSGI 
> gateway, yielding an empty string from 'a-generator' is sufficient to 
> "flush" the WSGI pipeline.
> 

But the WSGI pipeline should already be flushed for every string 
yielded, right?

An interesting "extension" for an asynchronous WSGI gateway is to 
"suspend" the iteration when an empty string is returned, creating a 
timer that fires after 0 milliseconds (in Twisted, this is the same as 
callLater(0, ...))
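
A rough sketch of this idea, using asyncio from the standard library
instead of Twisted (so asyncio.sleep(0) plays the role of
callLater(0, ...)). The function names drive and main are made up for
the example, and this is only an illustration of the scheduling, not a
gateway.

```python
import asyncio


async def drive(app_iter, log, name):
    """Consume app_iter; an empty block hands control back to the loop."""
    for block in app_iter:
        if block:
            log.append((name, block))
        else:
            # Suspend this response for one loop iteration so that
            # other pending work can run.
            await asyncio.sleep(0)


async def main():
    log = []
    first = [b'a1', b'', b'a2']
    second = [b'b1', b'', b'b2']
    await asyncio.gather(drive(first, log, 'A'), drive(second, log, 'B'))
    return log


log = asyncio.run(main())
```

With two responses in flight, the empty string in the first one lets the
second one make progress before the first continues.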

> I suggest that you read this section of the spec more carefully:
> 
> http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming
> 

There is a problem here: a WSGI gateway is not allowed to send headers 
until the app_iter yields a non-empty string or the iterator is exhausted.
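
A minimal sketch of that rule, loosely modelled on the CGI example in
PEP 333. The function serve and the event list out are made up for this
example, and the body uses bytes as current Python versions require.

```python
def serve(application, environ, out):
    """Run application, recording headers and body events in out."""
    headers_set = []   # [status, headers] from start_response
    headers_sent = []  # non-empty once the headers have gone out

    def write(data):
        if not headers_set:
            raise AssertionError("write() before start_response()")
        if not headers_sent:
            headers_sent[:] = headers_set
            out.append(('headers', headers_set[0], headers_set[1]))
        if data:
            out.append(('body', data))

    def start_response(status, response_headers, exc_info=None):
        if exc_info and headers_sent:
            raise exc_info[1]  # too late to replace the headers
        if headers_set and not exc_info:
            raise AssertionError("Headers already set!")
        headers_set[:] = [status, response_headers]
        return write

    result = application(environ, start_response)
    try:
        for data in result:
            if data:  # empty strings do not trigger the headers
                write(data)
        if not headers_sent:
            write(b'')  # iterator exhausted: send the headers now
    finally:
        if hasattr(result, 'close'):
            result.close()


out = []


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield b''  # the gateway must not send headers for this
    out.append(('app', 'resumed after empty yield'))
    yield b'hello'


serve(app, {}, out)
```

The 'app' marker is recorded before the 'headers' event, so yielding an
empty string here does not get the headers sent.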


Regards  Manlio Perillo

From manlio_perillo at libero.it  Wed Oct  3 21:58:24 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Wed, 03 Oct 2007 21:58:24 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003192055.49B203A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<4703DE45.6010606@libero.it>
	<20071003192055.49B203A407A@sparrow.telecommunity.com>
Message-ID: <4703F460.4080401@libero.it>

Phillip J. Eby ha scritto:
> At 08:24 PM 10/3/2007 +0200, Manlio Perillo wrote:
>> WSGI already
>> states that the WSGI gateway should not buffer the data).
> 
> It does not state that at all.  It states that a gateway *must not delay 
> the transmission of any block*.  That requirement is not a "should" but 
> a "must", and it does not directly state anything about buffering, one 
> way or the other.
> 
> It *does*, however, imply that buffering is only acceptable if the 
> buffer is being asynchronously emptied, via another thread or the OS 
> emptying its own OS-level buffers. (e.g. if you're using synchronous 
> sockets)
> 

Ok.

> 
>> Note that in Nginx, unlike Apache, an output buffer can process a
>> partial buffer, so, for a WSGI application like:
>>
>>     start_response('200 OK', [...])
>>
>>     yield 'xxx'
>>     yield 'yyy'
>>     yield 'zzz'
>>
>>
>> the 'xxx' string is sent to the next output buffer, and, finally it is
>> sent to the client.
>>
>> Now it can happen that the socket is not ready to send further data, so
>> the application must be paused until the socket is ready.
>>
>> When the socket is ready, the next buffer can be sent to the next output
>> buffer, and so on.
> 
> In the above code, when "yield 'yyy'" is invoked, one of two conditions 
> must apply.  Either:
> 
> 1. the 'xxx' has been sent to the OS, OR
> 2. it is still being sent in the background by another thread
> 
> If it is possible to execute the "yield 'yyy'" line without one of these 
> conditions applying, the gateway is *not* WSGI compliant.
> 

I'm not sure, but I think that the 'xxx' can still be in one of the 
output filter buffers (like gzip), unless we explicitly require it to be 
flushed.

Nginx does not use threads.


By the way: I think that the environ dictionary should contain a new
wsgi.asynchronous value, that should evaluate true if the WSGI gateway 
is asynchronous.

This may be necessary, because a WSGI application should know that it 
can be suspended, even if it has not requested it.

> 


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 01:10:49 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed, 03 Oct 2007 19:10:49 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4703F2E1.9050402@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
Message-ID: <20071003230812.7A7F63A407A@sparrow.telecommunity.com>

At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > [...]
> >
> > Now that I understand what this is for, I can explain why a WSGI
> > extension is not necessary to provide this feature.  In a compliant WSGI
> > gateway, yielding an empty string from 'a-generator' is sufficient to
> > "flush" the WSGI pipeline.
> >
>
>But the WSGI pipeline should already be flushed for every string
>yielded, right?
>
>An interesting "extension" for an asynchronous WSGI gateway is to
>"suspend" the iteration when an empty string is returned, creating a
>timer that fires after 0 milliseconds (in Twisted, this is the same as
>callLater(0, ...))
>
> > I suggest that you read this section of the spec more carefully:
> >
> > http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming
> >
>
>There is a problem here: a WSGI gateway is not allowed to send headers
>until the app_iter yields a non empty string or the iterator is exausted.

Argh.  You're right.  I forgot about that bit.  It has been a few too 
many years since I worked on the spec.  :)

Still, this is yet another example of why WSGI 2.0 is a big 
improvement in simplicity.  So I still would rather see more effort 
put into getting WSGI 2.0 written and into widespread use, than 
creating niche extensions to 1.0.


From ianb at colorstudy.com  Thu Oct  4 01:13:49 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 03 Oct 2007 19:13:49 -0400
Subject: [Web-SIG] WSGI 2.0
Message-ID: <4704222D.30208@colorstudy.com>

PJE wants to talk about WSGI 2.  That's cool; I remind everyone that 
there's a page to bring up issues you might want to discuss for 2.0: 
http://wsgi.org/wsgi/WSGI_2.0

Feel free to add to, or discuss, issues on that page...

   Ian

From graham.dumpleton at gmail.com  Thu Oct  4 04:30:28 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Thu, 4 Oct 2007 12:30:28 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003230812.7A7F63A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>

On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote:
> >Phillip J. Eby ha scritto:
> > > [...]
> > >
> > > Now that I understand what this is for, I can explain why a WSGI
> > > extension is not necessary to provide this feature.  In a compliant WSGI
> > > gateway, yielding an empty string from 'a-generator' is sufficient to
> > > "flush" the WSGI pipeline.
> > >
> >
> >But the WSGI pipeline should already be flushed for every string
> >yielded, right?
> >
> >An interesting "extension" for an asynchronous WSGI gateway is to
> >"suspend" the iteration when an empty string is returned, creating a
> >timer that fires after 0 milliseconds (in Twisted, this is the same as
> >callLater(0, ...))
> >
> > > I suggest that you read this section of the spec more carefully:
> > >
> > > http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming
> > >
> >
> >There is a problem here: a WSGI gateway is not allowed to send headers
> >until the app_iter yields a non empty string or the iterator is exausted.
>
> Argh.  You're right.  I forgot about that bit.  It has been a few too
> many years since I worked on the spec.  :)

The actual wording of the PEP does, though, suggest that if one calls
the write() returned from start_response(), the headers would be flushed.
I.e., the requirement for a non-empty string is really only mentioned
in reference to values returned from the iterable, and not in relation
to an empty data string passed to write().

I am not sure I understand the importance of being strict and not
flushing headers until the first non-empty content data block. Was
there a specific reasoning or use case behind saying that?

Graham

From manlio_perillo at libero.it  Thu Oct  4 10:57:08 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 10:57:08 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003230812.7A7F63A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
Message-ID: <4704AAE4.1010708@libero.it>

Phillip J. Eby ha scritto:
> [...]
>> There is a problem here: a WSGI gateway is not allowed to send headers
>> until the app_iter yields a non empty string or the iterator is exausted.
> 
> Argh.  You're right.  I forgot about that bit.  It has been a few too 
> many years since I worked on the spec.  :)
> 

07-Dec-2003!
And yet it seems that WSGI is not pervasively used.

> Still, this is yet another example of why WSGI 2.0 is a big improvement 
> in simplicity.  So I still would rather see more effort put into getting 
> WSGI 2.0 written and into widespread use, than creating niche extensions 
> to 1.0.


My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm 
removing the limitation that the app_iter must yield only one item.

However there is a problem with WSGI 2.0.

Suppose that I execute an asynchronous HTTP request to obtain some data 
from a remote server.

I can use the yet to be implemented wsgi.pause_output extension for 
this, or an extension for interfacing with nginx subrequest API.

What happens if the HTTP request returns a 404 and I want to return this 
status code to the original client?

This can be done in WSGI 1.0 (since I can call start_response in the 
app_iter generator) but cannot be done in WSGI 2.0.

A possible solution for WSGI 2.0 is to add a wsgi.response_error exception:

    raise environ['wsgi.response_error'](status='404 Not Found')

However there is still the problem with the headers.


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 13:47:15 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 07:47:15 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.co
 m>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
Message-ID: <20071004114441.C7B103A407A@sparrow.telecommunity.com>

At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote:
>On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote:
> > >There is a problem here: a WSGI gateway is not allowed to send headers
> > >until the app_iter yields a non empty string or the iterator is exausted.
> >
> > Argh.  You're right.  I forgot about that bit.  It has been a few too
> > many years since I worked on the spec.  :)
>
>The actual wording of the PEP does though suggest that if one calls
>write() returned from start_response() that one would flush headers.
>Ie., the requirement for a non-empty string is really only mentioned
>in reference to value returned from iterable and not in relation to
>empty data string passed to write().
>
>I am not sure I understand the importance of being strict and not
>flushing headers until the first non-empty content data block. Was
>there a specific reasoning or use case behind saying that?

The idea was to allow an application to change its mind about the 
headers until it had committed to writing data.  That is, to allow 
the application to do error handling for as long as possible before 
the server has to do it.

For WSGI 2.0, I'm no longer concerned about it - in the common case, 
the body will be a list or tuple containing a single string, so it 
can't possibly raise an error.  For anything more complex, well, you 
were going to have to handle error conditions once you yielded some 
body output anyway.

Now that you're mentioning it, the "non-empty yield" requirement 
seems pretty bogus, since it's not really possible for the app to 
tell whether headers have been sent anyway; start_response() handles 
that transparently.

Only problem is that the PEP examples and wsgiref aren't written to 
support doing it that way, so I don't think we can reasonably change 
it in WSGI 1.0, and in 2.0 it won't even matter.


From pje at telecommunity.com  Thu Oct  4 13:54:43 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 07:54:43 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4704AAE4.1010708@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
Message-ID: <20071004115207.65C463A407A@sparrow.telecommunity.com>

At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > [...]
> >> There is a problem here: a WSGI gateway is not allowed to send headers
> >> until the app_iter yields a non empty string or the iterator is exausted.
> >
> > Argh.  You're right.  I forgot about that bit.  It has been a few too
> > many years since I worked on the spec.  :)
> >
>
>07-Dec-2003!
>And yet it seems that WSGI is not pervasively used.

What do you mean?  Can you name a popular Python web framework or 
library that doesn't either use or support WSGI?


> > Still, this is yet another example of why WSGI 2.0 is a big improvement
> > in simplicity.  So I still would rather see more effort put into getting
> > WSGI 2.0 written and into widespread use, than creating niche extensions
> > to 1.0.
>
>
>My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm
>removing the limitation that the app_iter must yield only one item.

Eh?  I don't understand what you mean by "app_iter must yield only 
one item".  In WSGI 2.0 the application return signature is a 
three-item tuple, the third item of which is a WSGI 1.0 response object.
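
To make the difference in calling conventions concrete, here is a small
sketch of an adapter that lets a WSGI 1.0 server call an application
written in the draft 2.0 style described above. The names wsgi2_to_wsgi1,
hello and fake_start_response are made up for this example, and since
the 2.0 spec isn't written yet, the 2.0 side is an assumption based on
this thread only.

```python
def wsgi2_to_wsgi1(app2):
    """Wrap a draft-2.0-style app (environ -> 3-tuple) for a 1.0 server."""
    def app1(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body  # already a WSGI 1.0 response iterable
    return app1


def hello(environ):
    """Draft-2.0-style application: status, headers, body in one return."""
    return '200 OK', [('Content-Type', 'text/plain')], [b'Hello, world!']


calls = []


def fake_start_response(status, headers, exc_info=None):
    """Stand-in for the server's start_response; records its arguments."""
    calls.append((status, headers))


body = wsgi2_to_wsgi1(hello)({}, fake_start_response)
```

Because status and headers arrive together with the body, the adapter
never has to ask when the headers get written or whether they may still
change.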


>However there is a problem with WSGI 2.0.
>
>Suppose that I execute an asynchronous HTTP request to obtain some data
>from a remote server.
>
>I can use the yet to be implemented wsgi.pause_output extension for
>this, or an extension for interfacing with nginx subrequest API.

That won't be possible in WSGI 2.0 - it's a purely synchronous 
API.  You can pause body output by yielding empty strings, but you 
need to have already decided on your headers.


>What happens if the HTTP request returns a 404 and I want to return this
>status code to the original client?
>
>This can be done in WSGI 1.0 (since I can call start_response in the
>app_iter generator) but cannot be done in WSGI 2.0.

In WSGI 1.0, that can only happen up until the point where you've 
yielded body output.  As soon as there is any body output, the 
headers are committed.  In 2.0, you will have to commit your headers 
at return time.

Note, by the way, that WSGI 2.0 isn't going to be an immediate or 
complete replacement for 1.0 -- especially since the spec isn't 
written yet!  1.0 apps and servers will likely be with us for a few years yet.


From graham.dumpleton at gmail.com  Thu Oct  4 14:20:55 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Thu, 4 Oct 2007 22:20:55 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004114441.C7B103A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>

On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote:
> >On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote:
> > > >There is a problem here: a WSGI gateway is not allowed to send headers
> > > >until the app_iter yields a non empty string or the iterator is exausted.
> > >
> > > Argh.  You're right.  I forgot about that bit.  It has been a few too
> > > many years since I worked on the spec.  :)
> >
> >The actual wording of the PEP does though suggest that if one calls
> >write() returned from start_response() that one would flush headers.
> >Ie., the requirement for a non-empty string is really only mentioned
> >in reference to value returned from iterable and not in relation to
> >empty data string passed to write().
> >
> >I am not sure I understand the importance of being strict and not
> >flushing headers until the first non-empty content data block. Was
> >there a specific reasoning or use case behind saying that?
>
> The idea was to allow an application to change its mind about the
> headers until it had committed to writing data.  That is, to allow
> the application to do error handling for as long as possible before
> the server has to do it.

But once you have called start_response() you can't call it a second
time to change the values, so how could the application change its
mind? If you are delaying calling start_response() in the first place
it is a moot point, as you can't be writing data until you do so.

> For WSGI 2.0, I'm no longer concerned about it - in the common case,
> the body will be a list or tuple containing a single string, so it
> can't possibly raise an error.  For anything more complex, well, you
> were going to have to handle error conditions once you yielded some
> body output anyway.
>
> Now that you're mentioning it, the "non-empty yield" requirement
> seems pretty bogus, since it's not really possible for the app to
> tell whether headers have been sent anyway; start_response() handles
> that transparently.
>
> Only problem is that the PEP examples and wsgiref aren't written to
> support doing it that way, so I don't think we can reasonably change
> it in WSGI 1.0, and in 2.0 it won't even matter.

Huh, change what in WSGI 1.0? As you seem to note, the CGI example in
the PEP does flush headers even if the first data block was an empty
string, and quite likely other implementations have copied from
that and not implemented the WSGI specification as written.

As to Apache mod_wsgi, if using Apache 1.3 it would flush headers if the
first data block output was empty, whereas in Apache 2.X it will only
flush when the first non-empty data block is yielded, but also wouldn't
flush if write() was being called. That in Apache 2.X it doesn't flush
headers until the first non-empty data block is output wasn't by design;
that is just how Apache works under the covers.

So most likely no one probably gets it exactly right per spec, but in
practice it probably doesn't matter anyway and isn't going to affect
how anything works.

Graham

From pje at telecommunity.com  Thu Oct  4 15:10:53 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 09:10:53 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.co
 m>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
Message-ID: <20071004130818.BFCE83A407A@sparrow.telecommunity.com>

At 10:20 PM 10/4/2007 +1000, Graham Dumpleton wrote:
>On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 12:30 PM 10/4/2007 +1000, Graham Dumpleton wrote:
> > >On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > > At 09:52 PM 10/3/2007 +0200, Manlio Perillo wrote:
> > > > >There is a problem here: a WSGI gateway is not allowed to send headers
> > > > >until the app_iter yields a non empty string or the iterator is exausted.
> > > >
> > > > Argh.  You're right.  I forgot about that bit.  It has been a few too
> > > > many years since I worked on the spec.  :)
> > >
> > >The actual wording of the PEP does though suggest that if one calls
> > >write() returned from start_response() that one would flush headers.
> > >Ie., the requirement for a non-empty string is really only mentioned
> > >in reference to value returned from iterable and not in relation to
> > >empty data string passed to write().
> > >
> > >I am not sure I understand the importance of being strict and not
> > >flushing headers until the first non-empty content data block. Was
> > >there a specific reasoning or use case behind saying that?
> >
> > The idea was to allow an application to change its mind about the
> > headers until it had committed to writing data.  That is, to allow
> > the application to do error handling for as long as possible before
> > the server has to do it.
>
>But once you have called start_response() you cant call it a second
>time to change the values

You can, as long as you pass in the exception info -- because an 
exception is the only legitimate reason to change the values.
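
A minimal sketch of that rule, following the start_response() behaviour
described in PEP 333. The Response class and the failing app are made up
for this example.

```python
import sys


class Response:
    """Hypothetical gateway-side state for a single request."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False

    def start_response(self, status, headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change anything: re-raise the error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a traceback reference cycle
        elif self.status is not None:
            raise AssertionError("Headers already set!")
        self.status, self.headers = status, headers


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    try:
        raise RuntimeError("backend failed")  # simulated failure
    except RuntimeError:
        # Allowed: the second call carries exc_info and no body
        # output has been produced yet.
        start_response('500 Internal Server Error',
                       [('Content-Type', 'text/plain')], sys.exc_info())
        return [b'error']


resp = Response()
body = app({}, resp.start_response)
```

Here the second call replaces the status because no headers have been
sent; without exc_info the same call would raise AssertionError.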


> > For WSGI 2.0, I'm no longer concerned about it - in the common case,
> > the body will be a list or tuple containing a single string, so it
> > can't possibly raise an error.  For anything more complex, well, you
> > were going to have to handle error conditions once you yielded some
> > body output anyway.
> >
> > Now that you're mentioning it, the "non-empty yield" requirement
> > seems pretty bogus, since it's not really possible for the app to
> > tell whether headers have been sent anyway; start_response() handles
> > that transparently.
> >
> > Only problem is that the PEP examples and wsgiref aren't written to
> > support doing it that way, so I don't think we can reasonably change
> > it in WSGI 1.0, and in 2.0 it won't even matter.
>
>Huh, change what in WSGI 1.0. As you seem to note the CGI example in
>the PEP does flush headers even if first data block was an empty
>string

Actually, the PEP example skips empty strings yielded by the 
app_iter.  wsgiref.handlers, OTOH, doesn't do this, now that I've checked it.


>and quite likely that other implementations have copied from
>that and not implemented the WSGI specification as written.

Correct WSGI 1.0 implementations are unfortunately rare.  Even 
wsgiref gets it wrong.  :(


>So most likely no one probably gets it exactly right per spec,

No kidding!

>but in
>practice it probably doesn't matter anyway and isn't going to affect
>how anything works.

Yep, but another argument in favor of getting rid of as much 
statefulness from the protocol as we can.  Making the status and 
headers part of the return value entirely eliminates the question of 
when they're going to get written, and whether they can be changed.

(As a side benefit, making the return a 3-tuple makes it impossible 
to write a WSGI app using a single generator -- thereby discouraging 
people from using 'yield' like it was a CGI "print".)


From manlio_perillo at libero.it  Thu Oct  4 15:47:06 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 15:47:06 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4704222D.30208@colorstudy.com>
References: <4704222D.30208@colorstudy.com>
Message-ID: <4704EEDA.1010800@libero.it>

Ian Bicking ha scritto:
> PJE wants to talk about WSGI 2.  That's cool; I remind everyone that 
> there's a page to bring up issues you might want to discuss for 2.0: 
> http://wsgi.org/wsgi/WSGI_2.0
> 
> Feel free to add to, or discuss, issues on that page...
> 

I'll write my ideas here:
1) start_response should no longer return a write callable.
    I don't know how many applications use it, but I think that
    I can't implement it in a conforming way for nginx mod_wsgi,
    so I will not implement it.

2) start_response should no longer accept an exc_info parameter.
    I don't know how many applications use it, but I think that
    WSGI applications should not change their mind.
    They should delay calling start_response until they are able
    to produce a "final" response.

3) start_response should accept, as an optional parameter, a
    flush argument.
    flush defaults to False, and when it is True, the WSGI gateway
    must write the headers as soon as start_response is called.

4) the environ dictionary should have a new WSGI-defined variable:
    wsgi.asynchronous.
    This value should evaluate to true when the server is asynchronous,
    that is, the WSGI application is executed in the main process loop
    of the server and the WSGI application can be paused after it yields
    some data.

5) clarify some points in the WSGI 1.0 spec, as discussed in the latest
    emails
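
As a sketch of point 4 only: an application could look at the proposed
key like this. 'wsgi.asynchronous' is a proposal from this message, not
part of WSGI 1.0, and noop_start_response is a made-up stand-in for the
server's callable.

```python
def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    if environ.get('wsgi.asynchronous', False):
        # The gateway may pause us between blocks, so hand the body
        # out in small pieces.
        return iter([b'part 1\n', b'part 2\n'])
    # Otherwise return the whole body as a single block.
    return [b'part 1\npart 2\n']


def noop_start_response(status, headers, exc_info=None):
    """Stand-in for the server's start_response; does nothing."""
    return None


async_body = list(app({'wsgi.asynchronous': True}, noop_start_response))
sync_body = list(app({}, noop_start_response))
```

A gateway that does not define the key is treated as synchronous, since
environ.get() falls back to False.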

>    Ian


Regards  Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 15:53:04 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 15:53:04 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004115207.65C463A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
Message-ID: <4704F040.10105@libero.it>

Phillip J. Eby ha scritto:
> At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > [...]
>> >> There is a problem here: a WSGI gateway is not allowed to send headers
>> >> until the app_iter yields a non empty string or the iterator is exausted.
>> >
>> > Argh.  You're right.  I forgot about that bit.  It has been a few too
>> > many years since I worked on the spec.  :)
>> >
>>
>> 07-Dec-2003!
>> And yet it seems that WSGI is not pervasively used.
> 
> What do you mean?  Can you name a popular Python web framework or 
> library that doesn't either use or support WSGI?
> 

Django, as an example, uses WSGI "only as a backend".
Django's design is not based on WSGI; it is WSGI that is adapted for Django.

An interesting example: to add support for CGI, it seems that the 
preferred method is to add a direct Django adapter for CGI, instead of 
using a WSGI adapter for CGI.


> 
>> > Still, this is yet another example of why WSGI 2.0 is a big improvement
>> > in simplicity.  So I still would rather see more effort put into getting
>> > WSGI 2.0 written and into widespread use, than creating niche extensions
>> > to 1.0.
>>
>>
>> My implementation of mod_wsgi for nginx implements WSGI 2.0, and now I'm
>> removing the limitation that the app_iter must yield only one item.
> 
> Eh?  I don't understand what you mean by "app_iter must yield only one 
> item".  


   return '200 OK', [('Content-Type', 'text/plain')], ['a', 'b']

is not allowed.
The response object can be a generic iterator, however.

> In WSGI 2.0 the application return signature is a three-item 
> tuple, the third item of which is a WSGI 1.0 response object.
> 
> 
>> However there is a problem with WSGI 2.0.
>>
>> Suppose that I execute an asynchronous HTTP request to obtain some data
>> from a remote server.
>>
>> I can use the yet to be implemented wsgi.pause_output extension for
>> this, or an extension for interfacing with nginx subrequest API.
> 
> That won't be possible in WSGI 2.0 - it's a purely synchronous API. 

This is the reason why I don't like WSGI 2.0 :).

However I have to admit that developing a full asynchronous application 
is not easy, notably when we have to interact with a database and a 
transaction.

Is it really so hard to implement WSGI 1.0 and to write middleware for it?
Is this really causing problems for WSGI adoption?

I think that WSGI 2.0 should simply correct some problems in WSGI 1.0, 
and clarify some points, since now we have a WSGI implementation for 
Apache and Nginx.


> You can pause body output by yielding empty strings, but you need to
> have already decided on your headers.
> 

And this will make asynchronous applications not really useful, IMHO...
But I will say more on this once I implement some asynchronous 
extensions for nginx mod_wsgi.

It's very unfortunate that the WSGI implementation in Twisted just uses 
threads instead of doing some experimentation.

 > [...]


Regards  Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 16:10:39 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 16:10:39 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>	
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>	
	<4703F2E1.9050402@libero.it>	
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>	
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>	
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
Message-ID: <4704F45F.8020301@libero.it>

Graham Dumpleton ha scritto:
> [...]
>> The idea was to allow an application to change its mind about the
>> headers until it had committed to writing data.  That is, to allow
>> the application to do error handling for as long as possible before
>> the server has to do it.
> 
> But once you have called start_response() you cant call it a second
> time to change the values so how could the application change its
> mind? 

In my implementation of WSGI for nginx, start_response sets up the 
headers on the request object, but calls ngx_http_send_header only when 
the first non-empty string is yielded.

This means that if an error occurs, the "old" headers are kept in the 
response (and sent to the client); nginx will simply change the status 
code to '500 INTERNAL ERROR'.

One solution would be to copy the headers into a temporary request object,
but I don't know if this is possible.

Another solution is to set up the headers and call send_headers at the
same time, but then it is no longer possible to raise an exception
when the application calls start_response with incorrect headers.

If I'm right this is the solution used by Apache mod_wsgi.
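
A minimal sketch of the deferred approach described above, written in Python
rather than in the C of nginx mod_wsgi. The names run_application and send,
and the hop-by-hop header check, are illustrative assumptions and not the
actual nginx mod_wsgi code; exc_info handling is left out for brevity.
start_response() validates and stores the headers, and nothing reaches the
client until the first non-empty string is yielded:

```python
from wsgiref.util import is_hop_by_hop


def run_application(app, environ, send):
    """Run a WSGI 1.0 app, passing raw response bytes to send()."""
    pending = {}            # status/headers stored by start_response()
    state = {"sent": False}

    def commit():
        # Serialize and send the stored headers exactly once.
        if not state["sent"]:
            lines = ["HTTP/1.1 %s" % pending["status"]]
            lines += ["%s: %s" % header for header in pending["headers"]]
            send(("\r\n".join(lines) + "\r\n\r\n").encode("latin-1"))
            state["sent"] = True

    def write(data):
        # Legacy write() callable: commits the headers immediately.
        commit()
        send(data)

    def start_response(status, headers, exc_info=None):
        # Invalid headers can still be rejected here, because nothing
        # has been sent to the client yet.
        for name, _value in headers:
            if is_hop_by_hop(name):
                raise ValueError("hop-by-hop header not allowed: %s" % name)
        pending["status"] = status
        pending["headers"] = list(headers)
        return write

    result = app(environ, start_response)
    try:
        for chunk in result:
            if chunk:       # empty strings do not commit the headers
                commit()
                send(chunk)
        commit()            # empty body: send the headers at the end
    finally:
        if hasattr(result, "close"):
            result.close()
```

Because the headers are only stored, an application that yields '' after
calling start_response() has not yet committed to them.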

[...]


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 16:29:27 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 10:29:27 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4704EEDA.1010800@libero.it>
References: <4704222D.30208@colorstudy.com>
 <4704EEDA.1010800@libero.it>
Message-ID: <20071004142648.D2AFB3A407A@sparrow.telecommunity.com>

At 03:47 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Ian Bicking wrote:
> > PJE wants to talk about WSGI 2.  That's cool; I remind everyone that
> > there's a page to bring up issues you might want to discuss for 2.0:
> > http://wsgi.org/wsgi/WSGI_2.0
> >
> > Feel free to add to, or discuss, issues on that page...
> >
>
>I'll write my ideas here:
>1) start_response should no longer return a write callable.
>     I don't know how many applications use it, but I think that
>     I can't implement it in a conforming way for nginx mod_wsgi,
>     so I will not implement it.
>
>2) start_response should no longer accept an exc_info parameter.
>     I don't know how many applications use it, but I think that
>     WSGI applications should not change their mind.
>     They should delay calling start_response until they are able
>     to produce a "final" response.
>
>3) start_response should accept, as an optional parameter, a
>     flush argument.
>     flush defaults to False, and when it is True, the WSGI gateway
>     must write the headers as soon as start_response is called.

WSGI 2.0 does not have a start_response() callable in the first 
place, so none of these apply.

In WSGI 2.0, an application looks like this:

     def an_app(environ):
         return "200 OK", [('content-type', 'text/plain')], ["Hello, world!"]

i.e., no start_response(), no write(), no statefulness at all.  It 
just returns a tuple of (status, headers, iterable), where all three 
are defined by the WSGI 1.0 spec.

The third item in the tuple is a WSGI 1.0 app_iter, so it can be a 
generator, have a close() method, etc.  Here's a WSGI 1 middleware 
application that converts a WSGI 2 application to WSGI 1:

         def wsgi_1_app(environ, start_response):
             status, headers, body = wsgi_2_app(environ)
             start_response(status, headers)
             return body

In other words, WSGI 2 is basically WSGI 1 with start_response() and 
write() taken out.
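
As a quick check of the adapter, it can be driven with a stand-in
start_response(). This is only a sketch: call_wsgi_1 is a hypothetical
helper, and the body is a byte string, as current Python versions require,
rather than the str used in the example above.

```python
def wsgi_2_app(environ):
    # WSGI 2.0 draft style: return (status, headers, body) directly.
    return "200 OK", [("Content-Type", "text/plain")], [b"Hello, world!"]


def wsgi_1_app(environ, start_response):
    # The adapter from the message above, unchanged.
    status, headers, body = wsgi_2_app(environ)
    start_response(status, headers)
    return body


def call_wsgi_1(app, environ):
    """Call a WSGI 1.0 app and collect what it passed to the server."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # write() callable, unused here

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


result = call_wsgi_1(wsgi_1_app, {})
```

Here result holds the same status, headers and body that wsgi_2_app
returned, which is the sense in which the adapter loses nothing.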


>4) the environ dictionary should have a new WSGI-defined variable:
>     wsgi.asynchronous.
>     This value should evaluate to true when the server is asynchronous,
>     that is, the WSGI application is executed in the main process loop
>     of the server and the WSGI application can be paused after it yields
>     some data.

It's always the case that a WSGI application can be paused after it 
yields data, even in WSGI 1.0.


From pje at telecommunity.com  Thu Oct  4 16:37:58 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 10:37:58 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4704F040.10105@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
Message-ID: <20071004143521.58AE53A407A@sparrow.telecommunity.com>

At 03:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > At 10:57 AM 10/4/2007 +0200, Manlio Perillo wrote:
> >> Phillip J. Eby wrote:
> >> > [...]
> >> >> There is a problem here: a WSGI gateway is not allowed to send headers
> >> >> until the app_iter yields a non-empty string or the iterator is
> >> >> exhausted.
> >> >
> >> > Argh.  You're right.  I forgot about that bit.  It has been a few too
> >> > many years since I worked on the spec.  :)
> >> >
> >>
> >> 07-Dec-2003!
> >> And yet it seems that WSGI is not pervasively used.
> >
> > What do you mean?  Can you name a popular Python web framework or
> > library that doesn't either use or support WSGI?
> >
>
>Django, as an example, uses WSGI "only as a backend".

That's still WSGI *support*.


>Django's design is not based on WSGI; it is WSGI that is adapted for Django.

Yep - which is why we need WSGI 2.  WSGI 1 achieved all its goals 
*except* for being easy to write middleware and build frameworks on 
it.  It should be easier to use WSGI than to not use it.


> > That won't be possible in WSGI 2.0 - it's a purely synchronous API.
>
>This is the reason why I don't like WSGI 2.0 :).
>
>However I have to admit that developing a full asynchronous application
>is not easy, notably when we have to interact with a database and a
>transaction.

Right - in practice, there is not enough of a common async API for 
Python to make it practical to implement asynchronousness in WSGI 
itself.  At least, in the last three years nobody has made a 
practical proposal for it.  In practice, if you want to write a 
fully-async web app you must use Twisted or a similar framework and 
commit to using its API.  You can of course still use WSGI 
components, but your application will not be able to run on a server 
that doesn't provide your async framework's API.


>Is it really so hard to implement WSGI 1.0 and to write middleware for it?

Absolutely.  Most of the time I see someone post example middleware 
code, it is not WSGI compliant in some fashion.


>I think that WSGI 2.0 should simply correct some problems in WSGI 1.0,

The single biggest problem in WSGI 1.0 is start_response() and 
write().  They were hacks to support legacy applications and frameworks.


>It's very unfortunate that the WSGI implementation in Twisted just uses
>threads instead of doing some experimentation.

You're making the assumption that no experimentation was done.  Check 
the Web-SIG archives from three years ago and see the discussions 
about async APIs.


From pje at telecommunity.com  Thu Oct  4 16:44:15 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 10:44:15 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <4704F45F.8020301@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<4704F45F.8020301@libero.it>
Message-ID: <20071004144136.DC6AD3A407B@sparrow.telecommunity.com>

At 04:10 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Graham Dumpleton wrote:
> > [...]
> >> The idea was to allow an application to change its mind about the
> >> headers until it had committed to writing data.  That is, to allow
> >> the application to do error handling for as long as possible before
> >> the server has to do it.
> >
> > But once you have called start_response() you can't call it a second
> > time to change the values so how could the application change its
> > mind?
>
>In my implementation of WSGI for nginx, start_response sets up the
>headers on the request object, but ngx_http_send_header is called only
>when the first non-empty string is yielded.
>
>This means that if an error occurs, the "old" headers are kept in the
>response (and sent to the client); nginx will simply change the status
>code to '500 INTERNAL ERROR'.

It's not clear to me from this statement whether you're supporting 
the exc_info argument as described here:

http://www.python.org/dev/peps/pep-0333/#the-start-response-callable

and here:

http://www.python.org/dev/peps/pep-0333/#error-handling
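
The rule those two sections describe can be sketched as follows. The Gateway
class and report_error function are hypothetical names used only for this
illustration; the logic follows the example server in PEP 333, translated to
current Python syntax. A second start_response() call is accepted only when
exc_info is supplied, and it replaces the headers only if they have not been
sent yet:

```python
import sys


class Gateway:
    """Tracks response state for one request (illustrative only)."""

    def __init__(self):
        self.headers_set = None
        self.headers_sent = False

    def start_response(self, status, headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to change the response: re-raise the error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a dangling circular reference
        elif self.headers_set is not None:
            raise AssertionError("Headers already set!")
        self.headers_set = (status, headers)
        return lambda data: None  # stand-in for write()


def report_error(gateway):
    """Application-side handler that replaces the response if it can."""
    try:
        raise RuntimeError("application failure")
    except RuntimeError:
        gateway.start_response(
            "500 Internal Server Error",
            [("Content-Type", "text/plain")],
            sys.exc_info(),
        )
```

If the gateway has not sent anything, report_error() swaps in the 500
response; once headers_sent is true, the original exception propagates to
the server instead.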


From manlio_perillo at libero.it  Thu Oct  4 16:48:18 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 16:48:18 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
Message-ID: <4704FD32.9020604@libero.it>

Phillip J. Eby wrote:
> [...]
> 
> WSGI 2.0 does not have a start_response() callable in the first place, 
> so none of these apply.
> 

I thought that the current WSGI 2.0 draft was only, indeed, a draft.
 >
 > [...]
>> 4) the environ dictionary should have a new WSGI-defined variable:
>>     wsgi.asynchronous.
>>     This value should evaluate to true when the server is asynchronous,
>>     that is, the WSGI application is executed in the main process loop
>>     of the server and the WSGI application can be paused after it yields
>>     some data.
> 
> It's always the case that a WSGI application can be paused after it 
> yields data, even in WSGI 1.0.

I was not aware of this.
The fact that a new "handler" can be started "interleaved" with the
previous ones may cause problems for an unaware WSGI application.


Regards  Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 17:00:35 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 17:00:35 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004143521.58AE53A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
Message-ID: <47050013.3070009@libero.it>

Phillip J. Eby wrote:
> [...]
>
>> However I have to admit that developing a full asynchronous application
>> is not easy, notably when we have to interact with a database and a
>> transaction.
> 
> Right - in practice, there is not enough of a common async API for 
> Python to make it practical to implement asynchronousness in WSGI 
> itself.  At least, in the last three years nobody has made a practical 
> proposal for it.  In practice, if you want to write a fully-async web 
> app you must use Twisted or a similar framework and commit to using its 
> API.  

I want to add asynchronous API support to nginx mod_wsgi because I 
*want* to use a more agile web server for my applications, using Twisted 
only when I need an enterprise environment!

> You can of course still use WSGI components, but your application 
> will not be able to run on a server that doesn't provide your async 
> framework's API.
> 

That's not a problem.
Asynchronous support will be available in nginx mod_wsgi and in Twisted
(if I find the time to write an alternative implementation of its WSGI
support, but this is not a priority for me).


> 
>> Is it really so hard to implement WSGI 1.0 and to write middleware
>> for it?
> 
> Absolutely.  Most of the time I see someone post example middleware 
> code, it is not WSGI compliant in some fashion.
> 

You are making a critical decision here.
You are lowering the level of WSGI to match the level of average WSGI
middleware programmers.

This can have disastrous consequences if Python gains a large user
base in the future (and, of course, with a large user base, the majority
of the users will have a low profile).

> 
>> It's very unfortunate that the WSGI implementation in Twisted just uses
>> threads instead of doing some experimentation.
> 
> You're making the assumption that no experimentation was done.  Check 
> the Web-SIG archives from three years ago and see the discussions about 
> async APIs.

No.
I have read a lot of archived messages, and all I have seen are 
*discussions* about asynchronous extensions, but no working implementations.


Regards   Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 17:02:29 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 17:02:29 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004144136.DC6AD3A407B@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<4704F45F.8020301@libero.it>
	<20071004144136.DC6AD3A407B@sparrow.telecommunity.com>
Message-ID: <47050085.4070502@libero.it>

Phillip J. Eby wrote:
> At 04:10 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> Graham Dumpleton wrote:
>> > [...]
>> >> The idea was to allow an application to change its mind about the
>> >> headers until it had committed to writing data.  That is, to allow
>> >> the application to do error handling for as long as possible before
>> >> the server has to do it.
>> >
>> > But once you have called start_response() you can't call it a second
>> > time to change the values so how could the application change its
>> > mind?
>>
>> In my implementation of WSGI for nginx, start_response sets up the
>> headers on the request object, but ngx_http_send_header is called only
>> when the first non-empty string is yielded.
>>
>> This means that if an error occurs, the "old" headers are kept in the
>> response (and sent to the client); nginx will simply change the status
>> code to '500 INTERNAL ERROR'.
> 
> It's not clear to me from this statement whether you're supporting the 
> exc_info argument as described here:
> 

No, since the current nginx mod_wsgi implementation, as I have already 
written, only supports the WSGI 2.0 draft.


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 17:40:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 11:40:08 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4704FD32.9020604@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
Message-ID: <20071004153734.1DFA33A407A@sparrow.telecommunity.com>

At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > [...]
> >
> > WSGI 2.0 does not have a start_response() callable in the first place,
> > so none of these apply.
> >
>
>I thought that the current WSGI 2.0 draft was only, indeed, a draft.

That's correct.  But eliminating start_response() and write() is 
really the main point of *having* a WSGI 2.0.


> > It's always the case that a WSGI application can be paused after it
> > yields data, even in WSGI 1.0.
>
>I was not aware of this.
>The fact that a new "handler" can be started "interleaved" with the
>previous ones may cause problems for an unaware WSGI application.

It may... but the only applications that should be yielding anything 
are ones that are sending large files, doing server push, or 
explicitly *desire* to be interleaved in such fashion.

If your app isn't in one of those categories, you should just be 
yielding a single string to begin with.
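
The two kinds of application distinguished above might look like this. Both
functions are illustrative sketches, not code from any of the servers
discussed in this thread, and they use byte strings as current Python
versions require:

```python
def simple_app(environ, start_response):
    """Ordinary application: the whole body as one string."""
    body = b"Hello, world!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def streaming_app(environ, start_response):
    """Server-push style application: each yield is a point where the
    server may pause this request and run another one."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    for number in range(3):
        yield ("event %d\n" % number).encode("ascii")
```

Only streaming_app gives the server an opportunity to interleave other
requests between blocks; simple_app hands over its single block and is done.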


From pje at telecommunity.com  Thu Oct  4 17:55:08 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 11:55:08 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <47050013.3070009@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
	<47050013.3070009@libero.it>
Message-ID: <20071004155229.3D2303A407B@sparrow.telecommunity.com>

At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote:
>You are making a critical decision here.
>You are lowering the level of WSGI to match the level of average WSGI
>middleware programmers.

No, we're just getting rid of legacy cruft that's hard to support 
correctly.  There's a big difference.


>This can have disastrous consequences if Python gains a large user
>base in the future (and, of course, with a large user base, the majority
>of the users will have a low profile).

This seems to be arguing the opposite: making WSGI simpler is a 
*good* thing if there will be a larger user base.


> >> It's very unfortunate that the WSGI implementation in Twisted just uses
> >> threads instead of doing some experimentation.
> >
> > You're making the assumption that no experimentation was done.  Check
> > the Web-SIG archives from three years ago and see the discussions about
> > async APIs.
>
>No.
>I have read a lot of archived messages, and all I have seen are
>*discussions* about asynchronous extensions, but no working implementations.

Because nobody came up with anything particularly useful.  While it's 
possible to have generic extensions for pausing and resuming 
iteration, those aren't useful enough to write a fully asynchronous 
application.  You still have to block and/or poll in order to do 
anything else.  Meanwhile, since applications *can* block, they have 
to be in a separate thread or process from an async server 
anyway.  So all that asynchrony does is free up the thread or process 
to handle something else...  which is wasted if the app is not in an 
async server.

So, barring a radical alteration to the WSGI programming model, 
asynchronous programming is a bit of a dead-end.  To do async right, 
you really need a CPS (continuation-passing style) API, *and* you 
also need async APIs for whatever the app is going to *do*.

In other words, the absence of standard Python APIs for asynchronous 
I/O (be it socket, database, or otherwise) make it moot to add an 
async API to WSGI, since in practice the application will be 
locked-in to whatever async I/O API it uses.


From manlio_perillo at libero.it  Thu Oct  4 17:54:47 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 17:54:47 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004153734.1DFA33A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
Message-ID: <47050CC7.9030500@libero.it>

Phillip J. Eby wrote:
> At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby wrote:
>> > [...]
>> >
>> > WSGI 2.0 does not have a start_response() callable in the first place,
>> > so none of these apply.
>> >
>>
>> I thought that the current WSGI 2.0 draft was only, indeed, a draft.
> 
> That's correct.  But eliminating start_response() and write() is really 
> the main point of *having* a WSGI 2.0.
> 

For me, what needs to be eliminated is write() and the exc_info argument
of start_response.

> 
>> > It's always the case that a WSGI application can be paused after it
>> > yields data, even in WSGI 1.0.
>>
>> I was not aware of this.
>> The fact that a new "handler" can be started "interleaved" with the
>> previous ones may cause problems for an unaware WSGI application.
> 
> It may... but the only applications that should be yielding anything are 
> ones that are sending large files, doing server push, or explicitly 
> *desire* to be interleaved in such fashion.
> 

But they have no way to know if the server supports this, and existing
WSGI implementations do not interleave the iteration, as far as I know.


> If your app isn't in one of those categories, you should just be 
> yielding a single string to begin with.


Regards  Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 18:07:12 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 18:07:12 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004155229.3D2303A407B@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
	<47050013.3070009@libero.it>
	<20071004155229.3D2303A407B@sparrow.telecommunity.com>
Message-ID: <47050FB0.6030202@libero.it>

Phillip J. Eby wrote:
> [...]
>> I have read a lot of archived messages, and all I have seen are
>> *discussions* about asynchronous extensions, but no working 
>> implementations.
> 
> Because nobody came up with anything particularly useful.  While it's 
> possible to have generic extensions for pausing and resuming iteration, 
> those aren't useful enough to write a fully asynchronous application.  
> You still have to block and/or poll in order to do anything else.  
> Meanwhile, since applications *can* block, they have to be in a separate 
> thread or process from an async server anyway.  So all that asynchrony 
> does is free up the thread or process to handle something else...  which 
> is wasted if the app is not in an async server.
> 

For nginx mod_wsgi I'm planning to add support for blocking
applications, executing them in a thread (*but* there will be only one
thread per process, and the entire result will be buffered).

Threaded execution will be disabled by default, and can be enabled using
an option.

To add support for asynchronous WSGI applications, I will try to implement
the pause_output extension and, more importantly, I will expose the nginx
event API to the WSGI application by writing an extension module.

The API will be low level, but once it is implemented, it
should be possible to build a common and standardized API on top of it
that works with both nginx mod_wsgi and Twisted.

 > [...]


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 18:20:50 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 12:20:50 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47050CC7.9030500@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
Message-ID: <20071004161810.060183A407A@sparrow.telecommunity.com>

At 05:54 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote:
> >> Phillip J. Eby wrote:
> >> > It's always the case that a WSGI application can be paused after it
> >> > yields data, even in WSGI 1.0.
> >>
> >> I was not aware of this.
> >> The fact that a new "handler" can be started "interleaved" with the
> >> previous ones may cause problems for an unaware WSGI application.
> >
> > It may... but the only applications that should be yielding anything are
> > ones that are sending large files, doing server push, or explicitly
> > *desire* to be interleaved in such fashion.
> >
>
>But they have no way to know if the server supports this,

If it's a WSGI-compliant server, it supports this by 
definition.  It's just that synchronous servers don't pause before 
requesting the next iteration.


>  and existing
>WSGI implementations do not interleave the iteration, as far as I know.

Nothing in the spec stops them from doing so - indeed, they're 
*encouraged* to do so:

http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries

"""This requirement ensures that asynchronous applications and 
servers can conspire to reduce the number of threads that are 
required to run a given number of application instances simultaneously."""

Notice that the only way this sentence works is if you are 
interleaving applications.

That being said, the PEP really needs an explicit discussion of the 
execution model.


From pje at telecommunity.com  Thu Oct  4 18:28:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 12:28:56 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <47050FB0.6030202@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
	<47050013.3070009@libero.it>
	<20071004155229.3D2303A407B@sparrow.telecommunity.com>
	<47050FB0.6030202@libero.it>
Message-ID: <20071004162618.433C83A407A@sparrow.telecommunity.com>

At 06:07 PM 10/4/2007 +0200, Manlio Perillo wrote:
>For nginx mod_wsgi I'm planning to add support for blocking
>applications, executing them in a thread (*but* there will be only one
>thread per process, and the entire result will be buffered).
>
>Threaded execution will be disabled by default, and can be enabled using
>an option.
>
>To add support for asynchronous WSGI applications, I will try to implement
>the pause_output extension and, more importantly, I will expose the nginx
>event API to the WSGI application by writing an extension module.
>
>The API will be low level, but once it is implemented, it
>should be possible to build a common and standardized API on top of it
>that works with both nginx mod_wsgi and Twisted.

Will this API support async database connections?  Async HTTP client 
operations?  If not, then all it would be good for is waiting for the 
HTTP input stream.  And if so, then what's the point?


From chrism at plope.com  Thu Oct  4 17:55:44 2007
From: chrism at plope.com (Chris McDonough)
Date: Thu, 4 Oct 2007 11:55:44 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004155229.3D2303A407B@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
	<47050013.3070009@libero.it>
	<20071004155229.3D2303A407B@sparrow.telecommunity.com>
Message-ID: <15F39F0E-5D35-4D0C-AD2A-2B7AAEA35A98@plope.com>


On Oct 4, 2007, at 11:55 AM, Phillip J. Eby wrote:

> At 05:00 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> You are making a critical decision here.
>> You are lowering the level of WSGI to match the level of average WSGI
>> middleware programmers.
>
> No, we're just getting rid of legacy cruft that's hard to support
> correctly.  There's a big difference.

Getting the start_response dance down and understanding how it plays  
with middleware is *hard*.  Even if we called it something other than  
WSGI 2.0 (which I don't think we should, because it really is an  
evolution), returning the three-tuple is the right thing to do.

- C


From manlio_perillo at libero.it  Thu Oct  4 18:37:04 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 18:37:04 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004161810.060183A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
Message-ID: <470516B0.9010605@libero.it>

Phillip J. Eby wrote:
> [...]
>>  and existing
>> WSGI implementations do not interleave the iteration, as far as I know.
> 
> Nothing in the spec stops them from doing so - indeed, they're 
> *encouraged* to do so:
> 
> http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries 
> 
> 
> """This requirement ensures that asynchronous applications and servers 
> can conspire to reduce the number of threads that are required to run a 
> given number of application instances simultaneously."""
> 
> Notice that the only way this sentence works is if you are interleaving 
> applications.
> 

What "normal" HTTP servers do is "pause" the iteration until the
entire buffer has been sent to the client.

They can do this, since each request runs in a dedicated thread or process.

With nginx this is not true.
nginx mod_wsgi will pause the iteration associated with a given request,
but will start a new iteration as soon as a new request arrives, and
this in the *same* thread.

To give an example (not tested), suppose that a WSGI application keeps a
global counter (as thread-specific data).

When a new request arrives, the counter is reset to 0, and its value is
incremented on every iteration.

With all the existing WSGI implementations (as far as I know), we always
know the current value of the counter: it will start at 0, reach the
number of iterations, and then start at 0 again.

With nginx mod_wsgi this is not true: when a WSGI application has set the
counter value to, say, 6, and a new request arrives, the application
associated with the previous request will see the value 0, not 6,
when it is unpaused.
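
The scenario above can be reproduced without nginx. In this sketch,
interleave is a hypothetical stand-in for a single-threaded server that
advances each pending request by one step in turn; the module-level counter
plays the role of the global (or thread-specific) state:

```python
counter = 0  # shared by every request handled in this thread


def application(environ, start_response):
    global counter
    start_response("200 OK", [("Content-Type", "text/plain")])
    counter = 0                      # reset when a new request starts
    for _ in range(3):
        counter += 1                 # incremented on every iteration
        yield ("%s:%d\n" % (environ["PATH_INFO"], counter)).encode("ascii")


def interleave(iterables):
    """Advance each iterable one step at a time, round-robin."""
    pending = [iter(iterable) for iterable in iterables]
    output = []
    while pending:
        for iterator in list(pending):
            try:
                output.append(next(iterator))
            except StopIteration:
                pending.remove(iterator)
    return output


def ignore_start_response(status, headers, exc_info=None):
    return None


chunks = interleave([
    application({"PATH_INFO": "/a"}, ignore_start_response),
    application({"PATH_INFO": "/b"}, ignore_start_response),
])
```

Run one after the other, each request would report 1, 2, 3. Interleaved,
request /a reports 1, 2, 4 and request /b reports 1, 3, 5, because each one
sees the increments (and the reset) made by the other.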

> That being said, the PEP really needs an explicit discussion of the 
> execution model.


Regards   Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 18:56:12 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 12:56:12 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470516B0.9010605@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
Message-ID: <20071004165333.4D5353A407A@sparrow.telecommunity.com>

At 06:37 PM 10/4/2007 +0200, Manlio Perillo wrote:
>To give an example (not tested), suppose that a WSGI application keeps a
>global counter (as thread-specific data).
>
>When a new request arrives, the counter is reset to 0, and its value is
>incremented on every iteration.
>
>With all the existing WSGI implementations (as far as I know), we always
>know the current value of the counter: it will start at 0, reach the
>number of iterations, and then start at 0 again.

So?  An application that does this is obviously broken.  Again, 
remember that the WSGI spec encourages interleaving, so any 
multi-threaded server is well within its rights to do the same thing.

There is nothing in WSGI that says multiple simultaneous requests 
cannot be run in the same thread.  Therefore, nothing is guaranteed 
about what happens to global or thread-local resources while the 
application (or its returned iterable) is not actually executing.
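
One way for an application to stay safe under that rule is to keep
per-request state in the locals of its own iterable. The following is a
sketch only; ignore_start_response is a hypothetical stand-in for a real
server, and zip() is used merely to advance two requests alternately:

```python
def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    counter = 0                      # per-request state, local to this generator
    for _ in range(3):
        counter += 1
        yield ("%s:%d\n" % (environ["PATH_INFO"], counter)).encode("ascii")


def ignore_start_response(status, headers, exc_info=None):
    return None


first = application({"PATH_INFO": "/a"}, ignore_start_response)
second = application({"PATH_INFO": "/b"}, ignore_start_response)

# zip() advances both iterables alternately, like an interleaving server.
output = [chunk for pair in zip(first, second) for chunk in pair]
```

Each request counts 1, 2, 3 regardless of how the server schedules the
iterations, since nothing outside the generator is modified.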


From manlio_perillo at libero.it  Thu Oct  4 18:55:45 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 18:55:45 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004162618.433C83A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<4704AAE4.1010708@libero.it>
	<20071004115207.65C463A407A@sparrow.telecommunity.com>
	<4704F040.10105@libero.it>
	<20071004143521.58AE53A407A@sparrow.telecommunity.com>
	<47050013.3070009@libero.it>
	<20071004155229.3D2303A407B@sparrow.telecommunity.com>
	<47050FB0.6030202@libero.it>
	<20071004162618.433C83A407A@sparrow.telecommunity.com>
Message-ID: <47051B11.2020204@libero.it>

Phillip J. Eby ha scritto:
> At 06:07 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> For nginx mod_wsgi I'm planning to add support for blocking
>> applications, executing them in a thread (*but* there will be only one
>> thread per process, and the entire result will be buffered).
>>
>> Threaded execution will be disabled by default, and can be enabled using
>> an option.
>>
>> To add support for asynchronous WSGI applications, I will try to implement
>> the pause_output extension and, more importantly, I will expose the nginx
>> event API to the WSGI application by writing an extension module.
>>
>> The API will be low level, but once it is implemented, it
>> should be possible to build a common and standardized API on top of it
>> that works with both nginx mod_wsgi and Twisted.
> 
> Will this API support async database connections?  

No, not directly.
But async database connections can be implemented on top of this API.

Using this API we can, as an example, use the asynchronous API already 
implemented by psycopg2 (but not tested, since no one seems to be 
interested):


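import psycopg2
import ngx_reactor  # the planned nginx event API extension module


def handler(event):
    # Called when the connection's file descriptor becomes readable;
    # resume the paused WSGI output once the query result is ready.
    if curs.isready():
        resume()

conn = psycopg2.connect(database='test')
curs = conn.cursor()
fileno = curs.fileno()

event = ngx_reactor.create_event(fileno, handler, ...)
ngx_reactor.add_event(event, NGX_READ_EVENT)

# The following runs inside the WSGI application generator:
resume = environ['wsgi.pause_output']()
curs.execute("SELECT * from sleep(%s, 1)", (delay,), async=1)
yield ''

# Now we have the full response, and we can proceed as in a synchronous 
# application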


The real problem here is that we cannot execute new queries 
until the current query terminates, so we need to implement a query queue.

Another big problem arises when we want to use a transaction, since we need 
to execute more than one query.


> Async HTTP client 
> operations?  

Again, this will be a low-level API.

However I think that it should be possible to write an "emulation" of a 
Twisted reactor, so we can use the protocols implemented in Twisted (but 
this is a *big* challenge, and I'm not really interested, since if I 
need to use Twisted protocols, then I will use Twisted Web).

> If not, then all it would be good for is waiting for the 
> HTTP input stream.  

The current implementation of nginx mod_wsgi already waits until the 
full request body has been read by Nginx (and the input stream object is 
an instance of cStringIO or File object, depending on the size of the 
request body and the value of the client_body_buffer_size option).

Nginx does not yet implement input filters.

> And if so, then what's the point?


Regards  Manlio Perillo

From manlio_perillo at libero.it  Thu Oct  4 18:58:50 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 18:58:50 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004165333.4D5353A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
Message-ID: <47051BCA.7090709@libero.it>

Phillip J. Eby ha scritto:
> At 06:37 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> To give an example (not tested), suppose that a WSGI application keeps a
>> global counter (as thread-specific data).
>>
>> When a new request arrives, the counter is reset to 0, and its value is
>> incremented for every iteration.
>>
>> With all the existing WSGI implementations (as far as I know), we always
>> know the current value of the counter: it will start at 0, reach the
>> number of iterations, and then will start at 0 again.
> 
> So?  An application that does this is obviously broken.  Again, remember 
> that the WSGI spec encourages interleaving, so any multi-threaded server 
> is well within its  rights to do the same thing.
> 
> There is nothing in WSGI that says multiple simultaneous requests cannot 
> be run in the same thread.  Therefore, nothing is guaranteed about what 
> happens to global or thread-local resources while the application (or 
> its returned iterable) is not actually executing.

Ok.
But why are you against adding a new environ value (not necessarily named 
wsgi.asynchronous) that will explicitly state whether the WSGI server will 
interleave the WSGI application?


Regards  Manlio Perillo

From pje at telecommunity.com  Thu Oct  4 19:47:52 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 13:47:52 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47051BCA.7090709@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
Message-ID: <20071004174513.A4F0F3A407A@sparrow.telecommunity.com>

At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
>But why are you against adding a new environ value (not necessarily named
>wsgi.asynchronous) that will explicitly state whether the WSGI server will
>interleave the WSGI application?

Why do you think it's useful?


From manlio_perillo at libero.it  Thu Oct  4 19:53:45 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Thu, 04 Oct 2007 19:53:45 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
Message-ID: <470528A9.3050108@libero.it>

Phillip J. Eby ha scritto:
> At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> But why are you against adding a new environ value (not necessarily named
>> wsgi.asynchronous) that will explicitly state whether the WSGI server will
>> interleave the WSGI application?
> 
> Why do you think it's useful?

For the same reason you think wsgi.multiprocess is useful.
It's informative; maybe it is not really useful, but it 
describes how the WSGI server works.


Regards  Manlio Perillo

From MDiPierro at cti.depaul.edu  Thu Oct  4 20:29:01 2007
From: MDiPierro at cti.depaul.edu (DiPierro, Massimo)
Date: Thu, 4 Oct 2007 13:29:01 -0500
Subject: [Web-SIG] NOOO! Another web framework
Message-ID: <C1AA20CAC1E41647BFF1D41215EDD475179ED56A25@wagner.cti.depaul.edu>

hello everybody...

please do not shoot me! I know you don't think you need a new web framework but please give me the benefit of the doubt (I teach a class on Web Frameworks at DePaul University):

http://mdp.cti.depaul.edu/examples

Why?
here are some unique features:
1) full web based development, deployment and management of applications, no more shell commands (unless you want them)
2) built-in ticketing system to report bugs to administrator (not to the users, ever)
3) can compile applications to byte-code for speed and distribution in closed source (some people want this)
4) 100% python (including template language).
5) no installation or configuration required. Just download and click. (includes python, web server, sqlite3, administrative interface and examples)
6) everything has a default: you write the model, you get an administrative interface; you write a controller, you get a generic view; etc.
7) The APIs are stable and there are no plans to change them.

It shares with Django and Turbogears some features: model-view-controller design, form generators and validation, internationalization, ORM, although all code has been written from scratch.

Here is an example application, a CMS to manage groups (members, wikis, blogs, votes, minutes, documents):

https://mdp.cti.depaul.edu/groups

Massimo


From graham.dumpleton at gmail.com  Fri Oct  5 01:03:40 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 5 Oct 2007 09:03:40 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004130818.BFCE83A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com>

On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> >But once you have called start_response() you can't call it a second
> >time to change the values
>
> You can, as long as you pass in the exception info -- because an
> exception is the only legitimate reason to change the values.

Okay, I forgot about that case. Luckily my code appears, in the main, to
do the correct thing, although I'll need to check a few corner cases.
It looks like the traceback I log when start_response() is called with
an exception after data has been written is a bit wrong, as it doesn't
identify the original exception type correctly. This may just be an
issue with how I log exception details from the C API. Yielding empty
strings prior to calling start_response() with exception details also
gives me strife, as it appears not to return any response to the client
at all, i.e., no headers or body. So, a little bit of tweaking to do.
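
For reference, a minimal sketch of the exc_info case being discussed,
assuming a toy server-side start_response (built by the hypothetical helper
make_start_response) that tracks its state in a plain dict. It follows the
behaviour described in PEP 333: a second call is only allowed with
exc_info, and it replaces the status and headers only if they have not
been sent yet; otherwise the original exception is re-raised.

```python
import sys


def make_start_response(state):
    """Build a toy start_response that records its calls in `state`."""
    def start_response(status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if state['headers_sent']:
                    # Too late to change the response: re-raise the app's error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a traceback reference cycle
        elif state['status'] is not None:
            raise AssertionError('start_response() already called')
        state['status'] = status
        state['headers'] = headers
        return lambda data: None  # placeholder for the write() callable
    return start_response


def app(environ, start_response):
    """Hypothetical app that fails after its first start_response() call."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    try:
        raise ValueError('boom')
    except ValueError:
        start_response('500 Internal Server Error',
                       [('Content-Type', 'text/plain')],
                       sys.exc_info())
        return [b'error']


state = {'status': None, 'headers': None, 'headers_sent': False}
body = app({}, make_start_response(state))
```

Here no data has been written yet, so the second call simply swaps the 200
for the 500.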

> > > Only problem is that the PEP examples and wsgiref aren't written to
> > > support doing it that way, so I don't think we can reasonably change
> > > it in WSGI 1.0, and in 2.0 it won't even matter.
> >
> >Huh, change what in WSGI 1.0. As you seem to note the CGI example in
> >the PEP does flush headers even if first data block was an empty
> >string
>
> Actually, the PEP example skips empty strings yielded by the
> app_iter.  wsgiref.handlers, OTOH, doesn't do this, now that I've checked it.

True again. I was only looking at the internals of write() and so
missed that iteration would eliminate empty strings.

> Yep, but another argument in favor of getting rid of as much
> statefulness from the protocol as we can.  Making the status and
> headers part of the return value entirely eliminates the question of
> when they're going to get written, and whether they can be changed.
>
> (As a side benefit, making the return a 3-tuple makes it impossible
> to write a WSGI app using a single generator -- thereby discouraging
> people from using 'yield' like it was a CGI "print".)

Too early for me to be thinking straight and work it out for myself,
but does this all help in making it simpler or more obvious what the
cleanup requirements are for a generator? I.e., correct use of
try/except/finally around yield and the purpose of the close() function.
I've seen a number of people not get this correct and have tried to
correct them. Hopefully I have captured what should be done correctly
in my document:

  http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode

If I haven't please let me know. :-)

Graham

From pje at telecommunity.com  Fri Oct  5 02:22:24 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 20:22:24 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com>
Message-ID: <20071005001945.39CD13A407A@sparrow.telecommunity.com>

At 09:03 AM 10/5/2007 +1000, Graham Dumpleton wrote:
>Too early for me to be thinking straight and work it out for myself,
>but does this all help in making it simpler or more obvious what the
>cleanup requirements are for a generator? I.e., correct use of
>try/except/finally around yield and the purpose of the close() function.
>I've seen a number of people not get this correct and have tried to
>correct them. Hopefully I have captured what should be done correctly
>in my document:
>
>   http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode

That's fine, and none of it would change for WSGI 2.0, except minor 
details of what wraps what.

Note, by the way, that as of Python 2.5, a generator can have 
try/finally and its close() method will be called when it finishes or 
is garbage collected.  So an app_iter implemented as a generator 
under 2.5 can just use with: or try/finally to handle cleanup -- and 
that applies equally to WSGI 1 and 2.
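
A minimal sketch of that pattern, assuming a hypothetical app_iter
generator and a stand-in for the server (serve_partially) that stops
reading after the first chunk, as it might when the client disconnects:

```python
cleanup_log = []  # stands in for releasing a connection, lock, file, etc.


def app_iter():
    """Response body generator with cleanup in a finally block."""
    try:
        yield b'first chunk'
        yield b'second chunk'
    finally:
        cleanup_log.append('closed')


def serve_partially():
    """Toy server: read one chunk, then abandon the response."""
    body = app_iter()
    first = next(body)
    body.close()  # a WSGI server must call close() on the iterable if present
    return first
```

Calling close() raises GeneratorExit at the paused yield, so the finally
block runs even though the body was never fully consumed.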


From pje at telecommunity.com  Fri Oct  5 02:27:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Thu, 04 Oct 2007 20:27:02 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470528A9.3050108@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
Message-ID: <20071005002423.320413A407A@sparrow.telecommunity.com>

At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
> >> But why are you against adding a new environ value (not necessarily named
> >> wsgi.asynchronous) that will explicitly state whether the WSGI server will
> >> interleave the WSGI application?
> >
> > Why do you think it's useful?
>
>For the same reason you think wsgi.multiprocess is useful.

Actually, I don't think it's all that useful; IIRC, it was added as a 
compromise to the spec, to fend off a proposal for a more complex 
server-capabilities API.  :)

Also, there's an important difference between your proposed addition 
and the multiprocess/multithread flags, which is that there existed 
frameworks that could be ported to WSGI that only supported one model 
or the other.  I.e., frameworks that could only run multi-threaded, 
or only multi-process.

In other words, those flags were to support legacy frameworks 
detecting that they were in an incompatible hosting 
environment.  However, IIUC, there is no such existing framework that 
could meaningfully use the flag you're proposing, that has any real 
chance of being portable to different WSGI environments.


From graham.dumpleton at gmail.com  Fri Oct  5 02:31:13 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 5 Oct 2007 10:31:13 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071005001945.39CD13A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<88e286470710041603p309e8313pe0279342088894bf@mail.gmail.com>
	<20071005001945.39CD13A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710041731n2d0c115fr145f27adf2a55000@mail.gmail.com>

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 09:03 AM 10/5/2007 +1000, Graham Dumpleton wrote:
> >Too early for me to be thinking straight and work it out for myself,
> >but does this all help in making it simpler or more obvious what the
> >cleanup requirements are for a generator? I.e., correct use of
> >try/except/finally around yield and the purpose of the close() function.
> >I've seen a number of people not get this correct and have tried to
> >correct them. Hopefully I have captured what should be done correctly
> >in my document:
> >
> >   http://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode
>
> That's fine, and none of it would change for WSGI 2.0, except minor
> details of what wraps what.
>
> Note, by the way, that as of Python 2.5, a generator can have
> try/finally and its close() method will be called when it finishes or
> is garbage collected.  So an app_iter implemented as a generator
> under 2.5 can just use with: or try/finally to handle cleanup -- and
> that applies equally to WSGI 1 and 2.

Yep, know about the Python 2.5 difference. Didn't want to talk about
it though so that people would just use the way that would also work
with older versions of Python.

BTW, have been thinking about doing it for a long time, but truly
wasn't sure that WSGI 2.0 would ever actually happen, but now that
discussion is happening again I will add to Apache mod_wsgi a
directive WSGIProtocolVersion which would allow experimental 2.0
implementation to be switched on for specific applications. Having
this and perhaps other experimental implementations may help to flush
out any issues when we start discussing details, especially as Apache
imposes its own quirks that others tend not to have to deal with.
Adding this support in should be quite trivial. Once that is done and
the discussion about asynchronous implementations dies down, might
initiate discussions about some other issues such as wsgi.input, end
of input indicators and content length issues for streamed request
content and mutating input filters etc.

Graham

From graham.dumpleton at gmail.com  Fri Oct  5 02:41:09 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Fri, 5 Oct 2007 10:41:09 +1000
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005002423.320413A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com>

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
> >Phillip J. Eby ha scritto:
> > > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
> > >> But why are you against adding a new environ value (not necessarily named
> > >> wsgi.asynchronous) that will explicitly state whether the WSGI server will
> > >> interleave the WSGI application?
> > >
> > > Why do you think it's useful?
> >
> >For the same reason you think wsgi.multiprocess is useful.
>
> Actually, I don't think it's all that useful; IIRC, it was added as a
> compromise to the spec, to fend off a proposal for a more complex
> server-capabilities API.  :)
>
> Also, there's an important difference between your proposed addition
> and the multiprocess/multithread flags, which is that there existed
> frameworks that could be ported to WSGI that only supported one model
> or the other.  I.e., frameworks that could only run multi-threaded,
> or only multi-process.

FWIW, one example where the flags are useful is in WSGI components
such as browser-based debuggers like EvalException, which could
disable themselves or flag an error when used in a multiprocess web
server, where there would be no guarantee that a subsequent request
would end up back at the same process.
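
A rough sketch of what such a check might look like. The names
guard_debugger, fake_debugger and plain_app are hypothetical, and
fake_debugger only stands in for a real debugging middleware; this is not
how EvalException itself is implemented.

```python
def guard_debugger(app, debugger_factory):
    """Only wrap `app` in the debugger when the server is single-process."""
    debugged = debugger_factory(app)

    def wrapped(environ, start_response):
        if environ.get('wsgi.multiprocess', False):
            # A follow-up request may hit another process, so the
            # in-memory debugging state would be lost: disable ourselves.
            return app(environ, start_response)
        return debugged(environ, start_response)
    return wrapped


def plain_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'plain']


def fake_debugger(app):
    """Stand-in for an interactive debugger middleware."""
    def wrapped(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'debugger active']
    return wrapped
```

The alternative mentioned above, flagging an error, would return an error
response in the multiprocess branch instead of passing the request through.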

Graham

From ianb at colorstudy.com  Fri Oct  5 02:43:32 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 04 Oct 2007 20:43:32 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com>
References: <4704222D.30208@colorstudy.com>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<470516B0.9010605@libero.it>	<20071004165333.4D5353A407A@sparrow.telecommunity.com>	<47051BCA.7090709@libero.it>	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>	<470528A9.3050108@libero.it>	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<88e286470710041741k55bdf059p95f0229bfb36c262@mail.gmail.com>
Message-ID: <470588B4.3060108@colorstudy.com>

Graham Dumpleton wrote:
> On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
>> At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
>>> Phillip J. Eby ha scritto:
>>>> At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
>>>>> But why are you against adding a new environ value (not necessarily named
>>>>> wsgi.asynchronous) that will explicitly state whether the WSGI server will
>>>>> interleave the WSGI application?
>>>> Why do you think it's useful?
>>> For the same reason you think wsgi.multiprocess is useful.
>> Actually, I don't think it's all that useful; IIRC, it was added as a
>> compromise to the spec, to fend off a proposal for a more complex
>> server-capabilities API.  :)
>>
>> Also, there's an important difference between your proposed addition
>> and the multiprocess/multithread flags, which is that there existed
>> frameworks that could be ported to WSGI that only supported one model
>> or the other.  I.e., frameworks that could only run multi-threaded,
>> or only multi-process.
> 
> FWIW, one example where the flags are useful is in WSGI components
> such as browser-based debuggers like EvalException, which could
> disable themselves or flag an error when used in a multiprocess web
> server, where there would be no guarantee that a subsequent request
> would end up back at the same process.

Yeah, there's several things I pushed for in WSGI that I didn't really 
end up needing or wanting later.  But wsgi.multiprocess and 
wsgi.multithread have been useful; certainly enough to warrant their 
simplicity.

   Ian


From chris at simplistix.co.uk  Fri Oct  5 09:04:59 2007
From: chris at simplistix.co.uk (Chris Withers)
Date: Fri, 05 Oct 2007 08:04:59 +0100
Subject: [Web-SIG] NOOO! Another web framework
In-Reply-To: <C1AA20CAC1E41647BFF1D41215EDD475179ED56A25@wagner.cti.depaul.edu>
References: <C1AA20CAC1E41647BFF1D41215EDD475179ED56A25@wagner.cti.depaul.edu>
Message-ID: <4705E21B.4050902@simplistix.co.uk>

DiPierro, Massimo wrote:
> here are some unique features:
> 1) full web based development, deployment and management of applications, no more shell commands (unless you want them)

Good. Zope seems to have moved away from this, which is a shame...

> 2) built-in ticketing system to report bugs to administrator (not to the users, ever)

Nice :-)
(although the users do see some kind of page saying "sorry, something 
went wrong", right?)

> 3) can compile applications to byte-code for speed and distribution in closed source (some people want this)

You do know it takes about 2 minutes to turn a .pyc back into a .py, right?

> 5) no installation or configuration required. Just download and click. (includes python, web server, sqlite3, administrative interface and examples)

Cool, although you will need to cater for proper deployments if things 
go well...

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
            - http://www.simplistix.co.uk

From manlio_perillo at libero.it  Fri Oct  5 12:36:32 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 12:36:32 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071004130818.BFCE83A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
Message-ID: <470613B0.8000101@libero.it>

Phillip J. Eby ha scritto:
> [...]
> Yep, but another argument in favor of getting rid of as much 
> statefulness from the protocol as we can.  Making the status and headers 
> part of the return value entirely eliminates the question of when 
> they're going to get written, and whether they can be changed.
> 
> (As a side benefit, making the return a 3-tuple makes it impossible to 
> write a WSGI app using a single generator -- thereby discouraging people 
> from using 'yield' like it was a CGI "print".)
> 


Wait, what do you mean by """As a side benefit, making the return a 
3-tuple makes it impossible to write a WSGI app using a single generator"""?

And what do you mean by """getting rid of as much
statefulness from the protocol as we can"""?


Regards  Manlio Perillo

From manlio_perillo at libero.it  Fri Oct  5 12:41:14 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 12:41:14 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005002423.320413A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
Message-ID: <470614CA.8000300@libero.it>

Phillip J. Eby ha scritto:
> At 07:53 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > At 06:58 PM 10/4/2007 +0200, Manlio Perillo wrote:
>> >> But why are you against adding a new environ value (not necessarily named
>> >> wsgi.asynchronous) that will explicitly state whether the WSGI server will
>> >> interleave the WSGI application?
>> >
>> > Why do you think it's useful?
>>
>> For the same reason you think wsgi.multiprocess is useful.
> 
> Actually, I don't think it's all that useful; IIRC, it was added as a 
> compromise to the spec, to fend off a proposal for a more complex 
> server-capabilities API.  :)
> 

Ok.

> Also, there's an important difference between your proposed addition and 
> the multiprocess/multithread flags, which is that there existed 
> frameworks that could be ported to WSGI that only supported one model or 
> the other.  I.e., frameworks that could only run multi-threaded, or only 
> multi-process.
> 
> In other words, those flags were to support legacy frameworks detecting 
> that they were in an incompatible hosting environment.  However, IIUC, 
> there is no such existing framework that could meaningfully use the flag 
> you're proposing, that has any real chance of being portable to 
> different WSGI environments.


This is true, but I continue to think that it is worth adding that flag.
Asynchronous support is available in Nginx mod_wsgi, and in the future 
someone can implement a WSGI gateway for lighttpd.


Regards  Manlio Perillo

From pje at telecommunity.com  Fri Oct  5 16:33:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 10:33:39 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <470613B0.8000101@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<470613B0.8000101@libero.it>
Message-ID: <20071005143100.07AD63A407C@sparrow.telecommunity.com>

At 12:36 PM 10/5/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > [...]
> > Yep, but another argument in favor of getting rid of as much
> > statefulness from the protocol as we can.  Making the status and headers
> > part of the return value entirely eliminates the question of when
> > they're going to get written, and whether they can be changed.
> >
> > (As a side benefit, making the return a 3-tuple makes it impossible to
> > write a WSGI app using a single generator -- thereby discouraging people
> > from using 'yield' like it was a CGI "print".)
> >
>
>
>Wait, what do you mean by """As a side benefit, making the return a
>3-tuple makes it impossible to write a WSGI app using a single generator"""?

I mean that you can't write a WSGI 2.0 application using a single 
generator function, because it has to return a tuple, not an 
iterator.  This will discourage people from thinking "yield" is a 
good way to build up their output, instead of using a StringIO or 
''.join() on a list of strings.
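
As a sketch only (WSGI 2.0 is still a proposal, so the exact calling
convention assumed here, an application taking just environ and returning
a (status, headers, body) tuple, is an assumption), such an application
might look like this:

```python
def app(environ):
    """Hypothetical WSGI 2.0-style app: build the body, then return a tuple."""
    parts = []
    parts.append('Hello, ')
    parts.append(environ.get('REMOTE_USER', 'world'))
    body = ''.join(parts).encode('utf-8')
    headers = [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ]
    return '200 OK', headers, [body]
```

Because the function has to return the tuple, it cannot itself contain a
yield; the body is assembled first and handed over in one piece.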


>And what do you mean by """getting rid of as much
>statefulness from the protocol as we can"""?

Most of WSGI 1.0's complexity comes from the sequence of operations - 
when you call start_response(), whether you can call it again, 
whether iteration is in progress, etc.  WSGI 2.0 gives all the 
sequence control to the caller, so that there is no delicate dance of 
calls back and forth.  This especially simplifies middleware that 
manipulates the output stream, because it doesn't need to wrap 
start_response() and write().
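
To illustrate, here is a sketch of an output-transforming middleware under
the same assumed (and not finalized) WSGI 2.0 convention, where an
application takes environ and returns (status, headers, body). The names
hello_app and upper_middleware are hypothetical.

```python
def hello_app(environ):
    """Hypothetical WSGI 2.0-style application."""
    body = b'hello world'
    return '200 OK', [('Content-Length', str(len(body)))], [body]


def upper_middleware(app):
    """Upper-case the response body without wrapping any callbacks."""
    def wrapped(environ):
        status, headers, body = app(environ)
        # Upper-casing ASCII bytes keeps the length, so Content-Length stays valid.
        new_body = [chunk.upper() for chunk in body]
        return status, headers, new_body
    return wrapped
```

The WSGI 1.0 equivalent would have to intercept start_response(), wrap the
write() callable it returns, and wrap the returned iterable as well.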


From pje at telecommunity.com  Fri Oct  5 16:36:36 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 10:36:36 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470614CA.8000300@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<470614CA.8000300@libero.it>
Message-ID: <20071005143356.B8B7D3A407C@sparrow.telecommunity.com>

At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > In other words, those flags were to support legacy frameworks detecting
> > that they were in an incompatible hosting environment.  However, IIUC,
> > there is no such existing framework that could meaningfully use the flag
> > you're proposing, that has any real chance of being portable to
> > different WSGI environments.
>
>This is true, but I continue to think that it is worth adding that flag.
>Asynchronous support is available in Nginx mod_wsgi, and in the future
>someone can implement a WSGI gateway for lighttpd.

Right now, the flag is not sufficiently well defined for 
my taste.  You have only proposed that it be set to indicate that 
interleaved execution is possible -- but it is *always* possible to 
have interleaved execution in WSGI 1.0, so the only reason to add the 
flag to WSGI 2.0 would be so a server could promise NOT to interleave 
execution.  And what good is that?


From roberto at dealmeida.net  Fri Oct  5 16:57:43 2007
From: roberto at dealmeida.net (Rob De Almeida)
Date: Fri, 05 Oct 2007 11:57:43 -0300
Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush)
In-Reply-To: <20071005143100.07AD63A407C@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>	<20071003192817.3014C3A407A@sparrow.telecommunity.com>	<4703F2E1.9050402@libero.it>	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>	<20071004114441.C7B103A407A@sparrow.telecommunity.com>	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>	<470613B0.8000101@libero.it>
	<20071005143100.07AD63A407C@sparrow.telecommunity.com>
Message-ID: <470650E7.4050809@dealmeida.net>

Phillip J. Eby wrote:
> I mean that you can't write a WSGI 2.0 application using a single 
> generator function, because it has to return a tuple, not an 
> iterator.  This will discourage people from thinking "yield" is a 
> good way to build up their output, instead of using a StringIO or 
> ''.join() on a list of strings.

Could you explain why using 'yield' is not recommended? Just curious, 
because I use it all the time.

Thanks,
--Rob

From manlio_perillo at libero.it  Fri Oct  5 17:14:02 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 17:14:02 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<470614CA.8000300@libero.it>
	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
Message-ID: <470654BA.9050100@libero.it>

Phillip J. Eby ha scritto:
> At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > In other words, those flags were to support legacy frameworks detecting
>> > that they were in an incompatible hosting environment.  However, IIUC,
>> > there is no such existing framework that could meaningfully use the 
>> flag
>> > you're proposing, that has any real chance of being portable to
>> > different WSGI environments.
>>
>> This is true, but I continue to think that it is worth adding that flag.
>> Asynchronous support is available in Nginx mod_wsgi, and in the future
>> someone can implement a WSGI gateway for lighttpd.
> 
> Right now, the definition of the flag is not sufficiently defined for my 
> taste.  You have only proposed that it be set to indicate that 
> interleaved execution is possible -- but it is *always* possible to have 
> interleaved execution in WSGI 1.0, so the only reason to add the flag to 
> WSGI 2.0 would be so a server could promise NOT to interleave 
> execution.  And what good is that?
> 

OK, here is a more useful definition.

If wsgi.asynchronous evaluates to true, then the WSGI application *will* 
be executed into the server main process cycle and thus the application 
execution *will* be interleaved (since this is the only way to support 
multiple concurrent requests).


Regards  Manlio Perillo

From ianb at colorstudy.com  Fri Oct  5 17:16:10 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 05 Oct 2007 11:16:10 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470654BA.9050100@libero.it>
References: <4704222D.30208@colorstudy.com>
	<4704EEDA.1010800@libero.it>	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>	<4704FD32.9020604@libero.it>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<470516B0.9010605@libero.it>	<20071004165333.4D5353A407A@sparrow.telecommunity.com>	<47051BCA.7090709@libero.it>	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>	<470528A9.3050108@libero.it>	<20071005002423.320413A407A@sparrow.telecommunity.com>	<470614CA.8000300@libero.it>	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
	<470654BA.9050100@libero.it>
Message-ID: <4706553A.3080603@colorstudy.com>

Manlio Perillo wrote:
> Phillip J. Eby ha scritto:
>> At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote:
>>> Phillip J. Eby ha scritto:
>>>> In other words, those flags were to support legacy frameworks detecting
>>>> that they were in an incompatible hosting environment.  However, IIUC,
>>>> there is no such existing framework that could meaningfully use the 
>>> flag
>>>> you're proposing, that has any real chance of being portable to
>>>> different WSGI environments.
>>> This is true, but I continue to think that it is worth adding that flag.
>>> Asynchronous support is available in Nginx mod_wsgi, and in the future
>>> someone can implement a WSGI gateway for lighttpd.
>> Right now, the definition of the flag is not sufficiently defined for my 
>> taste.  You have only proposed that it be set to indicate that 
>> interleaved execution is possible -- but it is *always* possible to have 
>> interleaved execution in WSGI 1.0, so the only reason to add the flag to 
>> WSGI 2.0 would be so a server could promise NOT to interleave 
>> execution.  And what good is that?
>>
> 
> Ok, here is more useful definition.
> 
> If wsgi.asynchronous evaluates to true, then the WSGI application *will* 
> be executed into the server main process cycle and thus the application 
> execution *will* be interleaved (since this is the only way to support 
> multiple concurrent requests).

Isn't the more important distinction that the application must not 
block?  Kind of like wsgi.multithread means the application must be 
threadsafe.

   Ian


From manlio_perillo at libero.it  Fri Oct  5 17:34:05 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 17:34:05 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4706553A.3080603@colorstudy.com>
References: <4704222D.30208@colorstudy.com>
	<4704EEDA.1010800@libero.it>	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>	<4704FD32.9020604@libero.it>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<470516B0.9010605@libero.it>	<20071004165333.4D5353A407A@sparrow.telecommunity.com>	<47051BCA.7090709@libero.it>	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>	<470528A9.3050108@libero.it>	<20071005002423.320413A407A@sparrow.telecommunity.com>	<470614CA.8000300@libero.it>	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
	<470654BA.9050100@libero.it> <4706553A.3080603@colorstudy.com>
Message-ID: <4706596D.4040000@libero.it>

Ian Bicking ha scritto:
> [...]
>> Ok, here is more useful definition.
>>
>> If wsgi.asynchronous evaluates to true, then the WSGI application 
>> *will* be executed into the server main process cycle and thus the 
>> application execution *will* be interleaved (since this is the only 
>> way to support multiple concurrent requests).
> 
> Isn't the more important distinction that the application must not 
> block?  Kind of like wsgi.multithread means the application must be 
> threadsafe.
> 

Right, but I assume that this is evident when I say "executed into the 
server main process cycle".

An interesting example is an application that will read some data from a 
source (as an example from a video capture device) and will send the 
output to the web.

The application can block when reading, but as soon as it yields 
some data, the server can interleave calls to it.

This means that the WSGI application cannot use a "global" handle to 
the video capture device, or use thread specific data.

It must be able to store the device handle on a per request "context".

This is the reason why I'm writing a spec for a `wsgi.context_id` 
extension, which will return a request specific identifier (in the same 
way as os.getpid or thread.get_ident do).


>   Ian
> 


Regards  Manlio Perillo

From manlio_perillo at libero.it  Fri Oct  5 17:47:26 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 17:47:26 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4706596D.4040000@libero.it>
References: <4704222D.30208@colorstudy.com>	<4704EEDA.1010800@libero.it>	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>	<4704FD32.9020604@libero.it>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<470516B0.9010605@libero.it>	<20071004165333.4D5353A407A@sparrow.telecommunity.com>	<47051BCA.7090709@libero.it>	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>	<470528A9.3050108@libero.it>	<20071005002423.320413A407A@sparrow.telecommunity.com>	<470614CA.8000300@libero.it>	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>	<470654BA.9050100@libero.it>
	<4706553A.3080603@colorstudy.com> <4706596D.4040000@libero.it>
Message-ID: <47065C8E.1010703@libero.it>

Manlio Perillo ha scritto:
> Ian Bicking ha scritto:
>> [...]
>>> Ok, here is more useful definition.
>>>
>>> If wsgi.asynchronous evaluates to true, then the WSGI application 
>>> *will* be executed into the server main process cycle and thus the 
>>> application execution *will* be interleaved (since this is the only 
>>> way to support multiple concurrent requests).
>> Isn't the more important distinction that the application must not 
>> block?  Kind of like wsgi.multithread means the application must be 
>> threadsafe.
>>
> 
> Right, but I assume that this is evident when I say "executed into the 
> server main process cycle".
> 
> An interesting example is an application that will read some data from a 
> source (as an example from a video capture device) and will send the 
> output to the web.
> 

Forget what I have written.

A request specific context is already supplied by the wsgi application 
callable context.
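
As a sketch of that point: when the application is a generator, its local variables already form a per-request context, so interleaved requests do not share a handle. The make_fake_device helper below is a hypothetical stand-in for a capture device, and the interleaving is simulated by hand rather than by a real server.

```python
import itertools


def make_fake_device(name):
    # Hypothetical stand-in for a capture device: an endless frame source.
    return (("%s-frame-%d" % (name, n)).encode("ascii")
            for n in itertools.count())


def app(environ, start_response):
    # The handle is a local variable, so it belongs to this request only.
    device = make_fake_device(environ["PATH_INFO"])
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    for _ in range(2):
        # The server may switch to another request after each yield.
        yield next(device)


def null_start_response(status, headers, exc_info=None):
    # Minimal placeholder; a real server would record status and headers.
    return None


# Two requests, iterated in an interleaved order as an async server might do.
a = app({"PATH_INFO": "cam-a"}, null_start_response)
b = app({"PATH_INFO": "cam-b"}, null_start_response)
interleaved = [next(a), next(b), next(a), next(b)]
```

Each generator keeps its own device, so the frames from the two requests never mix even though the calls alternate.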

I'm trying to understand whether an explicit request context is necessary 
for some other kind of application.


Regards  Manlio Perillo

From robinbryce at gmail.com  Fri Oct  5 18:34:23 2007
From: robinbryce at gmail.com (Robin Bryce)
Date: Fri, 5 Oct 2007 17:34:23 +0100
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071004161810.060183A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
Message-ID: <bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>

On 04/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:54 PM 10/4/2007 +0200, Manlio Perillo wrote:
> >Phillip J. Eby ha scritto:
> > > At 04:48 PM 10/4/2007 +0200, Manlio Perillo wrote:
> > >> Phillip J. Eby ha scritto:
> > >> > It's always the case that a WSGI application can be paused after it
> > >> > yields data, even in WSGI 1.0.
> > >>
> > >> I was not aware of this.
> > >> It may cause some problems to a unaware WSGI application the fact that a
> > >> new "handler" is started "interleaved" with the previous ones.
> > >
> > > It may... but the only applications that should be yielding anything are
> > > ones that are sending large files, doing server push, or explicitly
> > > *desire* to be interleaved in such fashion.
> > >
> >
> >But they have no way to know if the server supports this,
>
> If it's a WSGI-compliant server, it supports this by
> definition.  It's just that synchronous servers don't pause before
> requesting the next iteration.
>
>
> >  and existing
> >WSGI implementations does not interleave the iteration, as far as I know.
>
> Nothing in the spec stops them from doing so - indeed, they're
> *encouraged* to do so:
>
> http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries
>
> """This requirement ensures that asynchronous applications and
> servers can conspire to reduce the number of threads that are
> required to run a given number of application instances simultaneously."""
>
> Notice that the only way this sentence works is if you are
> interleaving applications.
>
> That being said, the PEP really needs an explicit discussion of the
> execution model.

Is there a means to support a non blocking read on wsgi.input ?

E.g.,

for data in environ['wsgi.input']:
    if not data:
        if nothing_else_to_do:
            # Wake me when there is more data.
            yield environ['wsgi.input']
        else:
            do_something()
            # Wake me next time around, irrespective of
            # whether there is data.
            yield ''

From pje at telecommunity.com  Fri Oct  5 19:02:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 13:02:04 -0400
Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush)
In-Reply-To: <470650E7.4050809@dealmeida.net>
References: <4703ADE1.5040507@libero.it>
	<20071003192817.3014C3A407A@sparrow.telecommunity.com>
	<4703F2E1.9050402@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<470613B0.8000101@libero.it>
	<20071005143100.07AD63A407C@sparrow.telecommunity.com>
	<470650E7.4050809@dealmeida.net>
Message-ID: <20071005165925.0E21A3A407B@sparrow.telecommunity.com>

At 11:57 AM 10/5/2007 -0300, Rob De Almeida wrote:
>Phillip J. Eby wrote:
>>I mean that you can't write a WSGI 2.0 application using a single 
>>generator function, because it has to return a tuple, not an 
>>iterator.  This will discourage people from thinking "yield" is a 
>>good way to build up their output, instead of using a StringIO or 
>>''.join() on a list of strings.
>
>Could you explain why using 'yield' is not recommended? Just 
>curious, because I use it all the time.

Because you're slowing down your application's throughput.  The only 
reasons to yield multiple strings are that you are either:

1. Sending a file that's larger than you want to load into memory, or

2. Doing "server push" and need to do some processing between payloads.
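
A minimal sketch of the two styles, under the assumption of a Python 3 WSGI 1.0 server (body chunks are bytes). The names render_rows, app, and stream_file are illustrative only. The first two functions build the whole response and return it as one string; the last shows case 1 above, where yielding is justified.

```python
def render_rows(rows):
    # Build the whole page as a list of pieces, then join once.
    parts = ["<ul>"]
    for row in rows:
        parts.append("<li>%s</li>" % row)
    parts.append("</ul>")
    return "".join(parts).encode("utf-8")


def app(environ, start_response):
    body = render_rows(["a", "b", "c"])
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    # A single-element list: the server has one string to transmit.
    return [body]


def stream_file(path, block_size=8192):
    # Case 1 above: a file too large to hold in memory is yielded in blocks.
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield block
```

Since PEP 333 requires the server to transmit each yielded string before asking for the next one, an application that yields many small pieces forces many small writes; joining first avoids that.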


From pje at telecommunity.com  Fri Oct  5 19:31:18 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 13:31:18 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470654BA.9050100@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<470614CA.8000300@libero.it>
	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
	<470654BA.9050100@libero.it>
Message-ID: <20071005172839.7C4853A407B@sparrow.telecommunity.com>

At 05:14 PM 10/5/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote:
> >> Phillip J. Eby ha scritto:
> >> > In other words, those flags were to support legacy frameworks detecting
> >> > that they were in an incompatible hosting environment.  However, IIUC,
> >> > there is no such existing framework that could meaningfully use the
> >> flag
> >> > you're proposing, that has any real chance of being portable to
> >> > different WSGI environments.
> >>
> >> This is true, but I continue to think that it is worth adding that flag.
> >> Asynchronous support is available in Nginx mod_wsgi, and in the future
> >> someone can implement a WSGI gateway for lighttpd.
> >
> > Right now, the definition of the flag is not sufficiently defined for my
> > taste.  You have only proposed that it be set to indicate that
> > interleaved execution is possible -- but it is *always* possible to have
> > interleaved execution in WSGI 1.0, so the only reason to add the flag to
> > WSGI 2.0 would be so a server could promise NOT to interleave
> > execution.  And what good is that?
> >
>
>Ok, here is more useful definition.
>
>If wsgi.asynchronous evaluates to true, then the WSGI application *will*
>be executed into the server main process cycle and thus the application
>execution *will* be interleaved (since this is the only way to support
>multiple concurrent requests).

I still don't see how this is *useful*.  What will the application 
*do* with this information?


From pje at telecommunity.com  Fri Oct  5 19:35:33 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 13:35:33 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com
 >
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
Message-ID: <20071005173253.86D293A407B@sparrow.telecommunity.com>

At 05:34 PM 10/5/2007 +0100, Robin Bryce wrote:
>Is there a means to support a non blocking read on wsgi.input ?

No.  Some ideas have been proposed, but nobody has shown a practical 
scenario where it is useful.

For it to be useful, you would have to have an asynchronous server 
that is interleaving in its main thread, and therefore requires 
applications to be non-blocking.

However, to run "normal" WSGI applications, such a server has to 
*allow* them to block, so it is going to have to run them in a 
different thread anyway.
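
As an illustration of that constraint, here is a sketch of how an event-loop server could hand an ordinary blocking application to a worker thread. It uses the standard concurrent.futures module; run_app, dispatch, and blocking_app are hypothetical names, and the write() callable that start_response() normally returns is omitted for brevity.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def blocking_app(environ, start_response):
    time.sleep(0.05)   # stands in for a blocking database or file call
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"done\n"]


def run_app(app, environ):
    # Runs entirely in a worker thread, so blocking here does not stall
    # the thread that submitted the work.
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        # PEP 333 asks the server to call close() if the iterable has one.
        if hasattr(result, "close"):
            result.close()
    return captured["status"], captured["headers"], body


pool = ThreadPoolExecutor(max_workers=4)


def dispatch(app, environ):
    # What a hypothetical event loop would do: hand off and keep going.
    return pool.submit(run_app, app, environ)
```

dispatch() returns a Future immediately; the main thread stays free to serve other connections and collects the response when the Future completes.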

This is why the whole idea of creating an async *variant* of WSGI is 
moot - an async WSGI protocol is essentially 100% incompatible with 
synchronous WSGI, since any async WSGI components can't use 
synchronous WSGI components, unless they spawn another thread or process.

The whole thing is an exercise in futility, until/unless there is 
more than one such server and application, at which point they could 
get together and create AWSGI or WSGI-A or something of that sort.


From manlio_perillo at libero.it  Fri Oct  5 19:38:00 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 05 Oct 2007 19:38:00 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005172839.7C4853A407B@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<470614CA.8000300@libero.it>
	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
	<470654BA.9050100@libero.it>
	<20071005172839.7C4853A407B@sparrow.telecommunity.com>
Message-ID: <47067678.1040809@libero.it>

Phillip J. Eby ha scritto:
> At 05:14 PM 10/5/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > At 12:41 PM 10/5/2007 +0200, Manlio Perillo wrote:
>> >> Phillip J. Eby ha scritto:
>> >> > In other words, those flags were to support legacy frameworks 
>> detecting
>> >> > that they were in an incompatible hosting environment.  However, 
>> IIUC,
>> >> > there is no such existing framework that could meaningfully use the
>> >> flag
>> >> > you're proposing, that has any real chance of being portable to
>> >> > different WSGI environments.
>> >>
>> >> This is true, but I continue to think that it is worth adding that 
>> flag.
>> >> Asynchronous support is available in Nginx mod_wsgi, and in the future
>> >> someone can implement a WSGI gateway for lighttpd.
>> >
>> > Right now, the definition of the flag is not sufficiently defined 
>> for my
>> > taste.  You have only proposed that it be set to indicate that
>> > interleaved execution is possible -- but it is *always* possible to 
>> have
>> > interleaved execution in WSGI 1.0, so the only reason to add the 
>> flag to
>> > WSGI 2.0 would be so a server could promise NOT to interleave
>> > execution.  And what good is that?
>> >
>>
>> Ok, here is more useful definition.
>>
>> If wsgi.asynchronous evaluates to true, then the WSGI application *will*
>> be executed into the server main process cycle and thus the application
>> execution *will* be interleaved (since this is the only way to support
>> multiple concurrent requests).
> 
> I still don't see how this is *useful*.  What will the application *do* 
> with this information?
> 

As an example (not tested), SQLAlchemy could implement a 
RequestSingletonPool, the equivalent of SingletonThreadPool.

In this case the pool will checkout a connection using the 
environ['wsgi.request_id'] identifier (unique for each request), instead 
of thread.get_ident.

So, a WSGI application *needs* to know if the application is 
multithreaded or asynchronous to select the right connection pool.
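
A rough sketch of the idea, not SQLAlchemy code: KeyedConnectionPool, by_thread, by_request, and pool_for are hypothetical names, and both wsgi.asynchronous and wsgi.request_id are the keys proposed in this thread, not part of WSGI.

```python
import threading


class KeyedConnectionPool:
    # Hypothetical pool: one connection per key, created on first use.
    def __init__(self, connect, key_func):
        self._connect = connect
        self._key_func = key_func
        self._connections = {}
        self._lock = threading.Lock()

    def checkout(self, environ):
        key = self._key_func(environ)
        with self._lock:
            if key not in self._connections:
                self._connections[key] = self._connect()
            return self._connections[key]

    def release(self, environ):
        # Forget the connection for this key, if there is one.
        with self._lock:
            self._connections.pop(self._key_func(environ), None)


def by_thread(environ):
    # Threaded server: one connection per thread.
    return threading.get_ident()


def by_request(environ):
    # Proposed key, not part of WSGI: one connection per request.
    return environ["wsgi.request_id"]


def pool_for(environ, connect):
    # Choose the keying strategy from the proposed wsgi.asynchronous flag.
    key_func = by_request if environ.get("wsgi.asynchronous") else by_thread
    return KeyedConnectionPool(connect, key_func)
```

With the flag set, two requests running interleaved in the same thread get different connections; without it, the pool falls back to one connection per thread.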


Regards   Manlio Perillo

From pje at telecommunity.com  Fri Oct  5 21:11:56 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 15:11:56 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47067678.1040809@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<470516B0.9010605@libero.it>
	<20071004165333.4D5353A407A@sparrow.telecommunity.com>
	<47051BCA.7090709@libero.it>
	<20071004174513.A4F0F3A407A@sparrow.telecommunity.com>
	<470528A9.3050108@libero.it>
	<20071005002423.320413A407A@sparrow.telecommunity.com>
	<470614CA.8000300@libero.it>
	<20071005143356.B8B7D3A407C@sparrow.telecommunity.com>
	<470654BA.9050100@libero.it>
	<20071005172839.7C4853A407B@sparrow.telecommunity.com>
	<47067678.1040809@libero.it>
Message-ID: <20071005190917.7758D3A407B@sparrow.telecommunity.com>

At 07:38 PM 10/5/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > I still don't see how this is *useful*.  What will the application *do*
> > with this information?
>
>As an example (not tested), SQLAlchemy could implement a
>RequestSingletonPool, the equivalent of SingletonThreadPool.
>
>In this case the pool will checkout a connection using the
>environ['wsgi.request_id'] identifier (unique for each request), instead
>of thread.get_ident.

I still don't see the point of this.  Why can't the application just 
keep a reference to the connection object it's using?  That doesn't 
require any new code and already works now in every existing WSGI 
server.  Why write code that is more complex to do something that you 
don't even need?

Not only that, but the ONLY reasons for the application to yield are 
if it's sending something too big to fit in memory, or it's doing 
server push (or otherwise wants to stream the content).

Such applications are extremely rare to begin with, or should be.  If 
you are seeing applications that yield multiple strings and *aren't* 
one of these use cases, it indicates that the application author 
doesn't understand the WSGI spec, and doesn't realize they're slowing 
down their application by doing it.  Yields are for streaming, and 
most web applications shouldn't be streaming.

That means that 99.9% of all WSGI applications should never produce 
more than one output string -- which means that the "interleaving" 
question never even comes up.  The applications that produce multiple 
output strings have to deal with the complexity of the situation anyway.


From robinbryce at gmail.com  Fri Oct  5 23:13:29 2007
From: robinbryce at gmail.com (Robin Bryce)
Date: Fri, 5 Oct 2007 22:13:29 +0100
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005173253.86D293A407B@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
Message-ID: <bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 05:34 PM 10/5/2007 +0100, Robin Bryce wrote:
> >Is there a means to support a non blocking read on wsgi.input ?
>
> No.  Some ideas have been proposed, but nobody has shown a practical
> scenario where it is useful.
>
> For it to be useful, you would have to have an asynchronous server
> that is interleaving in its main thread, and therefore requires
> applications to be non-blocking.

It requires asynchronous parts of the wsgi stack to co-operate with
the server in order to deal with requests which end up being processed
(or part processed) by synchronous components.

A requirement to be able to process *some* requests synchronously -
for a particular connection - should not prevent a server from
supporting both async & synchronous models of processing.

>
> However, to run "normal" WSGI applications, such a server has to
> *allow* them to block, so it is going to have to run them in a
> different thread anyway.

Yes.

>
> This is why the whole idea of creating an async *variant* of WSGI is
> moot - an async WSGI protocol is essentially 100% incompatible with
> synchronous WSGI, since any async WSGI components can't use
> synchronous WSGI components, unless they spawn another thread or process.

This does not have to be the case. All synchronous wsgi components
require the presence of wsgi.input which behaves as specified in
pep-333.

No wsgi async *aware* components exist, because pep-333 does not allow it.

async *aware* components, like async servers in general, should be
willing to accept greater complexity in the interface. With some
additional complexity, exposed to WSGI 2.0 async aware components, I can't
see any reason wsgi 2.0 can't allow for both - provided that async
aware components always live at the top of the wsgi stack.


Here is my stab at it:

Let the async server provide

environ['wsgi.async_input']

Some to be agreed non-blocking, iterative, interface to the *content*
of a single request. It is legal for an async aware component to call
environ['wsgi.async_input'].next(), at most, once for each value of
response data it yields. Note that it need not call async_input.next()
every time it is resumed.

And substitute wsgi.async_input for wsgi.input in my previous message.
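
A sketch of how an async aware application might use such an interface. Everything here is an assumption made for illustration: FakeAsyncInput stands in for the server-provided object, an empty chunk means "nothing has arrived yet", and the end of the request content is signalled by StopIteration. None of this is part of PEP 333.

```python
class FakeAsyncInput:
    # Hypothetical stand-in for environ['wsgi.async_input']: each call to
    # next() returns whatever has arrived, or b"" if nothing has yet.
    def __init__(self, arrivals):
        self._arrivals = iter(arrivals)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator marks end of content.
        return next(self._arrivals)


def echo_length_app(environ):
    # Async aware application body: one non-blocking read per yield.
    source = environ["wsgi.async_input"]
    received = 0
    while True:
        try:
            chunk = next(source)
        except StopIteration:
            break
        received += len(chunk)
        # Nothing to send yet; yield so the server can resume us later.
        yield b""
    yield ("received %d bytes\n" % received).encode("ascii")
```

The application never waits inside a read: it takes what is available, yields, and lets the server decide when to resume it.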

environ['wsgi.input_factory']

A callable, which MUST be called by an application which wishes to
switch to synchronous processing for the remainder of the current
request's content. The application must yield the return value of this
factory as the next value it produces. The next time the application
is resumed, the environ will contain a pep-333 compatible wsgi.input
environ key. Applications which call this function MUST accommodate
the possibility that they will be resumed in a different thread
from that in which they called wsgi.input_factory.


Let the server define its own interface for thread / process
interaction and expose it through server specific environ keys.

Require, as a MUST, that the server implementation provides a middleware
component which uses that server specific API to support
wsgi.input_factory.

Perhaps *disallow* all but the top most wsgi application in the stack
from interacting with the server specific threading api.

Perhaps define a wsgi.resume_with_result callable such that it can be
leveraged *only* by async aware wsgi components - it lets async aware
components delegate a callable for execution in a different thread.


With respect to wsgi.input, it helps (me at any rate) to remember that
even an async server cannot possibly proceed with the next request
until it knows it has read (up to or past) the end of the current
request's content boundary.

Since WSGI is defined at the per-request level, there is no need for the
async/sync middleware bridge to 'push back' data. The server sees
Connection: close, Content-Length, etc., and so can arrange for
wsgi.async_input to respect the boundaries.

I believe this would be enough to support an asynchronous
implementation of Comet.

http://en.wikipedia.org/wiki/Comet_%28programming%29 and
http://rphd.sourceforge.net/


This sketch is not completely shot from the hip. I have an async
server implementation (hey who hasn't these days) which I used mainly
as a means to explore *how* a server could possibly interact with an
async aware wsgi stack. See
http://svn.wiretooth.com/svn/open/asycamore/trunk/asycamore/

and in particular in httpconnectioncontext.py
   WSGIServiceContext.start_request
   HTTPServiceContext.continue_reading

It does not implement the above sketch but *could* easily do so.


> The whole thing is an exercise in futility, until/unless there is
> more than one such server and application, at which point they could
> get together and create AWSGI or WSGI-A or something of that sort.
>
>

That's too much chicken/egg for my tastes. All you are really saying is
that the CGI model covers the majority of 'common' use cases. I don't
know of anyone who would disagree with this. But as things stand all
wsgi-ish implementations which aim to support async/sync are consigned
to the dust bin of 'non conformant'. This acts as a strong
disincentive to experiment and innovate.

If, for clear technical reasons, nothing can be done to support mixing
async aware and synchronous applications in WSGI 2.0, then so it goes.

If it can't be done without imposing significant complexity on
applications that are perfectly happy with the highly successful wsgi
1.0 model, then fair enough - WSGI-A is a non starter.

Or are you against introducing features to support async servers and
composition of mixed async/sync stacks on principle ?

If a collective decision is made that WSGI will only ever support half
async (blocking read, asynchronous response) then both the pep and the
new spec should state this very clearly indeed.

Best,
Robin

From pje at telecommunity.com  Sat Oct  6 00:07:35 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Fri, 05 Oct 2007 18:07:35 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.co
 m>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
Message-ID: <20071005220455.1ABB23A407B@sparrow.telecommunity.com>

At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
>That's too much chicken/egg for my tastes. All you are really saying is
>that the CGI model covers the majority of 'common' use cases. I don't
>know of anyone who would disagree with this. But as things stand all
>wsgi-ish implementations which aim to support async/sync are consigned
>to the dust bin of 'non conformant'. This acts as a strong
>disincentive to experiment and innovate.
>
>If, for clear technical reasons, nothing can be done to support mixing
>async aware and synchronous applications in WSGI 2.0, then so it goes.
>
>If it can't be done without imposing significant complexity on
>applications that are perfectly happy with the highly successful wsgi
>1.0 model, then fair enough - WSGI-A is a non starter.
>
>Or are you against introducing features to support async servers and
>composition of mixed async/sync stacks on principle ?

Not in *principle*, only in practice.  :)  If you read the archives 
of a few years back, I was rather enthusiastic until I realized that 
there really wasn't any way to make it of practical benefit.

See, in order for a server to take advantage of an application's 
"asynchronous" nature, the server has to *know* the application won't 
"block".  That is, the app has to *promise* not to block.  (Because 
without this promise, the server is forced to run the app in a 
separate thread or process, so as not to block the server.)

But in order for the app to make this promise, it can only use 
components that make the same promise, unless it runs *them* 
in other threads or processes...  which means giving up on easily 
composing applications from multiple WSGI components.

So far, discussion on this matter has hinged on the claim that it's 
*possible* to make such mixed stacks, and I don't disagree.  What 
nobody has shown is that it's 1. practical, and 2. produces some 
actual benefit, compared to the synchronous model now in use.  As a 
practical matter, the vast majority of Python web applications and 
frameworks are synchronous by nature, and those that aren't are 
already tied to a specific async API.

If we were going to try to implement an asynchronous WSGI, what we 
would *really* need to do is discard the app_iter and make write() 
the standard way of sending the body.  This would let us implement a 
CPS (continuation-passing style) API.  We would also have to change 
the input stream so that instead of reading from it, we instead 
passed it functions to be called when input was available, and so 
on.  We would also need a way to tell write() that we were finished 
writing, and some way to manage connection timeouts.

Unfortunately, this programming style is verbose and more difficult 
to learn for people versed in less "twisted" ways of programming.  To 
write middleware in this style, you also need to write deeply nested 
functions.  And synchronous servers would need to figure out what to 
do when an application returns without having called start_response() 
yet or figured out how to close the stream.

Anyway, my point here is that I see how we could either cater to 
synchronous apps or async apps in a given API.  But throwing a 
half-baked async API on top of a synchronous one is just making a 
mess and helping no-one.

To sketch a WSGI-A application:

     def app(environ, start_response):
         # in this sketch, start_response() returns the write() callable
         write = start_response('200 Cool', [('content-type','text/plain')])
         write('Hello world!')
         write(None)  # close

And a WSGI 1->WSGI A converter:

     class ReadCallbackWrapper:
         def __init__(self, stream):
             self.stream = stream
         def on_read(self, size, callback):
             callback(self.stream.read(size))

     def wsgi_1_app(environ, start_response):
         # wsgi_a_app is the WSGI-A application being wrapped
         running = [1]   # emptied once the app calls write(None)
         def sr(*args):
             # call the server's start_response, not sr itself
             write = start_response(*args)
             def w(arg):
                 if running:
                     if arg is None:
                         running.pop()
                     else:
                         write(arg)
                 else:
                     raise RuntimeError("Already closed!")
             return w
         environ['wsgi.input'] = ReadCallbackWrapper(environ['wsgi.input'])
         wsgi_a_app(environ, sr)
         while running:
             pass   # really should have a timeout check here
         return []


This highlights the essential difference between a sync and async 
API: the sync API either finishes right away or returns something the 
server calls until it's exhausted.  An async API offers no guarantee 
that anything has been done when the app is called.  Anything could 
happen at any time later.

My gut feel is that it's harder to write middleware for WSGI-A style 
of API, because you have to do at least doubly nested functions if 
you're dealing with the output at all (as this example shows).
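
To make the nesting concrete, here is a minimal sketch of a WSGI-A
style middleware that upper-cases the body. It assumes the conventions
of the sketch above (start_response() returns write(), and write(None)
means "finished"); upper_middleware, hello and fake_start_response are
hypothetical names used only for illustration.

```python
def upper_middleware(wsgi_a_app):
    """Wrap a WSGI-A style app so every body chunk is upper-cased."""
    def wrapped(environ, start_response):
        def sr(status, headers):
            write = start_response(status, headers)
            def w(chunk):
                # None is the "finished writing" marker; pass it through
                write(None if chunk is None else chunk.upper())
            return w
        wsgi_a_app(environ, sr)
    return wrapped


def hello(environ, start_response):
    write = start_response('200 OK', [('content-type', 'text/plain')])
    write('Hello world!')
    write(None)  # close


def fake_start_response(sent):
    """Hypothetical stand-in for a server: records everything written."""
    def start_response(status, headers):
        sent.append(status)
        return sent.append
    return start_response
```

Even this trivial transformation needs three levels of function
(wrapped, sr, w), because the output only becomes visible inside the
write() callable.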

And if we mix modes, then we have this sort of messy back-and-forth 
adaptation in between.  And as best I can tell, the proposal for a 
mixed-mode API that you gave would actually make it even *harder* 
than this to write WSGI middleware, as there would be similar 
boundary issues for the input stream.


From renesd at gmail.com  Sat Oct  6 05:07:08 2007
From: renesd at gmail.com (=?ISO-8859-1?Q?Ren=E9_Dudfield?=)
Date: Sat, 6 Oct 2007 13:07:08 +1000
Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush)
In-Reply-To: <20071005165925.0E21A3A407B@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<470613B0.8000101@libero.it>
	<20071005143100.07AD63A407C@sparrow.telecommunity.com>
	<470650E7.4050809@dealmeida.net>
	<20071005165925.0E21A3A407B@sparrow.telecommunity.com>
Message-ID: <64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com>

I think 'streaming' is good for speeding up web pages when processing
takes a while.

I'll explain why...

Say your page takes 0.2 seconds to process.

If you wait until 0.2 seconds is up, then the first bytes that will
come to the browser will arrive in at least 0.2 seconds.  Whereas if
you send data as soon as it's ready, then the user will be able to see
some of that data more quickly - and possibly make more requests
sooner.

However if your application can not send data until it is all ready
anyway - which is the way with most python templating languages - then
you might as well send it all in one go.  Sending it all in one go is
faster, unless you can send data as a stream.

Sending the header of an HTML page right away is often very quick for
dynamic pages.  Since often that part is static - and it contains
links to other files - like css, js, and image files.  So yielding the
header part, then doing your database connection, and page
construction which takes longer will almost always be faster for the
user - than waiting for the entire page to be ready.
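
A minimal sketch of that pattern as a WSGI 1.0 generator application,
written for Python 3 (so the body is bytes). The name slow_page, the
stylesheet URL and the sleep standing in for the database work are
assumptions made for illustration; whether the first block really
reaches the browser early also depends on the server and any proxies
not buffering it.

```python
import time


def slow_page(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # The static head goes out first, so the browser can start
    # fetching the CSS/JS it links to.
    yield b'<html><head><link rel="stylesheet" href="/site.css"></head>'
    time.sleep(0.01)  # stand-in for the database query and page construction
    yield b'<body>done</body></html>'
```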


On 10/6/07, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 11:57 AM 10/5/2007 -0300, Rob De Almeida wrote:
> >Phillip J. Eby wrote:
> >>I mean that you can't write a WSGI 2.0 application using a single
> >>generator function, because it has to return a tuple, not an
> >>iterator.  This will discourage people from thinking "yield" is a
> >>good way to build up their output, instead of using a StringIO or
> >>''.join() on a list of strings.
> >
> >Could you explain why using 'yield' is not recommended? Just
> >curious, because I use it all the time.
>
> Because you're slowing down your application's throughput.  The only
> reasons to yield multiple strings is when you are either:
>
> 1. Sending a file that's larger than you want to load into memory, or
>
> 2. You're doing "server push" and need to do some processing between payloads.
>
> _______________________________________________
> Web-SIG mailing list
> Web-SIG at python.org
> Web SIG: http://www.python.org/sigs/web-sig
> Unsubscribe: http://mail.python.org/mailman/options/web-sig/renesd%40gmail.com
>

From pje at telecommunity.com  Sat Oct  6 08:03:02 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 06 Oct 2007 02:03:02 -0400
Subject: [Web-SIG] yield considered harmful (was: x-wsgiorg.flush)
In-Reply-To: <64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>
	<20071003230812.7A7F63A407A@sparrow.telecommunity.com>
	<88e286470710031930u6e91628ey168fc2fc0e21d7bc@mail.gmail.com>
	<20071004114441.C7B103A407A@sparrow.telecommunity.com>
	<88e286470710040520x2ee3ba06q5a531e222e9938c4@mail.gmail.com>
	<20071004130818.BFCE83A407A@sparrow.telecommunity.com>
	<470613B0.8000101@libero.it>
	<20071005143100.07AD63A407C@sparrow.telecommunity.com>
	<470650E7.4050809@dealmeida.net>
	<20071005165925.0E21A3A407B@sparrow.telecommunity.com>
	<64ddb72c0710052007y3a84eb29wd42aa67d0ec84744@mail.gmail.com>
Message-ID: <20071006060023.CEBB53A407B@sparrow.telecommunity.com>

At 01:07 PM 10/6/2007 +1000, René Dudfield wrote:
>I think 'streaming' is good for speeding up web pages when processing
>takes a while.
>
>I'll explain why...
>
>Say your page takes 0.2 seconds to process.
>
>If you wait until 0.2 seconds is up, then the first bytes that will
>come to the browser will arrive in at least 0.2 seconds.  Whereas if
>you send data as soon as it's ready, then the user will be able to see
>some of that data more quickly - and possibly make more requests
>sooner.

It's faster for the user, but not necessarily for the server.  The 
server will do more system calls, and the CPU will do more context 
switches.  So, if you're going to stream for purposes of 
responsiveness, you're going to be trading off against overall server 
throughput.

Nonetheless, the pages where you even have the choice of streaming 
are infrequent.  Most of the examples I see of people doing streaming 
are completely worthless, because there isn't any non-trivial 
computation taking place between the yields.
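
As an illustration of the difference, here is a small sketch (Python
3, bytes bodies; chunky_app and buffered_app are hypothetical names).
Both applications produce the same body, but the first hands the
server three blocks with no real work between them, while the second
joins the pieces and returns a single block.

```python
def chunky_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    for i in range(3):
        # one block per row, with no real computation between the yields
        yield b'row %d\n' % i


def buffered_app(environ, start_response):
    parts = [b'row %d\n' % i for i in range(3)]
    body = b''.join(parts)
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]  # a single block for the server to send
```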


From manlio_perillo at libero.it  Sat Oct  6 11:04:23 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sat, 06 Oct 2007 11:04:23 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005220455.1ABB23A407B@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
Message-ID: <47074F97.8040604@libero.it>

Phillip J. Eby ha scritto:
> At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
>> That's too much chicken/egg for my tastes. All you are really saying is
>> that the CGI model covers the majority of 'common' use cases. I don't
>> know of anyone who would disagree with this. But as things stand all
>> wsgi-ish implementations which aim to support async/sync are consigned
>> to the dust bin of 'non conformant'. This acts as a strong
>> disincentive to experiment and innovate.
>>
>> If, for clear technical reasons, nothing can be done to support mixing
>> async aware and synchronous applications in WSGI 2.0, then so it goes.
>>

I don't see the reason to mix async and sync applications, in the same 
way that it is not possible to mix a thread unsafe application with a 
threaded server.

WSGI should just *allow* asynchronous applications and middlewares to do 
their job.

As an example, the WSGI write callable cannot be implemented in a 
conforming way in Nginx.

However, if we can allow the write callable to raise an EAGAIN exception 
when the buffer cannot be written to the socket, with the requirement 
that the WSGI application, in this case, MUST return control to the 
server (yielding an empty string as an example), then the write callable 
can be implemented.
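
A minimal sketch of how an application might cooperate with such a
write callable (Python 3). This is not part of WSGI 1.0: the use of
BlockingIOError to carry EAGAIN and the FakeNonBlockingWrite class,
which simply refuses every other call, are assumptions made for
illustration.

```python
import errno


class FakeNonBlockingWrite:
    """Hypothetical write callable: refuses every other call with EAGAIN."""
    def __init__(self):
        self.sent = []
        self.calls = 0

    def __call__(self, data):
        self.calls += 1
        if self.calls % 2 == 1:
            raise BlockingIOError(errno.EAGAIN, 'socket buffer full')
        self.sent.append(data)


def app(environ, start_response):
    write = start_response('200 OK', [('Content-Type', 'text/plain')])
    for chunk in (b'one', b'two'):
        while True:
            try:
                write(chunk)
                break
            except BlockingIOError:
                yield b''   # hand control back to the server, then retry
```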

 > [...]
> 
> If we were going to try to implement an asynchronous WSGI, what we would 
> *really* need to do is discard the app_iter and make write() the 
> standard way of sending the body.  This would let us implement a CPS 
> (continuation-passing style) API.  

But isn't this possible just using a generator?


> We would also have to change the 
> input stream so that instead of reading from it, we instead passed it 
> functions to be called when input was available, 

Another possible solution is that reading from input is allowed to raise 
an EAGAIN exception, like in the previous example.

 > [...]


Regards  Manlio Perillo

From robinbryce at gmail.com  Sat Oct  6 14:23:49 2007
From: robinbryce at gmail.com (Robin Bryce)
Date: Sat, 6 Oct 2007 13:23:49 +0100
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47074F97.8040604@libero.it>
References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
Message-ID: <bcf87d920710060523s49170ff5i78f2f8670ef89c82@mail.gmail.com>

On 06/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Phillip J. Eby ha scritto:
> > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
> >> That's too much chicken/egg for my tastes. All you are really saying is
> >> that the CGI model covers the majority of 'common' use cases. I don't
> >> know of anyone who would disagree with this. But as things stand all
> >> wsgi-ish implementations which aim to support async/sync are consigned
> >> to the dust bin of 'non conformant'. This acts as a strong
> >> disincentive to experiment and innovate.
> >>
> >> If, for clear technical reasons, nothing can be done to support mixing
> >> async aware and synchronous applications in WSGI 2.0, then so it goes.
> >>
>
> I don't see the reason to mix async and sync applications, in the same
> way that it is not possible to mix a thread unsafe application with a
> threaded server.
>
> WSGI should just *allow* asynchronous applications and middlewares to do
> their job.
>
> As an example, the WSGI write callable cannot be implemented in a
> conforming way in Nginx.
>
> However, if we can allow the write callable to raise an EAGAIN exception
> when the buffer cannot be written to the socket, with the requirement
> that the WSGI application, in this case, MUST return control to the
> server (yielding an empty string as an example), then the write callable
> can be implemented.
>
>  > [...]
> >
> > If we were going to try to implement an asynchronous WSGI, what we would
> > *really* need to do is discard the app_iter and make write() the
> > standard way of sending the body.  This would let us implement a CPS
> > (continuation-passing style) API.
>
> But isn't this possible just using a generator?
>
>
> > We would also have to change the
> > input stream so that instead of reading from it, we instead passed it
> > functions to be called when input was available,
>
> Another possible solution is that reading from input is allowed to raise
> an EAGAIN exception, like in the previous example.
>
>  > [...]
>
>
>
> Regards  Manlio Perillo

From robinbryce at gmail.com  Sat Oct  6 14:34:10 2007
From: robinbryce at gmail.com (Robin Bryce)
Date: Sat, 6 Oct 2007 13:34:10 +0100
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47074F97.8040604@libero.it>
References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
Message-ID: <bcf87d920710060534n2e634079gf37e5be4d05fc8e2@mail.gmail.com>

Ignore last, over sensitive laptop touch pad :)

On 06/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Phillip J. Eby ha scritto:
> > At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
> >> That's too much chicken/egg for my tastes. All you are really saying is
> >> that the CGI model covers the majority of 'common' use cases. I don't
> >> know of anyone who would disagree with this. But as things stand all
> >> wsgi-ish implementations which aim to support async/sync are consigned
> >> to the dust bin of 'non conformant'. This acts as a strong
> >> disincentive to experiment and innovate.
> >>
> >> If, for clear technical reasons, nothing can be done to support mixing
> >> async aware and synchronous applications in WSGI 2.0, then so it goes.
> >>
>
> I don't see the reason to mix async and sync applications, in the same
> way that it is not possible to mix a thread unsafe application with a
> threaded server.
>

There are plenty of stateless synchronous wsgi components out there
that I would like the option of serving as is. As the person choosing
the components in my wsgi stack I'm perfectly capable of deciding
whether such a synchronous app is safe in the context of an asynch
server.

From robinbryce at gmail.com  Sat Oct  6 16:33:01 2007
From: robinbryce at gmail.com (Robin Bryce)
Date: Sat, 6 Oct 2007 15:33:01 +0100
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071005220455.1ABB23A407B@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
Message-ID: <bcf87d920710060733k4570a183k2889bf13e53df13a@mail.gmail.com>

On 05/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 10:13 PM 10/5/2007 +0100, Robin Bryce wrote:
> >That's too much chicken/egg for my tastes. All you are really saying is
> >that the CGI model covers the majority of 'common' use cases. I don't
> >know of anyone who would disagree with this. But as things stand all
> >wsgi-ish implementations which aim to support async/sync are consigned
> >to the dust bin of 'non conformant'. This acts as a strong
> >disincentive to experiment and innovate.
> >
> >If, for clear technical reasons, nothing can be done to support mixing
> >async aware and synchronous applications in WSGI 2.0, then so it goes.
> >
> >If it can't be done without imposing significant complexity on
> >applications that are perfectly happy with the highly successful wsgi
> >1.0 model, then fair enough - WSGI-A is a non starter.
> >
> >Or are you against introducing features to support async servers and
> >composition of mixed async/sync stacks on principle ?
>
> Not in *principle*, only in practice.  :)  If you read the archives
> of a few years back, I was rather enthusiastic until I realized that
> there really wasn't any way to make it of practical benefit.

I have tried to follow the history of "we want more asynch support in
wsgi" but I don't think I've kept up with you on this.

> See, in order for a server to take advantage of an application's
> "asynchronous" nature, the server has to *know* the application won't
> "block".  That is, the app has to *promise* not to block.  (Because
> without this promise, the server is forced to run the app in a
> separate thread or process, so as not to block the server.)
>
> But in order for the app to make this promise, it can only use
> components that make the same promise, unless it runs *them*
> in other threads or processes...  which means giving up on easily
> composing applications from multiple WSGI components.
>

Which is why I drew a distinction between async *aware* components and
others and advocated a composition model in which the composer of the
wsgi stack must guarantee that async aware components live at the top.
Ie, a synchronous component can not sensibly be provided with a means
to drive an async aware component.

This places the burden of the composition problem firmly on the server
and those components written specifically to be async aware and yet
allows those components to take advantage of synchronous components from
time to time.

> So far, discussion on this matter has hinged on the claim that it's
> *possible* to make such mixed stacks, and I don't disagree.  What
> nobody has shown is that it's 1. practical, and 2. produces some
> actual benefit, compared to the synchronous model now in use.  As a
> practical matter, the vast majority of Python web applications and
> frameworks are synchronous by nature, and those that aren't are
> already tied to a specific async API.
>
> If we were going to try to implement an asynchronous WSGI, what we
> would *really* need to do is discard the app_iter and make write()
> the standard way of sending the body.  This would let us implement a
> CPS (continuation-passing style) API.  We would also have to change
> the input stream so that instead of reading from it, we instead
> passed it functions to be called when input was available, and so
> on.  We would also need a way to tell write() that we were finished
> writing, and some way to manage connection timeouts.
>

I don't understand why you think this is necessary. I especially don't
like the thought that there is an argument that useful and performant
wsgi-a support is impossible without requiring use of CPS. I *like*
the app_iter model and believe it is perfectly workable for an async
component - provided that:

1. There is a non-blocking variant of wsgi.input, say wsgi.async_input
2. There is a means for an async aware component to signal the server
that it should process the remainder of the current request in a
synchronous manner.
3. The server and async aware components are allowed to use an
extended set of yield values which provide the co-operative
communication necessary for performant async components.
   3a. A yield that means "don't resume me until there is more data
available on wsgi.async_input"
   3b. A yield that means "I ran out of data reading from
wsgi.async_input but please continue resuming me anyway as I have
useful work to do"

And a yield of the empty string means the same as it does for wsgi 1.0

3a & 3b allows the component to pass "up" the information that the
server needs to determine that the underlying socket has encountered
EAGAIN on recv. The async aware component *knows* what its last yield
was and so can reliably interpret resume after 3a as meaning "more
data available". After 3b it does no harm to the performance of the
server if the component speculatively attempts to read from
wsgi.async_input.

Absence of wsgi.input in the environ until the 'switch' takes place
will cause any accidentally included synchronous application to break
if it attempts to perform a blocking read on the input. An async
server should have no problem with synchronous applications that
*don't* use wsgi.input, yes?
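
A rough sketch of how points 3a and 3b could look in code (Python 3).
Everything here is hypothetical: the sentinel objects WAIT_FOR_INPUT
and KEEP_RUNNING, the FakeAsyncInput class whose read() returns None
when it would block, and the toy drive() loop standing in for a real
asynchronous server.

```python
WAIT_FOR_INPUT = object()   # 3a: resume me only when more input is available
KEEP_RUNNING = object()     # 3b: no input right now, but resume me anyway


class FakeAsyncInput:
    """Hypothetical wsgi.async_input: read() returns None if it would block."""
    def __init__(self):
        self.pending = []

    def feed(self, data):
        self.pending.append(data)

    def read(self):
        return self.pending.pop(0) if self.pending else None


def echo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    stream = environ['wsgi.async_input']
    received = 0
    while received < 2:
        data = stream.read()
        if data is None:
            yield WAIT_FOR_INPUT
        else:
            received += 1
            yield data


def drive(app, arrivals):
    """Toy server loop: feeds one queued chunk whenever the app asks to wait."""
    stream = FakeAsyncInput()
    body = []
    for item in app({'wsgi.async_input': stream}, lambda s, h: None):
        if item is WAIT_FOR_INPUT:
            # a real server would park the request until the socket is readable
            stream.feed(arrivals.pop(0))
        elif item is not KEEP_RUNNING:
            body.append(item)
    return body
```

With this driver, drive(echo_app, [b'a', b'b']) collects the two
chunks in order; echo_app never needs KEEP_RUNNING, but the loop shows
where a server would treat it differently from WAIT_FOR_INPUT.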

> Unfortunately, this programming style is verbose and more difficult
> to learn for people versed in less "twisted" ways of programming.  To
> write middleware in this style, you also need to write deeply nested
> functions.  And synchronous servers would need to figure out what to
> do when an application returns without having called start_response()
> yet or figured out how to close the stream.

Agreed. I have always assumed that async aware components would be
incompatible with synchronous servers.


> Anyway, my point here is that I see how we could either cater to
> synchronous apps or async apps in a given API.  But throwing a
> half-baked async API on top of a synchronous one is just making a
> mess and helping no-one.

...

> My gut feel is that it's harder to write middleware for WSGI-A style
> of API, because you have to do at least doubly nested functions if
> you're dealing with the output at all (as this example shows).
>
> And if we mix modes, then we have this sort of messy back-and-forth
> adaptation in between.  And as best I can tell, the proposal for a
> mixed-mode API that you gave would actually make it even *harder*
> than this to write WSGI middleware, as there would be similar
> boundary issues for the input stream.

No I'm definitely not advocating mixed modes. I'm saying that I want a
means to allow an async aware component to switch the current request
to synchronous processing for the remainder of the request. And
explicitly _don't_ think it's sensible to attempt to support synchronous
-> asynchronous. The only reason for supporting the switch at all is
to enable async aware components to leverage synchronous components
"from time to time".

Async aware components would be harder to write than synchronous but
synchronous components would remain as they are. And, by avoiding CPS,
asynchronous servers could freely leverage wsgi 1.0 style components
which don't consume wsgi.input

Perhaps I should attempt an asyncwsgiref, which by my definition
should be able to host apps in wsgiref but not the converse.

More to say but out of time for today.

Cheers,

Robin

From pje at telecommunity.com  Sat Oct  6 16:36:17 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 06 Oct 2007 10:36:17 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47074F97.8040604@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
Message-ID: <20071006143337.D0BFF3A407A@sparrow.telecommunity.com>

At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>As an example, the WSGI write callable cannot be implemented in a
>conforming way in Nginx.

...unless you use a separate thread or process.


> > If we were going to try to implement an asynchronous WSGI, what we would
> > *really* need to do is discard the app_iter and make write() the
> > standard way of sending the body.  This would let us implement a CPS
> > (continuation-passing style) API.
>
>But isn't this possible just using a generator?

No, because using a generator means there needs to be a separate 
callback to force the generator to be reiterated.  Hence the 
complexity of adding an async API to the existing WSGI model.
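
A toy illustration of that point (Python 3). ToyLoop, serve and app
are hypothetical names, and treating an empty chunk as "not ready yet"
is an assumption of this sketch: once the generator yields an empty
chunk, nothing in the loop iterates it again until someone calls the
resume callback that serve() hands back.

```python
class ToyLoop:
    """Hypothetical event loop: just a queue of callbacks to run later."""
    def __init__(self):
        self.ready = []

    def call_soon(self, fn):
        self.ready.append(fn)

    def run(self):
        while self.ready:
            self.ready.pop(0)()


def serve(app_iter, loop, out):
    it = iter(app_iter)

    def step():
        try:
            chunk = next(it)
        except StopIteration:
            return
        if chunk:
            out.append(chunk)
            loop.call_soon(step)
        # An empty chunk means "not ready yet": step() is NOT rescheduled,
        # so something else has to call it again later.

    loop.call_soon(step)
    return step   # the separate "resume" callback


def app():
    yield b'first'
    yield b''        # waiting on something external
    yield b'second'
```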


> > We would also have to change the
> > input stream so that instead of reading from it, we instead passed it
> > functions to be called when input was available,
>
>Another possible solution is that reading from input is allowed to raise
>an EAGAIN exception, like in the previous example.

Which is *way* more complex than the CPS approach.  If we're going to 
make it *harder* to write applications, there's no point to having a 
WSGI 2.0, since 1.0 is already hard enough to implement.  :)


From pje at telecommunity.com  Sat Oct  6 16:40:00 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 06 Oct 2007 10:40:00 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <bcf87d920710060534n2e634079gf37e5be4d05fc8e2@mail.gmail.com>
References: <4704222D.30208@colorstudy.com> <4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<bcf87d920710060534n2e634079gf37e5be4d05fc8e2@mail.gmail.com>
Message-ID: <20071006143721.7EC1F3A407A@sparrow.telecommunity.com>

At 01:34 PM 10/6/2007 +0100, Robin Bryce wrote:
>There are plenty of stateless synchronous wsgi components out there
>that I would like the option of serving as is. As the person choosing
>the components in my wsgi stack I'm perfectly capable of deciding
>whether such a synchronous app is safe in the context of an asynch
>server.

Only if you break encapsulation, composability, and scalability of 
construction by choosing to know how each and every component 
works.  The whole idea of a component is that you shouldn't HAVE TO 
know what components are being used inside of it.  Otherwise, it's 
not really a "component".


From manlio_perillo at libero.it  Sat Oct  6 17:48:40 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sat, 06 Oct 2007 17:48:40 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
Message-ID: <4707AE58.3040003@libero.it>

Phillip J. Eby ha scritto:
> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>> As an example, the WSGI write callable cannot be implemented in a
>> conforming way in Nginx.
> 
> ...unless you use a separate thread or process.
> 

I'm insisting about asynchronous support in WSGI because
Nginx *does not support threads*.

It has some thread support but it is *broken*.
Even if in future the problems are solved, the threading model of Nginx 
*will break* many existing WSGI applications, since the WSGI iteration 
can be resumed in different threads.

Of course, a WSGI application can use threads, but I think that it 
*needs* a wsgi.pause_output extension, for synchronization.

 > [...]
>> Another possible solution is that reading from input is allowed to raise
>> an EAGAIN exception, like in the previous example.
> 
> Which is *way* more complex than the CPS approach.  If we're going to 
> make it *harder* to write applications, there's no point to having a 
> WSGI 2.0, since 1.0 is already hard enough to implement.  :)
> 

It is a known fact that asynchronous programming is hard.
Multithreaded programming is even harder, but nobody seems to care.


Regards  Manlio Perillo


From graham.dumpleton at gmail.com  Sun Oct  7 00:47:36 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sun, 7 Oct 2007 08:47:36 +1000
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4707AE58.3040003@libero.it>
References: <4704222D.30208@colorstudy.com> <47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it>
Message-ID: <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com>

On 07/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Phillip J. Eby wrote:
> > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
> >> As an example, the WSGI write callable cannot be implemented in a
> >> conforming way in Nginx.
> >
> > ...unless you use a separate thread or process.
> >
>
> I'm insisting about asynchronous support in WSGI because
> Nginx *does not supports threads*.
>
> It has some thread support but it is *broken*.
> Even if in future the problems are solved, the threading model of Nginx
> *will break* many existing WSGI applications, since the WSGI iteration
> can be resumed in different threads.
>
> Of course, a WSGI application can use threads, but i think that it
> *needs* a wsgi.pause_output extension, for synchronization.

I appreciate that you can't use the thread support in nginx, but what
I don't understand is why you can't utilise the Python threading API (or
even POSIX threads) at the boundary between nginx and the interface
into the WSGI application, i.e., in the WSGI adapter layer, so as to
emulate a synchronous style WSGI interface on top of the nginx
event-driven layer. In other words, you hide all the complexity of any
queues or other synchronisation mechanisms for communicating data
between the two within the adapter. This way you do not need to expose
an asynchronous API to the WSGI application itself, and existing WSGI
code can be used as is.
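
A minimal sketch of the adapter idea described above, assuming nothing
about nginx itself: the WSGI application runs in an ordinary Python
thread and hands its output to the event-driven side through a queue.
The names run_in_thread, demo_app, collect and the _DONE sentinel are
hypothetical; a real adapter would poll the queue from the server's
event loop rather than block on it.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the response


def run_in_thread(application, environ):
    """Run a synchronous WSGI app in a worker thread.

    Returns (response, chunks): `response` is a dict filled in by
    start_response, `chunks` is a queue of body chunks ending with _DONE.
    """
    response = {}
    chunks = queue.Queue()

    def start_response(status, headers, exc_info=None):
        response["status"] = status
        response["headers"] = headers
        return chunks.put  # write() callable: just enqueue the data

    def worker():
        app_iter = None
        try:
            app_iter = application(environ, start_response)
            for chunk in app_iter:
                chunks.put(chunk)
        finally:
            if hasattr(app_iter, "close"):
                app_iter.close()
            chunks.put(_DONE)  # always signal completion

    threading.Thread(target=worker, daemon=True).start()
    return response, chunks


def demo_app(environ, start_response):
    # An ordinary blocking-style WSGI application, used as is.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]


def collect(chunks):
    # Stand-in for the event-driven side: drain the queue until _DONE.
    body = []
    while True:
        chunk = chunks.get()
        if chunk is _DONE:
            return b"".join(body)
        body.append(chunk)


response, chunks = run_in_thread(demo_app, {"REQUEST_METHOD": "GET"})
body = collect(chunks)
```

The application never sees the queue; only the adapter does, which is
the point of keeping the synchronisation inside the adapter layer.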

Graham

From pje at telecommunity.com  Sun Oct  7 05:42:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat, 06 Oct 2007 23:42:11 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4707AE58.3040003@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it>
Message-ID: <20071007033932.2A44F3A407B@sparrow.telecommunity.com>

At 05:48 PM 10/6/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby wrote:
> > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
> >> As an example, the WSGI write callable cannot be implemented in a
> >> conforming way in Nginx.
> >
> > ...unless you use a separate thread or process.
> >
>
>I'm insisting about asynchronous support in WSGI because
>Nginx *does not supports threads*.

Please note that this means you can't run WSGI applications in the 
same process, then, since WSGI applications can and do block - 
meaning that the server will stop serving requests.


From foom at fuhm.net  Sun Oct  7 08:45:46 2007
From: foom at fuhm.net (James Y Knight)
Date: Sun, 7 Oct 2007 02:45:46 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <bcf87d920710060733k4570a183k2889bf13e53df13a@mail.gmail.com>
References: <4704222D.30208@colorstudy.com>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<bcf87d920710060733k4570a183k2889bf13e53df13a@mail.gmail.com>
Message-ID: <56E0E17C-6BC0-440E-A980-49BC880EEFBA@fuhm.net>

On Oct 6, 2007, at 10:33 AM, Robin Bryce wrote:

> An async
> server should have no problem with synchronous applications that
> *dont* use wsgi.input yes ?

That's certainly not the case. One of the more popular things to do  
in a webapp is talk to a database. Most such accesses are done in a  
blocking fashion. Doing blocking database access in an asynchronous  
server's event loop is a pretty poor idea. I mean, sure, it'd  
probably "work", but your performance would be terrible...
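
A rough illustration of the point above, using the modern asyncio
module purely as a stand-in for an asynchronous server's event loop
(it is not something discussed in this thread). blocking_query is a
hypothetical placeholder for a blocking database call, and heartbeat
represents the other requests the server should keep serving.

```python
import asyncio
import time


def blocking_query():
    # Hypothetical stand-in for a blocking database call.
    time.sleep(0.3)
    return "row"


async def heartbeat(log):
    # Stands in for the other requests the server should keep serving.
    for _ in range(3):
        log.append("tick")
        await asyncio.sleep(0.02)


async def handler_blocking(log):
    # Calls the blocking function directly: the whole loop stalls.
    log.append(blocking_query())


async def handler_offloaded(log):
    # Hands the blocking call to a worker thread; the loop keeps running.
    loop = asyncio.get_running_loop()
    log.append(await loop.run_in_executor(None, blocking_query))


async def run(handler):
    log = []
    await asyncio.gather(heartbeat(log), handler(log))
    return log


blocked = asyncio.run(run(handler_blocking))
offloaded = asyncio.run(run(handler_offloaded))
```

In the first run the heartbeat is held up behind the query; in the
second it should finish its ticks while the query is still in flight.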

James

From manlio_perillo at libero.it  Sun Oct  7 12:16:06 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sun, 07 Oct 2007 12:16:06 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com>
References: <4704222D.30208@colorstudy.com> <47050CC7.9030500@libero.it>	
	<20071004161810.060183A407A@sparrow.telecommunity.com>	
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>	
	<20071005173253.86D293A407B@sparrow.telecommunity.com>	
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>	
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>	
	<47074F97.8040604@libero.it>	
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>	
	<4707AE58.3040003@libero.it>
	<88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com>
Message-ID: <4708B1E6.4090401@libero.it>

Graham Dumpleton wrote:
> On 07/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
>> Phillip J. Eby wrote:
>>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>>>> As an example, the WSGI write callable cannot be implemented in a
>>>> conforming way in Nginx.
>>> ...unless you use a separate thread or process.
>>>
>> I'm insisting about asynchronous support in WSGI because
>> Nginx *does not supports threads*.
>>
>> It has some thread support but it is *broken*.
>> Even if in future the problems are solved, the threading model of Nginx
>> *will break* many existing WSGI applications, since the WSGI iteration
>> can be resumed in different threads.
>>
>> Of course, a WSGI application can use threads, but i think that it
>> *needs* a wsgi.pause_output extension, for synchronization.
> 
> I appreciate that you can't use the thread support in nginx, but what
> I don't understand is why you can't utililise Python threading API (or
> even POSIX threads) at the boundary between nginx and the interface
> into the WSGI application, ie., in the WSGI adapter layer, so as to
> emulate a synchronous style WSGI interface on top of the nginx event
> driven layer. 

This is possible, but I think it is better to offer basic asynchronous
support in WSGI, since that makes it possible to build threading
support in pure Python *and*, more importantly, makes this support
reusable by other implementations.

> In other words you hide all the complexity of any queues
> or other synchronisation mechanisms for communicating any data between
> the two within the adapter. This way you do not need to expose an
> asynchronous API to the WSGI application itself and existing WSGI code
> can be used as is.
> 

The Python threading support can be implemented as "middleware", so it
is transparent to the WSGI application.

Not sure if it can be called "middleware", however.


Regards  Manlio Perillo

From manlio_perillo at libero.it  Sun Oct  7 12:17:29 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sun, 07 Oct 2007 12:17:29 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071007033932.2A44F3A407B@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it>
	<20071007033932.2A44F3A407B@sparrow.telecommunity.com>
Message-ID: <4708B239.6080008@libero.it>

Phillip J. Eby wrote:
> At 05:48 PM 10/6/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby wrote:
>> > At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>> >> As an example, the WSGI write callable cannot be implemented in a
>> >> conforming way in Nginx.
>> >
>> > ...unless you use a separate thread or process.
>> >
>>
>> I'm insisting about asynchronous support in WSGI because
>> Nginx *does not supports threads*.
> 
> Please note that this means you can't run WSGI applications in the same 
> process, then, since WSGI applications can and do block - meaning that 
> the server will stop serving requests.
> 

See the Notes in http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/README.


Regards  Manlio Perillo

From graham.dumpleton at gmail.com  Sun Oct  7 13:04:09 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sun, 7 Oct 2007 21:04:09 +1000
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4708B1E6.4090401@libero.it>
References: <4704222D.30208@colorstudy.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it>
	<88e286470710061547r373b431du7e77b51fc2083614@mail.gmail.com>
	<4708B1E6.4090401@libero.it>
Message-ID: <88e286470710070404i63e47c99wa20ad135a2e364af@mail.gmail.com>

On 07/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Graham Dumpleton wrote:
> > On 07/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> >> Phillip J. Eby wrote:
> >>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
> >>>> As an example, the WSGI write callable cannot be implemented in a
> >>>> conforming way in Nginx.
> >>> ...unless you use a separate thread or process.
> >>>
> >> I'm insisting about asynchronous support in WSGI because
> >> Nginx *does not supports threads*.
> >>
> >> It has some thread support but it is *broken*.
> >> Even if in future the problems are solved, the threading model of Nginx
> >> *will break* many existing WSGI applications, since the WSGI iteration
> >> can be resumed in different threads.
> >>
> >> Of course, a WSGI application can use threads, but i think that it
> >> *needs* a wsgi.pause_output extension, for synchronization.
> >
> > I appreciate that you can't use the thread support in nginx, but what
> > I don't understand is why you can't utililise Python threading API (or
> > even POSIX threads) at the boundary between nginx and the interface
> > into the WSGI application, ie., in the WSGI adapter layer, so as to
> > emulate a synchronous style WSGI interface on top of the nginx event
> > driven layer.
>
> This is possible, but I think that it is better to offer a basic
> asynchronous support in WSGI, since in this way it is possible to build
> threading support in pure Python *and*, more important, this support is
> reusable by other implementations.
>
> > In other words you hide all the complexity of any queues
> > or other synchronisation mechanisms for communicating any data between
> > the two within the adapter. This way you do not need to expose an
> > asynchronous API to the WSGI application itself and existing WSGI code
> > can be used as is.
> >
>
> The Python threading support can be implemented as a "middleware", so it
> is trasparent to the WSGI application.
>
> Not sure if it can be called "middleware", however.

If providing support for synchronous WSGI by using an adapter is how
you would support that, then I think all your problems would be solved
very easily by not pushing for asynchronous support to be added to
WSGI itself. Instead, come up with your own independent asynchronous
Python API for nginx, call it something completely different, and
don't try to get it labeled as being WSGI in some way.

In other words, don't call your nginx module mod_wsgi but mod_pynginx
for example. Having done that, then offer as a separate package a
synchronous WSGI adapter for your mod_pynginx and clearly state that
although your module doesn't support WSGI directly, it does via the
separate WSGI adapter.

The reason you are getting so much push back here on this list is
because you are trying to turn WSGI into something it isn't when
there isn't a need to, as you could still support the current WSGI
specification as is by taking the adapter approach instead.

What you would end up with is not much different to how Apache
mod_python has a number of WSGI adapters available for it. In some
respects it would probably be more attractive to people for you to
provide a Python API for using nginx which better matches how nginx
works and allows the most performance to be gotten out of nginx for
Python applications, without binding yourself to WSGI. That way, if
people choose to work with your lower level API then they could and
write applications specifically for nginx in much the same way that
people write applications specifically for Apache using mod_python.

So, don't try and force your API to be WSGI, and at the same time
don't try and force the WSGI specification to change so you can call
what you are developing WSGI. Doing either is possibly only going to
limit the extent to which you could develop your nginx specific Python
API. You would be much better off doing your API however you want and
calling it something different, but then providing a WSGI adapter for
those who want to run WSGI applications on top of it.

Graham

From ianb at colorstudy.com  Mon Oct  8 02:37:05 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Sun, 07 Oct 2007 19:37:05 -0500
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <4707AE58.3040003@libero.it>
References: <4704222D.30208@colorstudy.com>
	<4704EEDA.1010800@libero.it>	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>	<4704FD32.9020604@libero.it>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>	<20071005173253.86D293A407B@sparrow.telecommunity.com>	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>	<47074F97.8040604@libero.it>	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it>
Message-ID: <47097BB1.502@colorstudy.com>

Manlio Perillo wrote:
> Phillip J. Eby wrote:
>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>>> As an example, the WSGI write callable cannot be implemented in a
>>> conforming way in Nginx.
>> ...unless you use a separate thread or process.
>>
> 
> I'm insisting about asynchronous support in WSGI because
> Nginx *does not supports threads*.
> 
> It has some thread support but it is *broken*.
> Even if in future the problems are solved, the threading model of Nginx 
> *will break* many existing WSGI applications, since the WSGI iteration 
> can be resumed in different threads.

Just so you are aware -- almost all current WSGI applications block, and 
can't be run in asynchronous environments.  So if you are writing WSGI 
support that doesn't support applications that block, well, it won't 
really be able to do much with current WSGI code.


-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers

From manlio_perillo at libero.it  Mon Oct  8 13:02:27 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 08 Oct 2007 13:02:27 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <47097BB1.502@colorstudy.com>
References: <4704222D.30208@colorstudy.com>
	<4704EEDA.1010800@libero.it>	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>	<4704FD32.9020604@libero.it>	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>	<47050CC7.9030500@libero.it>	<20071004161810.060183A407A@sparrow.telecommunity.com>	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>	<20071005173253.86D293A407B@sparrow.telecommunity.com>	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>	<47074F97.8040604@libero.it>	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com>
Message-ID: <470A0E43.3040806@libero.it>

Ian Bicking wrote:
> Manlio Perillo wrote:
>> Phillip J. Eby wrote:
>>> At 11:04 AM 10/6/2007 +0200, Manlio Perillo wrote:
>>>> As an example, the WSGI write callable cannot be implemented in a
>>>> conforming way in Nginx.
>>> ...unless you use a separate thread or process.
>>>
>>
>> I'm insisting about asynchronous support in WSGI because
>> Nginx *does not supports threads*.
>>
>> It has some thread support but it is *broken*.
>> Even if in future the problems are solved, the threading model of 
>> Nginx *will break* many existing WSGI applications, since the WSGI 
>> iteration can be resumed in different threads.
> 
> Just so you are aware -- almost all current WSGI applications block, and 
> can't be run in asynchronous environments.  

Not every WSGI application "blocks" the request processing for a
"significant" amount of time.

A streaming application, as an example, can "block" without problems, 
since nginx mod_wsgi will pause the execution as soon as the application 
output cannot be written at once to the client.

Moreover, as I have already written, using wsgi.pause_output it
should be possible to write a WSGI "component" that runs the entire WSGI
application in a separate thread (but, in this case, it MUST buffer the
entire output, since nginx is not thread safe).
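
A minimal sketch of such a buffering "component", under loud
assumptions: buffered_in_thread and streaming_app are hypothetical
names, and because the exact semantics of the proposed
wsgi.pause_output extension are not given here, the sketch simply
blocks in join() where a real component would pause the output. The
worker thread never calls into the server; it only records what the
application did, and the calling thread replays it afterwards.

```python
import threading


def buffered_in_thread(application):
    """Wrap `application` so it runs entirely in a separate thread."""
    def wrapper(environ, start_response):
        captured = {"body": []}

        def fake_start_response(status, headers, exc_info=None):
            # Record the call instead of touching the real server.
            captured["status"] = status
            captured["headers"] = headers
            return captured["body"].append  # write() just buffers

        def worker():
            app_iter = application(environ, fake_start_response)
            try:
                captured["body"].extend(app_iter)
            finally:
                if hasattr(app_iter, "close"):
                    app_iter.close()

        thread = threading.Thread(target=worker)
        thread.start()
        # Assumption: a real component would pause here via the proposed
        # wsgi.pause_output extension instead of blocking in join().
        thread.join()

        # Only the calling thread talks to the (non-thread-safe) server.
        start_response(captured["status"], captured["headers"])
        return [b"".join(captured["body"])]
    return wrapper


def streaming_app(environ, start_response):
    # Hypothetical application that produces its body in pieces.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"a"
    yield b"b"


seen = []
result = buffered_in_thread(streaming_app)(
    {"REQUEST_METHOD": "GET"},
    lambda status, headers, exc_info=None: seen.append(status),
)
```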

Nginx can also use several worker processes, so it can still (somehow) 
serve "blocking" applications.

> So if you are writing WSGI 
> support that doesn't support applications that block, well, it won't 
> really be able to do much with current WSGI code.
> 

Supporting "legacy" and "huge" WSGI applications is not really a 
priority for me.

I want some support for adding extensions that can be used by other WSGI
implementations that want to support asynchronous applications in an
asynchronous server.

I can add "proprietary" extensions, but Python is already full of
non-interoperable web solutions.


P.S.
Since, as I can see, many people on this mailing list are not interested 
in asynchronous support for WSGI, we can stop this thread (and further 
discussions) here.


Regards  Manlio Perillo

From pje at telecommunity.com  Mon Oct  8 13:17:11 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Oct 2007 07:17:11 -0400
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470A0E43.3040806@libero.it>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com>
	<470A0E43.3040806@libero.it>
Message-ID: <20071008111827.32CD83A407C@sparrow.telecommunity.com>

At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>Supporting "legacy" and "huge" WSGI applications is not really a
>priority for me.

Then you should really make it clear to your users that your Nginx 
module does not support WSGI.  The entire point of WSGI is to allow 
"legacy" (i.e. already-written applications) to be portable across 
servers.  Something that doesn't run existing WSGI apps is clearly not WSGI.


From manlio_perillo at libero.it  Mon Oct  8 13:48:55 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 08 Oct 2007 13:48:55 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <20071008111827.32CD83A407C@sparrow.telecommunity.com>
References: <4704222D.30208@colorstudy.com> <4704EEDA.1010800@libero.it>
	<20071004142648.D2AFB3A407A@sparrow.telecommunity.com>
	<4704FD32.9020604@libero.it>
	<20071004153734.1DFA33A407A@sparrow.telecommunity.com>
	<47050CC7.9030500@libero.it>
	<20071004161810.060183A407A@sparrow.telecommunity.com>
	<bcf87d920710050934i21467943m9c578cfde404b3c@mail.gmail.com>
	<20071005173253.86D293A407B@sparrow.telecommunity.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com>
	<470A0E43.3040806@libero.it>
	<20071008111827.32CD83A407C@sparrow.telecommunity.com>
Message-ID: <470A1927.5080403@libero.it>

Phillip J. Eby wrote:
> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>> Supporting "legacy" and "huge" WSGI applications is not really a
>> priority for me.
> 
> Then you should really make it clear to your users that your Nginx 
> module does not support WSGI.  The entire point of WSGI is to allow 
> "legacy" (i.e. already-written applications) to be portable across 
> servers.  Something that doesn't run existing WSGI apps is clearly not 
> WSGI.
> 

[Here I respond to the latest post of Graham, too.]

Right, but actually nginx mod_wsgi *can* execute every WSGI application 
in a *conforming* way (I'm completing full support for WSGI 2.0, and 
after this I will implement WSGI 1.0).

Of course, some classes of WSGI applications run *better* if they don't
block the nginx process loop too much, so that nginx can serve multiple
requests at the same time.

It is simply a matter of optimized execution.


Regards  Manlio Perillo

From graham.dumpleton at gmail.com  Mon Oct  8 13:53:44 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Mon, 8 Oct 2007 21:53:44 +1000
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <470A1927.5080403@libero.it>
References: <4704222D.30208@colorstudy.com>
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>
	<47074F97.8040604@libero.it>
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>
	<4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com>
	<470A0E43.3040806@libero.it>
	<20071008111827.32CD83A407C@sparrow.telecommunity.com>
	<470A1927.5080403@libero.it>
Message-ID: <88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com>

On 08/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Phillip J. Eby wrote:
> > At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
> >> Supporting "legacy" and "huge" WSGI applications is not really a
> >> priority for me.
> >
> > Then you should really make it clear to your users that your Nginx
> > module does not support WSGI.  The entire point of WSGI is to allow
> > "legacy" (i.e. already-written applications) to be portable across
> > servers.  Something that doesn't run existing WSGI apps is clearly not
> > WSGI.
> >
>
> [Here I respond to the latest post of Graham, too.]
>
> Right, but actually nginx mod_wsgi *can* execute every WSGI application
> in a *conforming* way (I'm completing full support for WSGI 2.0, and
> after this I will implement WSGI 1.0).
>
> Of course some classes of WSGI applications runs *better* if they don't
> block the nginx process loop too much, so that nginx can serve multiple
> requests at the same time.
>
> It is simply a matter of optimized execution.

Do note that there only exists WSGI 1.0. There is no such thing as
WSGI 2.0 as yet and you shouldn't really assume that the list of
proposed ideas for discussion will actually end up producing anything
that looks like what is described. All you can really do at present is
implement WSGI 1.0, anything else is not WSGI and certainly not WSGI
2.0.

Graham

From manlio_perillo at libero.it  Mon Oct  8 13:57:59 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 08 Oct 2007 13:57:59 +0200
Subject: [Web-SIG] WSGI 2.0
In-Reply-To: <88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com>
References: <4704222D.30208@colorstudy.com>	
	<bcf87d920710051413p1aed82c7kc1886e84aaa55f85@mail.gmail.com>	
	<20071005220455.1ABB23A407B@sparrow.telecommunity.com>	
	<47074F97.8040604@libero.it>	
	<20071006143337.D0BFF3A407A@sparrow.telecommunity.com>	
	<4707AE58.3040003@libero.it> <47097BB1.502@colorstudy.com>	
	<470A0E43.3040806@libero.it>	
	<20071008111827.32CD83A407C@sparrow.telecommunity.com>	
	<470A1927.5080403@libero.it>
	<88e286470710080453k619c0a83kfa67c3bde986a67@mail.gmail.com>
Message-ID: <470A1B47.7090908@libero.it>

Graham Dumpleton wrote:
> On 08/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
>> Phillip J. Eby wrote:
>>> At 01:02 PM 10/8/2007 +0200, Manlio Perillo wrote:
>>>> Supporting "legacy" and "huge" WSGI applications is not really a
>>>> priority for me.
>>> Then you should really make it clear to your users that your Nginx
>>> module does not support WSGI.  The entire point of WSGI is to allow
>>> "legacy" (i.e. already-written applications) to be portable across
>>> servers.  Something that doesn't run existing WSGI apps is clearly not
>>> WSGI.
>>>
>> [Here I respond to the latest post of Graham, too.]
>>
>> Right, but actually nginx mod_wsgi *can* execute every WSGI application
>> in a *conforming* way (I'm completing full support for WSGI 2.0, and
>> after this I will implement WSGI 1.0).
>>
>> Of course some classes of WSGI applications runs *better* if they don't
>> block the nginx process loop too much, so that nginx can serve multiple
>> requests at the same time.
>>
>> It is simply a matter of optimized execution.
> 
> Do note that there only exists WSGI 1.0. There is no such thing as
> WSGI 2.0 as yet and you shouldn't really assume that the list of
> proposed ideas for discussion will actually end up producing anything
> that looks like what is described. All you can really do at present is
> implement WSGI 1.0, anything else is not WSGI and certainly not WSGI
> 2.0.
> 

Right, and in the nginx mod_wsgi README I explicitly write that the 
current version is implementing the WSGI *draft*.

The reason I'm implementing the WSGI 2.0 draft is that it allows a
simpler code flow.

> Graham
> 


Regards  Manlio Perillo

From manlio_perillo at libero.it  Mon Oct  8 18:25:00 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 08 Oct 2007 18:25:00 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
Message-ID: <470A59DC.1060905@libero.it>

Phillip J. Eby wrote:
> [...]
> 
> I don't think there's any point to having a WSGI extension for If-* 
> header support.  

I have just found that the WSGI spec says:
"""...it should be clear that a server may handle cache validation via 
the If-None-Match and If-Modified-Since request headers and the 
Last-Modified and ETag response headers."""


So a WSGI implementation is *allowed* to perform cache validation, but 
it is not clear *how* this should be done.

As an example, without the need for an extension, the start_response
callable may check if Last-Modified or ETag is in the headers.
In this case, it may perform a cache validation, and if the client
representation is fresh, it may omit the body.

However there are two problems here:
1) It is not clear if WSGI explicitly allows an implementation to skip
    the iteration over the app_iter object, for optimization purposes
2) For a WSGI implementation embedded in an existing webserver, the
    most convenient method to perform cache validation is to let the
    server do it; however this requires sending the headers as soon as
    start_response is called, and this is not allowed.
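
The freshness check described above could be sketched roughly as
follows. This is a simplified illustration, not what any WSGI server
is known to do: client_copy_is_fresh is a hypothetical name, ETags are
compared as exact strings (no weak-validator handling, and entity tags
containing commas are not supported), and If-Match, If-Unmodified-Since
and If-Range are ignored.

```python
from email.utils import parsedate_to_datetime


def client_copy_is_fresh(environ, headers):
    """Return True if the response could be answered with 304."""
    response = {name.lower(): value for name, value in headers}

    # If-None-Match takes precedence over If-Modified-Since.
    if_none_match = environ.get("HTTP_IF_NONE_MATCH")
    if if_none_match is not None:
        etag = response.get("etag")
        if etag is None:
            return False
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return "*" in candidates or etag in candidates

    if_modified_since = environ.get("HTTP_IF_MODIFIED_SINCE")
    last_modified = response.get("last-modified")
    if if_modified_since and last_modified:
        try:
            return (parsedate_to_datetime(last_modified)
                    <= parsedate_to_datetime(if_modified_since))
        except (TypeError, ValueError):
            return False  # unparseable date: treat as not fresh
    return False
```

A start_response wrapper could call this with the application's
headers and, when it returns True, replace the status with 304.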


Regards  Manlio Perillo

From t.broyer at gmail.com  Mon Oct  8 19:49:06 2007
From: t.broyer at gmail.com (Thomas Broyer)
Date: Mon, 8 Oct 2007 19:49:06 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <470A59DC.1060905@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
Message-ID: <a9699fd20710081049o5d461d90p10bdec34c049fa82@mail.gmail.com>

2007/10/8, Manlio Perillo:
> Phillip J. Eby wrote:
> > [...]
> >
> > I don't think there's any point to having a WSGI extension for If-*
> > header support.
>
> I have just found that the WSGI spec says:
> """...it should be clear that a server may handle cache validation via
> the If-None-Match and If-Modified-Since request headers and the
> Last-Modified and ETag response headers."""
>
>
> So a WSGI implementation is *allowed* to perform cache validation, but
> it is not clear *how* this should be done.
>
> As an example, without the need of an extension, the start_response
> callable may check if Last-Modified or ETag is in the headers.
> In this case, it may perform a cache validation, and if the client
> representation is fresh, it may omit to send the body.
>
> However there are two problems here:
> 1) It is not clear if WSGI explicitly allows an implementation to skip
>    the iteration over the app_iter object, for optimization purpose
> 2) For a WSGI implementation embedded in an existing webserver, the
>    most convenient method to perform cache validation is to let the
>    server do it; however this requires to send the headers as soon as
>    start_response is called, and this is not allowed.

How about (not tested, and simplified to require the app to return an
iterable, and without support for If-Range):

def has_preconditions(environ):
    # True if the request carries any conditional (If-*) header.
    return ("HTTP_IF_MATCH" in environ or
            "HTTP_IF_NONE_MATCH" in environ or
            "HTTP_IF_MODIFIED_SINCE" in environ or
            "HTTP_IF_UNMODIFIED_SINCE" in environ)

def matches_preconditions(environ, headers):
    # TODO: compare the If-* request headers with the ETag and
    # Last-Modified response headers.  Until then, never short-circuit.
    return False

def notmodified_middleware(application):
    def middleware(environ, start_response):
        # Only conditional GET requests need the wrapped start_response.
        if (environ["REQUEST_METHOD"] != "GET" or
                not has_preconditions(environ)):
            return application(environ, start_response)

        notmodified = [False]

        def write_not_supported(data):
            raise NotImplementedError("The write callback is deprecated")

        def sr(status, headers, exc_info=None):
            notmodified[0] = (status[0] == "2" and
                              matches_preconditions(environ, headers))
            if notmodified[0]:
                start_response("304 Not Modified", headers, exc_info)
                return write_not_supported
            return start_response(status, headers, exc_info)

        app_iter = application(environ, sr)
        if notmodified[0]:
            # Empty body; the application's app_iter is ignored.
            return [b""]
        return app_iter
    return middleware


We're still waiting for the app to complete (and return its app_iter)
before sending anything to the client, but this doesn't prevent us from
checking preconditions and, in this case, replacing the status with a 304
Not Modified and an empty body (ignoring the app_iter altogether;
but maybe we should iterate it to allow the wrapped application to
*really* complete its execution).
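
For illustration, the matches_preconditions helper left as a TODO above
could be sketched roughly as follows. This is only a sketch under stated
assumptions, not part of the original message: it uses Python 3 syntax,
looks only at If-None-Match and If-Modified-Since, compares ETags as
opaque strings with the weak comparison, and splits the If-None-Match
list naively on commas.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    # Weak comparison: ignore a leading W/ on either side.
    return tag[2:] if tag.startswith("W/") else tag


def matches_preconditions(environ, headers):
    """Return True when the client's cached copy appears to be fresh.

    Simplified sketch: If-Match, If-Unmodified-Since and If-Range are
    ignored, and ETags are never parsed beyond stripping the W/ prefix.
    """
    response = {name.lower(): value for name, value in headers}

    if_none_match = environ.get("HTTP_IF_NONE_MATCH")
    if if_none_match is not None:
        etag = response.get("etag")
        if etag is None:
            return False
        if if_none_match.strip() == "*":
            return True
        # Naive split: assumes no commas inside the quoted ETag values.
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        return _strip_weak(etag) in candidates

    if_modified_since = environ.get("HTTP_IF_MODIFIED_SINCE")
    last_modified = response.get("last-modified")
    if if_modified_since is not None and last_modified is not None:
        try:
            return (parsedate_to_datetime(last_modified)
                    <= parsedate_to_datetime(if_modified_since))
        except (TypeError, ValueError):
            # Unparseable or incomparable dates: treat as "not fresh".
            return False
    return False
```

If-None-Match is checked first and, when present, decides the outcome on
its own, so If-Modified-Since is only consulted when no entity tag was
sent by the client.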

-- 
Thomas Broyer

From t.broyer at gmail.com  Mon Oct  8 19:51:14 2007
From: t.broyer at gmail.com (Thomas Broyer)
Date: Mon, 8 Oct 2007 19:51:14 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <470A59DC.1060905@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
Message-ID: <a9699fd20710081051w59152902ldde229a14fe02b6@mail.gmail.com>

2007/10/8, Manlio Perillo:
> However there are two problems here:
> 1) It is not clear if WSGI explicitly allows an implementation to skip
>    the iteration over the app_iter object, for optimization purpose
> 2) For a WSGI implementation embedded in an existing webserver, the
>    most convenient method to perform cache validation is to let the
>    server do it; however this requires to send the headers as soon as
>    start_response is called, and this is not allowed.

Oops, sorry, hadn't correctly understood what you were saying. Of
course you're right here.

-- 
Thomas Broyer

From manlio_perillo at libero.it  Mon Oct  8 20:19:57 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 08 Oct 2007 20:19:57 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <a9699fd20710081051w59152902ldde229a14fe02b6@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>	<4703CB72.6080308@libero.it>	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>	<470A59DC.1060905@libero.it>
	<a9699fd20710081051w59152902ldde229a14fe02b6@mail.gmail.com>
Message-ID: <470A74CD.3090602@libero.it>

Thomas Broyer ha scritto:
> 2007/10/8, Manlio Perillo:
>> However there are two problems here:
>> 1) It is not clear if WSGI explicitly allows an implementation to skip
>>    the iteration over the app_iter object, for optimization purpose
>> 2) For a WSGI implementation embedded in an existing webserver, the
>>    most convenient method to perform cache validation is to let the
>>    server do it; however this requires to send the headers as soon as
>>    start_response is called, and this is not allowed.
> 
> Oops, sorry, hadn't correctly understood what you were saying. Of
> course you're right here.
> 

A clarification: this is only an optimization.

Nginx will always do the cache validation (if the appropriate header 
filter is enabled) and will discard the body if the client has a fresh copy.

The same applies to If-Range, but in this case it is not possible to 
optimize the WSGI application execution.


Regards  Manlio Perillo

From pje at telecommunity.com  Mon Oct  8 21:32:48 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Oct 2007 15:32:48 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <470A59DC.1060905@libero.it>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
Message-ID: <20071008193012.4213D3A407A@sparrow.telecommunity.com>

At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > [...]
> >
> > I don't think there's any point to having a WSGI extension for If-*
> > header support.
>
>I have just found that the WSGI spec says:
>"""...it should be clear that a server may handle cache validation via
>the If-None-Match and If-Modified-Since request headers and the
>Last-Modified and ETag response headers."""
>
>
>So a WSGI implementation is *allowed* to perform cache validation, but
>it is not clear *how* this should be done.
>
>As an example, without the need of an extension, the start_response
>callable may check if Last-Modified or ETag is in the headers.
>In this case, it may perform a cache validation, and if the client
>representation is fresh, it may omit to send the body.
>
>However there are two problems here:
>1) It is not clear if WSGI explicitly allows an implementation to skip
>     the iteration over the app_iter object, for optimization purpose
>2) For a WSGI implementation embedded in an existing webserver, the
>     most convenient method to perform cache validation is to let the
>     server do it; however this requires to send the headers as soon as
>     start_response is called, and this is not allowed.

The only time that the headers can be changed is if there is an error 
during the generation of the body content.  So, the fact that 
send_headers() is called with a matching ETag or Last-Modified, is 
sufficient to determine that the request may be handled using a cache.

You are correct that the PEP does not explicitly allow the iteration 
to be skipped.  My thought is that it should indeed allow it, as long 
as the close() method (if any) is still called, and so long as the 
request method was a GET.

With that clarification added to the existing spec, I think it should 
be possible to implement cache validation in a server.

Hopefully, if anybody knows of a reason why this clarification should 
*not* be added to the spec, they will speak up now.  :)
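
As a rough illustration of the clarification proposed above, a server
could skip the iteration while still honouring close(). This is a
minimal sketch in Python 3 syntax; the names send_or_skip_body,
send_body and skip_body are hypothetical and do not come from the PEP or
from any existing server.

```python
def send_or_skip_body(app_iter, send_body, skip_body):
    """Send the response body unless skip_body is true.

    close() is called on the iterable (if it has one) in both cases, so
    the application can release its resources even when the server
    decided, for example after cache validation, not to iterate at all.
    """
    try:
        if not skip_body:
            for chunk in app_iter:
                if chunk:  # empty strings carry no data to transmit
                    send_body(chunk)
    finally:
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()
```

A server would presumably compute skip_body from the request method and
from the outcome of its own precondition check.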


From graham.dumpleton at gmail.com  Tue Oct  9 00:23:53 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 9 Oct 2007 08:23:53 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071008193012.4213D3A407A@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
	<20071008193012.4213D3A407A@sparrow.telecommunity.com>
Message-ID: <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>

On 09/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> >Phillip J. Eby ha scritto:
> > > [...]
> > >
> > > I don't think there's any point to having a WSGI extension for If-*
> > > header support.
> >
> >I have just found that the WSGI spec says:
> >"""...it should be clear that a server may handle cache validation via
> >the If-None-Match and If-Modified-Since request headers and the
> >Last-Modified and ETag response headers."""
> >
> >
> >So a WSGI implementation is *allowed* to perform cache validation, but
> >it is not clear *how* this should be done.
> >
> >As an example, without the need of an extension, the start_response
> >callable may check if Last-Modified or ETag is in the headers.
> >In this case, it may perform a cache validation, and if the client
> >representation is fresh, it may omit to send the body.
> >
> >However there are two problems here:
> >1) It is not clear if WSGI explicitly allows an implementation to skip
> >     the iteration over the app_iter object, for optimization purpose
> >2) For a WSGI implementation embedded in an existing webserver, the
> >     most convenient method to perform cache validation is to let the
> >     server do it; however this requires to send the headers as soon as
> >     start_response is called, and this is not allowed.
>
> The only time that the headers can be changed is if there is an error
> during the generation of the body content.  So, the fact that
> send_headers() is called with a matching ETag or Last-Modified, is
> sufficient to determine that the request may be handled using a cache.
>
> You are correct that the PEP does not explicitly allow the iteration
> to be skipped.  My thought is that it should indeed allow it, as long
> as the close() method (if any) is still called, and so long as the
> request method was a GET.

Why only a GET?

Just showing my ignorance here and would like it explained. :-)

Graham

> With that clarification added to the existing spec, I think it should
> be possible to implement cache validation in a server.
>
> Hopefully, if anybody knows of a reason why this clarification should
> *not* be added to the spec, they will speak up now.  :)

From pje at telecommunity.com  Tue Oct  9 03:10:50 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 08 Oct 2007 21:10:50 -0400
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
	<20071008193012.4213D3A407A@sparrow.telecommunity.com>
	<88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>
Message-ID: <20071009011039.249713A40BF@sparrow.telecommunity.com>

At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote:
>On 09/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> > >Phillip J. Eby ha scritto:
> > > > [...]
> > > >
> > > > I don't think there's any point to having a WSGI extension for If-*
> > > > header support.
> > >
> > >I have just found that the WSGI spec says:
> > >"""...it should be clear that a server may handle cache validation via
> > >the If-None-Match and If-Modified-Since request headers and the
> > >Last-Modified and ETag response headers."""
> > >
> > >
> > >So a WSGI implementation is *allowed* to perform cache validation, but
> > >it is not clear *how* this should be done.
> > >
> > >As an example, without the need of an extension, the start_response
> > >callable may check if Last-Modified or ETag is in the headers.
> > >In this case, it may perform a cache validation, and if the client
> > >representation is fresh, it may omit to send the body.
> > >
> > >However there are two problems here:
> > >1) It is not clear if WSGI explicitly allows an implementation to skip
> > >     the iteration over the app_iter object, for optimization purpose
> > >2) For a WSGI implementation embedded in an existing webserver, the
> > >     most convenient method to perform cache validation is to let the
> > >     server do it; however this requires to send the headers as soon as
> > >     start_response is called, and this is not allowed.
> >
> > The only time that the headers can be changed is if there is an error
> > during the generation of the body content.  So, the fact that
> > send_headers() is called with a matching ETag or Last-Modified, is
> > sufficient to determine that the request may be handled using a cache.
> >
> > You are correct that the PEP does not explicitly allow the iteration
> > to be skipped.  My thought is that it should indeed allow it, as long
> > as the close() method (if any) is still called, and so long as the
> > request method was a GET.
>
>Why only a GET?
>
>Just showing my ignorance here and would like it explained. :-)

Since GET is supposed to be side effect-free, skipping the 
calculation of the response body (by not iterating over it) is less 
likely to cause a problem than with another request method.  I guess 
HEAD would be safe, too.

If we were just now defining WSGI 1.0, I would let it be any method 
and explicitly document that servers doing cache validation or 
processing a HEAD method can skip iteration of the body, so that you 
can get better performance.

However, if we put this language into WSGI 1.0, I'm wary of breaking 
stuff that exists in the field; indeed it might be better just to say 
that it's up to the user to add middleware to do this, rather than 
trying to get a common standard for how servers should handle it.


From graham.dumpleton at gmail.com  Tue Oct  9 03:19:42 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Tue, 9 Oct 2007 11:19:42 +1000
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071009011039.249713A40BF@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
	<20071008193012.4213D3A407A@sparrow.telecommunity.com>
	<88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>
	<20071009011039.249713A40BF@sparrow.telecommunity.com>
Message-ID: <88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com>

On 09/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> At 08:23 AM 10/9/2007 +1000, Graham Dumpleton wrote:
> >On 09/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> > > At 06:25 PM 10/8/2007 +0200, Manlio Perillo wrote:
> > > >Phillip J. Eby ha scritto:
> > > > > [...]
> > > > >
> > > > > I don't think there's any point to having a WSGI extension for If-*
> > > > > header support.
> > > >
> > > >I have just found that the WSGI spec says:
> > > >"""...it should be clear that a server may handle cache validation via
> > > >the If-None-Match and If-Modified-Since request headers and the
> > > >Last-Modified and ETag response headers."""
> > > >
> > > >
> > > >So a WSGI implementation is *allowed* to perform cache validation, but
> > > >it is not clear *how* this should be done.
> > > >
> > > >As an example, without the need of an extension, the start_response
> > > >callable may check if Last-Modified or ETag is in the headers.
> > > >In this case, it may perform a cache validation, and if the client
> > > >representation is fresh, it may omit to send the body.
> > > >
> > > >However there are two problems here:
> > > >1) It is not clear if WSGI explicitly allows an implementation to skip
> > > >     the iteration over the app_iter object, for optimization purpose
> > > >2) For a WSGI implementation embedded in an existing webserver, the
> > > >     most convenient method to perform cache validation is to let the
> > > >     server do it; however this requires to send the headers as soon as
> > > >     start_response is called, and this is not allowed.
> > >
> > > The only time that the headers can be changed is if there is an error
> > > during the generation of the body content.  So, the fact that
> > > send_headers() is called with a matching ETag or Last-Modified, is
> > > sufficient to determine that the request may be handled using a cache.
> > >
> > > You are correct that the PEP does not explicitly allow the iteration
> > > to be skipped.  My thought is that it should indeed allow it, as long
> > > as the close() method (if any) is still called, and so long as the
> > > request method was a GET.
> >
> >Why only a GET?
> >
> >Just showing my ignorance here and would like it explained. :-)
>
> Since GET is supposed to be side effect-free, skipping the
> calculation of the response body (by not iterating over it) is less
> likely to cause a problem than with another request method.  I guess
> HEAD would be safe, too.

Except that, given the way people pass query strings to a GET
instead of using a POST with form data in the body, the fact that a GET
can technically also have a content body, and how people in general abuse
the method type, that probably often isn't the case. This is why I was
querying the distinction, as I'm not sure one can really say it is different
from other methods unless the HTTP specifications indicate as much, at least
in relation to caching. Caching is an area I have never really looked at,
so I don't really know what the specifications say, and this could all
be irrelevant. :-)

Graham

> If we were just now defining WSGI 1.0, I would let it be any method
> and explicitly document that servers doing cache validation or
> processing a HEAD method can skip iteration of the body, so that you
> can get better performance.
>
> However, if we put this language into WSGI 1.0, I'm wary of breaking
> stuff that exists in the field; indeed it might be better just to say
> that it's up to the user to add middleware to do this, rather than
> trying to get a common standard for how servers should handle it.
>
>

From t.broyer at gmail.com  Tue Oct  9 09:05:01 2007
From: t.broyer at gmail.com (Thomas Broyer)
Date: Tue, 9 Oct 2007 09:05:01 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
	<20071008193012.4213D3A407A@sparrow.telecommunity.com>
	<88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>
	<20071009011039.249713A40BF@sparrow.telecommunity.com>
	<88e286470710081819g10f558d9k3ae6683ccfe30d85@mail.gmail.com>
Message-ID: <a9699fd20710090005q1257d8f8ge5574c684e599814@mail.gmail.com>

2007/10/9, Graham Dumpleton <graham.dumpleton at gmail.com>:
> On 09/10/2007, Phillip J. Eby <pje at telecommunity.com> wrote:
> >
> > Since GET is supposed to be side effect-free, skipping the
> > calculation of the response body (by not iterating over it) is less
> > likely to cause a problem than with another request method.  I guess
> > HEAD would be safe, too.
>
> Except that with the way that people use query strings to a GET
> instead of a POST with form data in the body, that GET can technically
> also have a content body, and how people in general abuse the method
> type, that probably often isn't the case. This is why I was querying
> the distinction, as not sure one can really say it is different to
> other methods unless HTTP specifications do indicate as much at least
> in relation to caching. Caching is an area I have never really looked,
> so I don't really know what the specifications say so this could all
> be irrelevant. :-)

Except that in this case, they probably don't send Last-Modified or
ETag headers, or if they do, their value is probably (almost) unique
to the request.
People abusing GET probably don't care about caching, so they won't
plug in or enable such middleware. And even if they did, well, it's
HTTP: such a middleware isn't much different from a caching
proxy/relay.

Note also that there are fewer abuses of GET each day (thanks to Google
Web Accelerator pre-fetching, which highlighted the problem, and to Web
2.0, AJAX and REST becoming widespread and "educating" web developers).

-- 
Thomas Broyer

From manlio_perillo at libero.it  Tue Oct  9 10:43:30 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 09 Oct 2007 10:43:30 +0200
Subject: [Web-SIG] [extension] x-wsgiorg.flush
In-Reply-To: <20071009011039.249713A40BF@sparrow.telecommunity.com>
References: <4703ADE1.5040507@libero.it>
	<20071003165020.23FAA3A407A@sparrow.telecommunity.com>
	<4703CB72.6080308@libero.it>
	<20071003175813.7DCEA3A407A@sparrow.telecommunity.com>
	<470A59DC.1060905@libero.it>
	<20071008193012.4213D3A407A@sparrow.telecommunity.com>
	<88e286470710081523q2b245976w7220d7075fb19d6c@mail.gmail.com>
	<20071009011039.249713A40BF@sparrow.telecommunity.com>
Message-ID: <470B3F32.9020005@libero.it>

Phillip J. Eby ha scritto:
> [...]
> If we were just now defining WSGI 1.0, I would let it be any method and 
> explicitly document that servers doing cache validation or processing a 
> HEAD method can skip iteration of the body, so that you can get better 
> performance.
> 
> However, if we put this language into WSGI 1.0, I'm wary of breaking 
> stuff that exists in the field; indeed it might be better just to say 
> that it's up to the user to add middleware to do this, rather than 
> trying to get a common standard for how servers should handle it.
> 

You can always publish an addendum or errata to WSGI 1.0, or just WSGI 1.1


Regards  Manlio Perillo

From manlio_perillo at libero.it  Mon Oct 15 17:52:58 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 17:52:58 +0200
Subject: [Web-SIG] some questions about start_response implementation
Message-ID: <47138CDA.80808@libero.it>

Hi.

I'm implementing the start_response callable for Nginx mod_wsgi and I 
have a few questions.

1) From the WSGI PEP it seems that an implementation is allowed to
    *always* raise an exception when start_response is called with a
    non-null exc_info.

    Is this true?

2) What happens if an application calls start_response with an incorrect
    status line or headers?

    Should an implementation consider the function "called", so that an
    application can call it a second time, *without* the exc_info
    parameter?

3) How many applications/frameworks use the exc_info parameter for
    start_response?


Thanks  Manlio Perillo

From manlio_perillo at libero.it  Mon Oct 15 17:55:45 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 17:55:45 +0200
Subject: [Web-SIG] some questions about start_response implementation
In-Reply-To: <47138CDA.80808@libero.it>
References: <47138CDA.80808@libero.it>
Message-ID: <47138D81.1090505@libero.it>

Manlio Perillo ha scritto:
> Hi.
> 
> I'm implementing the start_response callable for Nginx mod_wsgi and I 
> have a few questions.
> 
> [...]
>
> 2) What happens if an application call start_response with an incorrect
>     status line or headers?
> 
>     Should an implementation consider the function "called", so that an
                                                       ^^^^^^
                                                     not called

>     application can call it a second time, *without* the exc_info
>     parameter?
> 


Manlio Perillo

From pje at telecommunity.com  Mon Oct 15 18:11:39 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 15 Oct 2007 12:11:39 -0400
Subject: [Web-SIG] some questions about start_response  implementation
In-Reply-To: <47138CDA.80808@libero.it>
References: <47138CDA.80808@libero.it>
Message-ID: <20071015160857.219913A40AF@sparrow.telecommunity.com>

At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote:
>Hi.
>
>I'm implementing the start_response callable for Nginx mod_wsgi and I
>have a few questions.
>
>1) From the WSGI PEP it seems that an implementation is allowed to
>     *always* raise an exception when start_response is called with a not
>     null exc_info.
>
>     Is this true?

Yes - as long as it's the exc_info passed in, i.e.:

     try:
         raise exc_info[0], exc_info[1], exc_info[2]
     finally:
         del exc_info

(this pattern of raising prevents the possibility of a reference 
cycle passing through the current stack location, keeping lots of 
objects around longer than necessary)
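
The three-argument raise shown above is Python 2 syntax. For readers on
Python 3, an equivalent can be sketched as below; the function name
reraise is hypothetical and only serves the illustration.

```python
def reraise(exc_info):
    """Re-raise the exception described by a sys.exc_info() tuple.

    exc_info is deliberately not unpacked into extra local variables,
    and the reference to it is dropped in the finally clause, so this
    frame does not keep the traceback (and everything reachable from
    it) alive through a reference cycle.
    """
    try:
        raise exc_info[1].with_traceback(exc_info[2])
    finally:
        del exc_info
```

The exception that propagates is the original instance, with its
original traceback extended by the frame of reraise itself.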


>2) What happens if an application call start_response with an incorrect
>     status line or headers?
>
>     Should an implementation consider the function "called", so that an
>     application can call it a second time, *without* the exc_info
>     parameter?

Interesting point.  I think it would be compliant either way, though.

(I'm skipping your third question because it doesn't matter how many 
frameworks use exc_info; if you're implementing WSGI 1.0 you have to 
support it.)


From manlio_perillo at libero.it  Mon Oct 15 18:21:01 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 18:21:01 +0200
Subject: [Web-SIG] some questions about start_response  implementation
In-Reply-To: <20071015160857.219913A40AF@sparrow.telecommunity.com>
References: <47138CDA.80808@libero.it>
	<20071015160857.219913A40AF@sparrow.telecommunity.com>
Message-ID: <4713936D.2030001@libero.it>

Phillip J. Eby ha scritto:
> At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote:
>> Hi.
>>
>> I'm implementing the start_response callable for Nginx mod_wsgi and I
>> have a few questions.
>>
>> 1) From the WSGI PEP it seems that an implementation is allowed to
>>     *always* raise an exception when start_response is called with a not
>>     null exc_info.
>>
>>     Is this true?
> 
> Yes - as long as it's the exc_info passed in, i.e.:

It seems that WSGI *does not* require the application to raise the 
exc_info passed.

> 
>     try:
>         raise exc_info[0], exc_info[1], exc_info[2]
>     finally:
>         del exc_info
> 
> (this pattern of raising prevents the possibility of a reference cycle 
> passing through the current stack location, keeping lots of objects 
> around longer than necessary)

Is this a concern for an implementation in C, too?

> 
> 
> 
>> 2) What happens if an application call start_response with an incorrect
>>     status line or headers?
>>
>>     Should an implementation consider the function "called", so that an
>>     application can call it a second time, *without* the exc_info
>>     parameter?
> 
> Interesting point.  I think it would be compliant either way, though.
> 
> (I'm skipping your third question because it doesn't matter how many 
> frameworks use exc_info; if you're implementing WSGI 1.0 you have to 
> support it.)
> 

Well, I'm asking because in the current implementation I always raise an 
exception, thus not allowing an application to "change its mind".

It's not a big problem to improve the code, but I can delay it if not 
really required.


Thanks and regards   Manlio Perillo

From pje at telecommunity.com  Mon Oct 15 18:45:45 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 15 Oct 2007 12:45:45 -0400
Subject: [Web-SIG] some questions about start_response   implementation
In-Reply-To: <4713936D.2030001@libero.it>
References: <47138CDA.80808@libero.it>
	<20071015160857.219913A40AF@sparrow.telecommunity.com>
	<4713936D.2030001@libero.it>
Message-ID: <20071015164302.117163A408F@sparrow.telecommunity.com>

At 06:21 PM 10/15/2007 +0200, Manlio Perillo wrote:
>Phillip J. Eby ha scritto:
> > At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote:
> >> Hi.
> >>
> >> I'm implementing the start_response callable for Nginx mod_wsgi and I
> >> have a few questions.
> >>
> >> 1) From the WSGI PEP it seems that an implementation is allowed to
> >>     *always* raise an exception when start_response is called with a not
> >>     null exc_info.
> >>
> >>     Is this true?
> >
> > Yes - as long as it's the exc_info passed in, i.e.:
>
>It seems that WSGI *does not* requires the application to raise the
>exc_info passed.

We're talking about the *server*, not the application:

"if exc_info is provided, and the HTTP headers have already been 
sent, start_response MUST raise an error, and SHOULD raise the exc_info tuple."

So, it's a "should" for the server, with the intent being that you 
should have some special reason for not doing so.  This is later 
clarified in the PEP as meaning that exception-handling middleware 
may have reasons to raise an alternative error or not raise an 
error.  However, there aren't any anticipated use cases for server 
gateways to do anything but raise the passed-in errors.
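
To make the rule quoted above concrete, here is a minimal sketch of a
pure-Python start_response that follows it, in Python 3 syntax. The
class ResponseState and its attributes are hypothetical, and the body
is merely collected in a list instead of being written to a socket.

```python
class ResponseState:
    """Tracks one response; illustrative only, not a real gateway."""

    def __init__(self):
        self.status = None
        self.headers = None
        self.headers_sent = False
        self.body = []

    def start_response(self, status, headers, exc_info=None):
        if exc_info is not None:
            try:
                if self.headers_sent:
                    # Too late to replace the headers: re-raise the
                    # application's own exception, as the PEP says the
                    # server SHOULD do.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid a dangling circular reference
        elif self.status is not None:
            # A second call without exc_info is an application error.
            raise AssertionError("start_response already called")
        self.status = status
        self.headers = list(headers)
        return self.write

    def write(self, data):
        # The headers count as sent once the first body data goes out.
        self.headers_sent = True
        self.body.append(data)
```

Before any body data has been written, a call with exc_info simply
replaces the stored status and headers; afterwards it re-raises.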


> >
> >     try:
> >         raise exc_info[0], exc_info[1], exc_info[2]
> >     finally:
> >         del exc_info
> >
> > (this pattern of raising prevents the possibility of a reference cycle
> > passing through the current stack location, keeping lots of objects
> > around longer than necessary)
>
>Is this a concern for an implementation in C, too?

No, because local variables in C don't get stored in a Python frame 
or traceback.  The above is only relevant if start_response() is 
written in Python.


>Well, I'm asking because in the current implementation I always raise an
>exception, thus not allowing an application to "change its mind".

Yeah, it's not required for an application to change its mind and 
send different non-error headers.  I don't think that such an 
application would be WSGI compliant if it did.


From manlio_perillo at libero.it  Mon Oct 15 19:04:09 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 19:04:09 +0200
Subject: [Web-SIG] some questions about start_response   implementation
In-Reply-To: <20071015164302.117163A408F@sparrow.telecommunity.com>
References: <47138CDA.80808@libero.it>
	<20071015160857.219913A40AF@sparrow.telecommunity.com>
	<4713936D.2030001@libero.it>
	<20071015164302.117163A408F@sparrow.telecommunity.com>
Message-ID: <47139D89.1050804@libero.it>

Phillip J. Eby ha scritto:
> At 06:21 PM 10/15/2007 +0200, Manlio Perillo wrote:
>> Phillip J. Eby ha scritto:
>> > At 05:52 PM 10/15/2007 +0200, Manlio Perillo wrote:
>> >> Hi.
>> >>
>> >> I'm implementing the start_response callable for Nginx mod_wsgi and I
>> >> have a few questions.
>> >>
>> >> 1) From the WSGI PEP it seems that an implementation is allowed to
>> >>     *always* raise an exception when start_response is called with a not
>> >>     null exc_info.
>> >>
>> >>     Is this true?
>> >
>> > Yes - as long as it's the exc_info passed in, i.e.:
>>
>> It seems that WSGI *does not* requires the application to raise the
>> exc_info passed.
> 
> We're talking about the *server*, not the application:
> 

Sorry, I wrote application, but I meant server :-).

> "if exc_info is provided, and the HTTP headers have already been sent, 
> start_response MUST raise an error, and SHOULD raise the exc_info tuple."
> 
> So, it's a "should" for the server, with the intent being that you 
> should have some special reason for not doing so.  This is later 
> clarified in the PEP as meaning that exception-handling middleware may 
> have reasons to raise an alternative error or not raise an error.  
> However, there aren't any anticipated use cases for server gateways to 
> do anything but raise the passed-in errors.
> 

Ok, thanks for the clarification.

> [...]


Regards  Manlio Perillo

From manlio_perillo at libero.it  Mon Oct 15 22:06:08 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 22:06:08 +0200
Subject: [Web-SIG] some questions about the write callable
Message-ID: <4713C830.2010709@libero.it>

Hi.

The only feature that remains to implement for nginx mod_wsgi is the 
write callable.

The WSGI spec says:
"""In other words, before write() returns, it must guarantee that the 
passed-in string was either completely sent to the client, or that it is 
buffered for transmission while the application proceeds onward."""


With Nginx it can happen that the passed-in string cannot be completely 
sent to the client, since the socket can return EAGAIN.

In this case Nginx will buffer the data and it will send the buffer to 
the client when the socket is ready.

This is fully supported by nginx mod_wsgi, when the application returns 
a generator, since nginx mod_wsgi will suspend the execution of the 
application until the previous buffer has been entirely written to the 
client.


Unfortunately, this is not possible with the write callable.

This means that Nginx will try to send the data to the client, *only* 
when the write function is called.

In other words, the transmission may become stalled if the application 
blocks and a previous passed-in string is in a nginx buffer.


I don't understand why WSGI explicitly says '*must not* delay', instead 
of a 'should not delay'.


There is another, more interesting, problem, however.

As far as I can understand, WSGI does not explicitly forbid an 
application from calling the write callable from a separate thread.
This means that, in theory, this is allowed.

Is this true?
How many applications, if any, do this?

Since Nginx is not thread safe, this *cannot* be supported, really.


If a new WSGI 1.1 spec is going to be released, I hope that it will be 
more friendly with asynchronous servers without threads support.


Thanks  Manlio Perillo

From manlio_perillo at libero.it  Mon Oct 15 23:25:06 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 15 Oct 2007 23:25:06 +0200
Subject: [Web-SIG] some questions about the write callable
In-Reply-To: <4713C830.2010709@libero.it>
References: <4713C830.2010709@libero.it>
Message-ID: <4713DAB2.3050500@libero.it>

Manlio Perillo ha scritto:
> Hi.
> 
> The only feature that remains to implement for nginx mod_wsgi is the 
> write callable.
> 
> The WSGI spec says:
> """In other words, before write() returns, it must guarantee that the 
> passed-in string was either completely sent to the client, or that it is 
> buffered for transmission while the application proceeds onward."""
> 
> 
> With Nginx it can happen that the passed-in string cannot be completely 
> sent to the client, since the socket can returns an EAGAIN.
> 
> In this case Nginx will buffer the data and it will send the buffer to 
> the client when the socket is ready.
> 

A correction.
Nginx will not buffer the data, it will ignore successive write requests.

The buffering must be done by the application.

For the moment I will raise an exception when the data cannot be 
completely written to the client (IMHO the WSGI spec does not forbid 
this, but, of course, it is not very useful).


Regards  Manlio Perillo

From pje at telecommunity.com  Tue Oct 16 00:27:49 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon, 15 Oct 2007 18:27:49 -0400
Subject: [Web-SIG] some questions about the write callable
In-Reply-To: <4713C830.2010709@libero.it>
References: <4713C830.2010709@libero.it>
Message-ID: <20071015222512.EC0B73A408F@sparrow.telecommunity.com>

At 10:06 PM 10/15/2007 +0200, Manlio Perillo wrote:
>Hi.
>
>The only feature that remains to implement for nginx mod_wsgi is the
>write callable.
>
>The WSGI spec says:
>"""In other words, before write() returns, it must guarantee that the
>passed-in string was either completely sent to the client, or that it is
>buffered for transmission while the application proceeds onward."""
>
>
>With Nginx it can happen that the passed-in string cannot be completely
>sent to the client, since the socket can returns an EAGAIN.

In which case, your write() implementation will need to loop until 
all the data hits the OS-level buffers.


>In this case Nginx will buffer the data and it will send the buffer to
>the client when the socket is ready.

Note that the two choices are:

1. data is completely sent to the client
2. data is held in a buffer *such that transmission will continue 
while the app runs*

Buffering the data but not sending it while the application continues 
executing, is not a conformant option.
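
A minimal Python sketch of the first choice, assuming a gateway that owns 
a non-blocking socket. The names make_write and sock are hypothetical and 
are not part of WSGI or nginx mod_wsgi. Note that the sketch waits in 
select() whenever the kernel buffer is full, which is exactly the kind of 
blocking a single-threaded event-driven server may not be able to afford:

```python
import select
import socket


def make_write(sock):
    """Return a WSGI-style write() callable for a non-blocking socket.

    Hypothetical helper: write() returns only once every byte has been
    handed to the OS-level send buffer.
    """
    def write(data):
        view = memoryview(data)
        while view:
            try:
                sent = sock.send(view)
            except BlockingIOError:
                # EAGAIN/EWOULDBLOCK: the kernel buffer is full, so wait
                # until the socket is writable again and retry.
                select.select([], [sock], [])
                continue
            view = view[sent:]
    return write
```

A gateway would hand the result of make_write(client_socket) back from 
start_response().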


>I don't understand why WSGI explicitly says '*must not* delay', instead
>of a 'should not delay'.

Because the only reason for having write() or iteration blocks (vs 
sending a single giant string) is to support interleaving the client 
communication and some other computation, communication, or I/O.

Delay would negate the point of having the ability to stream in the 
first place.


>As far as I can understand, WSGI does not explicitly forbids an
>application to call the write callable from a separate thread.
>This means that, in theory, this is allowed.

In theory, yes.  In practice, we intended to document some 
thread-affinity restrictions, and I do not believe that anybody is 
trying to call write() from another thread.
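
As an illustration only, a gateway that cannot support cross-thread calls 
could refuse them explicitly. The wrapper below is a sketch; the name 
make_guarded_write is hypothetical and nothing in the WSGI spec requires 
or defines this check:

```python
import threading


def make_guarded_write(raw_write):
    """Wrap raw_write so that only the creating thread may call it."""
    owner = threading.get_ident()

    def write(data):
        # Reject calls coming from any thread other than the one that
        # created this callable.
        if threading.get_ident() != owner:
            raise RuntimeError("write() called from a foreign thread")
        raw_write(data)
    return write
```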


>If a new WSGI 1.1 spec is going to be released, I hope that it will be
>more friendly with asynchronous servers without threads support.

Well, I hope that the *documentation* will be more friendly for 
implementing gateways for such servers.  It's doubtful that the 
actual execution model would change much.


From manlio_perillo at libero.it  Tue Oct 16 12:15:16 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 16 Oct 2007 12:15:16 +0200
Subject: [Web-SIG] some questions about the write callable
In-Reply-To: <20071015222512.EC0B73A408F@sparrow.telecommunity.com>
References: <4713C830.2010709@libero.it>
	<20071015222512.EC0B73A408F@sparrow.telecommunity.com>
Message-ID: <47148F34.1020509@libero.it>

Phillip J. Eby wrote:
> At 10:06 PM 10/15/2007 +0200, Manlio Perillo wrote:
>> Hi.
>>
>> The only feature that remains to implement for nginx mod_wsgi is the
>> write callable.
>>
>> The WSGI spec says:
>> """In other words, before write() returns, it must guarantee that the
>> passed-in string was either completely sent to the client, or that it is
>> buffered for transmission while the application proceeds onward."""
>>
>>
>> With Nginx it can happen that the passed-in string cannot be completely
>> sent to the client, since the socket can returns an EAGAIN.
> 
> In which case, your write() implementation will need to loop until all 
> the data hits the OS-level buffers.
> 

It seems that this is not possible with Nginx, but I will investigate 
this problem better, since it is the best solution.

> 
>> In this case Nginx will buffer the data and it will send the buffer to
>> the client when the socket is ready.
> 
> Note that the two choices are:
> 
> 1. data is completely sent to the client
> 2. data is held in a buffer *such that transmission will continue while 
> the app runs*
> 
> Buffering the data but not sending it while the application continues 
> executing, is not a conformant option.
> 
> 
>> I don't understand why WSGI explicitly says '*must not* delay', instead
>> of a 'should not delay'.
> 
> Because the only reason for having write() or iteration blocks (vs 
> sending a single giant string) is to support interleaving the client 
> communication and some other computation, communication, or I/O.
> 
> Delay would negate the point of having the ability to stream in the 
> first place.
> 

You are right, but this is only required by a "real" streaming 
application (one that does not have an "end").

Even if an application needs to serve, as an example, a file of about 100 
MB, buffering should not be a problem (and the Nginx buffering model is 
efficient).

I'm not even sure if HTTP 1.1 allows an "infinite" stream.

> 
>> As far as I can understand, WSGI does not explicitly forbids an
>> application to call the write callable from a separate thread.
>> This means that, in theory, this is allowed.
> 
> In theory, yes.  In practice, we intended to document some 
> thread-affinity restrictions, and I do not believe that anybody is 
> trying to call write() from another thread.
> 
> 
>> If a new WSGI 1.1 spec is going to be released, I hope that it will be
>> more friendly with asynchronous servers without threads support.
> 
> Well, I hope that the *documentation* will be more friendly for 
> implementing gateways for such servers.  It's doubtful that the actual 
> execution model would change much.
> 

Ok, thanks
Manlio Perillo

From manlio_perillo at libero.it  Tue Oct 16 17:42:59 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 16 Oct 2007 17:42:59 +0200
Subject: [Web-SIG] [extension] wsgi.info
Message-ID: <4714DC03.7060003@libero.it>

Hi.

I find it strange that the WSGI environ dictionary contains no 
information about some "details" of the implementation.

I think it would be useful to have a wsgi.info variable that returns a 
tuple with two strings:
- the name of the implementation
- the version of the implementation

Example:
wsgi.info = ('nginx mod_wsgi', '0.0.4')
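
A small sketch of how an application might read such a variable. The 
'wsgi.info' key is only the proposal made here and is not defined by the 
WSGI spec, and gateway_info is a hypothetical helper name:

```python
def gateway_info(environ):
    """Return (name, version) from the proposed 'wsgi.info' key, or None.

    Anything that is not a tuple of exactly two strings is treated as
    if the key were absent.
    """
    info = environ.get("wsgi.info")
    if (isinstance(info, tuple) and len(info) == 2
            and all(isinstance(part, str) for part in info)):
        return info
    return None
```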


Manlio Perillo

From ianb at colorstudy.com  Tue Oct 16 18:10:23 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 16 Oct 2007 11:10:23 -0500
Subject: [Web-SIG] [extension] wsgi.info
In-Reply-To: <4714DC03.7060003@libero.it>
References: <4714DC03.7060003@libero.it>
Message-ID: <4714E26F.8080606@colorstudy.com>

Manlio Perillo wrote:
> Hi.
> 
> I find it strange that the WSGI environ dictionary contains no 
> information about some "details" of the implementation.
> 
> I think it would be useful to have a wsgi.info variable that returns a 
> tuple with two strings:
> - the name of the implementation
> - the version of the implementation
> 
> Example:
> wsgi.info = ('nginx mod_wsgi', '0.0.4')

The details of what implementation?  The server?  The thing that called 
the app?  The thing that called the app and the thing that called it?

OTOH, there's a SERVER_SOFTWARE CGI variable, I believe.

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers

From manlio_perillo at libero.it  Tue Oct 16 18:52:58 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 16 Oct 2007 18:52:58 +0200
Subject: [Web-SIG] [extension] wsgi.info
In-Reply-To: <4714E26F.8080606@colorstudy.com>
References: <4714DC03.7060003@libero.it> <4714E26F.8080606@colorstudy.com>
Message-ID: <4714EC6A.1040903@libero.it>

Ian Bicking wrote:
> Manlio Perillo wrote:
>> Hi.
>>
>> I find it strange that the WSGI environ dictionary contains no 
>> information about some "details" of the implementation.
>>
>> I think it would be useful to have a wsgi.info variable that returns a 
>> tuple with two strings:
>> - the name of the implementation
>> - the version of the implementation
>>
>> Example:
>> wsgi.info = ('nginx mod_wsgi', '0.0.4')
> 
> The details of what implementation?  The server?  The thing that called 
> the app?  

The WSGI gateway.

> The thing that called the app and the thing that called it?
> 

The former.

> OTOH, there's a SERVER_SOFTWARE CGI variable, I believe.
> 

But this refers to the HTTP server.


Regards  Manlio Perillo

From MDiPierro at cti.depaul.edu  Wed Oct 17 06:24:35 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Tue, 16 Oct 2007 23:24:35 -0500
Subject: [Web-SIG] Gluon 1.6
Message-ID: <B81A7CD4-0148-46CC-8D8D-8B5CEE3B49D7@cti.depaul.edu>

I have a new version of Gluon out (known bugs fixed) and a video

   http://www.youtube.com/watch?v=VBjja6N6IYk

Thank you to those who expressed interest.
I would like to stress that this is an open source project released  
under GPL2 and I could really use community input to make it better  
(for example I did not have time to test it with mod_wsgi, I use  
paste httpserver). I say this since the project's Wikipedia page has  
been shut down on the claim that this is a commercial product, which it is not.

Massimo

From graham.dumpleton at gmail.com  Wed Oct 17 07:01:57 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Wed, 17 Oct 2007 15:01:57 +1000
Subject: [Web-SIG] Gluon 1.6
In-Reply-To: <B81A7CD4-0148-46CC-8D8D-8B5CEE3B49D7@cti.depaul.edu>
References: <B81A7CD4-0148-46CC-8D8D-8B5CEE3B49D7@cti.depaul.edu>
Message-ID: <88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com>

Helps if you send a URL for the Gluon web site rather than a YouTube video.

BTW, if it is under the GPL why don't you clearly mention that on the
web site front page. I can't see a reference to GPL or even a link to
a page describing licence used on the front page. Can't seem to see
anything in the FAQ either about the licence used.

Graham

On 17/10/2007, Massimo Di Pierro <MDiPierro at cti.depaul.edu> wrote:
> I have a new version of Gluon out (known bugs fixed) and a video
>
>    http://www.youtube.com/watch?v=VBjja6N6IYk
>
> Thank you to those who expressed interest.
> I would like to stress that this is a open source project released
> under GPL2 and I could really use community input to make it better
> (for example I did not have time to test it with mod_wsgi, I use
> paste httpserver). I say this since the project wikipedia page has
> been shut down, claiming this is a commercial product, which is not.
>
> Massimo

From MDiPierro at cti.depaul.edu  Wed Oct 17 16:29:35 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Wed, 17 Oct 2007 09:29:35 -0500
Subject: [Web-SIG] Gluon 1.6
In-Reply-To: <88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com>
References: <B81A7CD4-0148-46CC-8D8D-8B5CEE3B49D7@cti.depaul.edu>
	<88e286470710162201g8046628h7cba1df95aee6605@mail.gmail.com>
Message-ID: <62BCE0FA-F3CC-4638-8B68-1DD53AF189AA@cti.depaul.edu>

Good point. Just did that (the license is in the code anyway).
The url is http://mdp.cti.depaul.edu/examples
Thank you Graham.

Massimo

On Oct 17, 2007, at 12:01 AM, Graham Dumpleton wrote:

> Helps if you send a URL for the Gluon web site rather than a  
> YouTube video.
>
> BTW, if it is under the GPL why don't you clearly mention that on the
> web site front page. I can't see a reference to GPL or even a link to
> a page describing licence used on the front page. Can't seem to see
> anything in the FAQ either about the licence used.
>
> Graham
>
> On 17/10/2007, Massimo Di Pierro <MDiPierro at cti.depaul.edu> wrote:
>> I have a new version of Gluon out (known bugs fixed) and a video
>>
>>    http://www.youtube.com/watch?v=VBjja6N6IYk
>>
>> Thank you to those who expressed interest.
>> I would like to stress that this is a open source project released
>> under GPL2 and I could really use community input to make it better
>> (for example I did not have time to test it with mod_wsgi, I use
>> paste httpserver). I say this since the project wikipedia page has
>> been shut down, claiming this is a commercial product, which is not.
>>
>> Massimo


From manlio_perillo at libero.it  Fri Oct 19 15:14:35 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 19 Oct 2007 15:14:35 +0200
Subject: [Web-SIG] about the status line in WSGI
Message-ID: <4718ADBB.6030804@libero.it>

Is a WSGI gateway allowed to ignore the Reason-Phrase part of the status 
line returned by the WSGI application, and to use a server defined phrase?


Thanks and regards  Manlio Perillo

From fumanchu at aminus.org  Fri Oct 19 17:14:21 2007
From: fumanchu at aminus.org (Robert Brewer)
Date: Fri, 19 Oct 2007 08:14:21 -0700
Subject: [Web-SIG] about the status line in WSGI
In-Reply-To: <4718ADBB.6030804@libero.it>
References: <4718ADBB.6030804@libero.it>
Message-ID: <F1962646D3B64642B7C9A06068EE1E64B4CB2F@ex10.hostedexchange.local>

Manlio Perillo wrote:
> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the
> status line returned by the WSGI application, and to use a server
> defined phrase?

I would be sad if a WSGI gateway did that to me. Why deny a web
application developer the right to control that part of the output?


Robert Brewer
fumanchu at aminus.org


From manlio_perillo at libero.it  Fri Oct 19 20:42:05 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 19 Oct 2007 20:42:05 +0200
Subject: [Web-SIG] about the status line in WSGI
In-Reply-To: <F1962646D3B64642B7C9A06068EE1E64B4CB2F@ex10.hostedexchange.local>
References: <4718ADBB.6030804@libero.it>
	<F1962646D3B64642B7C9A06068EE1E64B4CB2F@ex10.hostedexchange.local>
Message-ID: <4718FA7D.1030202@libero.it>

Robert Brewer wrote:
> Manlio Perillo wrote:
>> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the
>> status line returned by the WSGI application, and to use a server
>> defined phrase?
> 
> I would be sad if a WSGI gateway did that to me. 
> Why deny a web
> application developer the right to control that part of the output?
> 

The WSGI spec requires a full status line as a simplification for the 
WSGI Gateway and not to give more control to WSGI applications.
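
To make the question concrete, here is a sketch of what "using a server 
defined phrase" would mean. Whether a gateway may do this is the open 
question of this thread; the function names are hypothetical, and the 
phrase table is the one shipped in Python's http.client module:

```python
from http.client import responses


def split_status(status):
    """Split an 'NNN Reason-Phrase' string into (int code, phrase)."""
    code, _, phrase = status.partition(" ")
    return int(code), phrase


def server_defined_status(status):
    """Rebuild the status line with the server's own phrase, if known.

    Codes missing from the server's table keep the application's phrase.
    """
    code, phrase = split_status(status)
    return "%d %s" % (code, responses.get(code, phrase))
```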


Regards  Manlio Perillo

From manlio_perillo at libero.it  Fri Oct 19 20:55:32 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 19 Oct 2007 20:55:32 +0200
Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation
Message-ID: <4718FDA4.9080809@libero.it>

The WSGI spec requires the response headers and sequence items to be, 
respectively, List of Tuples and Strings.

However, it explicitly requires only the response headers to be 
a Python List, i.e. type(response_headers) is ListType.

What about the other objects?

In the current implementation of WSGI for Nginx I always use 
Py[Type]_Check, and not Py[Type]_CheckExact.


Thanks and regards   Manlio Perillo

From ianb at colorstudy.com  Fri Oct 19 21:02:31 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 19 Oct 2007 14:02:31 -0500
Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation
In-Reply-To: <4718FDA4.9080809@libero.it>
References: <4718FDA4.9080809@libero.it>
Message-ID: <4718FF47.40504@colorstudy.com>

Manlio Perillo wrote:
> The WSGI spec requires the response headers and sequence items to be, 
> respectively, List of Tuples and Strings.
> 
> However only for the response headers it explicitly requires them to be 
> a Python List, i.e type(response_headers) is ListType.
> 
> What about the other objects?
> 
> In the current implementation of WSGI for Nginx I always use 
> Py[Type]_Check, and not Py[Type]_CheckExact.

All of the types are required to be exactly as defined, not subclasses 
or None.  But servers are not required to actually test this. 
wsgiref.validate does test for exactly these types, but it's acceptable 
for Nginx to just access the data without checking its exact type.
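
The difference between the two kinds of check can be sketched in Python 
(Python 3 syntax, with str standing in for the native string type). 
check_headers_exact is a hypothetical helper, not the code used by 
wsgiref.validate:

```python
def check_headers_exact(response_headers):
    """Raise TypeError unless the headers use exactly list/tuple/str.

    type(x) is T mirrors Py[Type]_CheckExact and rejects subclasses;
    isinstance(x, T) mirrors Py[Type]_Check and would accept them.
    """
    if type(response_headers) is not list:
        raise TypeError("response_headers must be exactly a list")
    for item in response_headers:
        if type(item) is not tuple or len(item) != 2:
            raise TypeError("each header must be exactly a 2-tuple")
        name, value = item
        if type(name) is not str or type(value) is not str:
            raise TypeError("header names and values must be exactly str")
```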

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org


From manlio_perillo at libero.it  Fri Oct 19 21:43:39 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 19 Oct 2007 21:43:39 +0200
Subject: [Web-SIG] about Py[Type]_Check in a WSGI implementation
In-Reply-To: <4718FF47.40504@colorstudy.com>
References: <4718FDA4.9080809@libero.it> <4718FF47.40504@colorstudy.com>
Message-ID: <471908EB.6050509@libero.it>

Ian Bicking wrote:
> Manlio Perillo wrote:
>> The WSGI spec requires the response headers and sequence items to be, 
>> respectively, List of Tuples and Strings.
>>
>> However only for the response headers it explicitly requires them to 
>> be a Python List, i.e type(response_headers) is ListType.
>>
>> What about the other objects?
>>
>> In the current implementation of WSGI for Nginx I always use 
>> Py[Type]_Check, and not Py[Type]_CheckExact.
> 
> All of the types are required to be exactly as defined, not subclasses 
> or None.  But servers are not required to actually test this. 
> wsgiref.validate does test for exactly these types, but it's acceptable 
> for Nginx to just access the data without checking its exact type.
> 

Ok, thanks.

However, it is not a problem to use Py[Type]_CheckExact instead of 
Py[Type]_Check (and it should not be slower), so if the types are 
required to be exactly as defined I think it is better to do the exact 
check.

In mod_wsgi for Nginx I'm doing a lot of checks (as an example, I even 
check whether the write callable is called from within the application 
iterable).


Manlio Perillo

From graham.dumpleton at gmail.com  Sat Oct 20 11:24:56 2007
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sat, 20 Oct 2007 19:24:56 +1000
Subject: [Web-SIG] about the status line in WSGI
In-Reply-To: <4718FA7D.1030202@libero.it>
References: <4718ADBB.6030804@libero.it>
	<F1962646D3B64642B7C9A06068EE1E64B4CB2F@ex10.hostedexchange.local>
	<4718FA7D.1030202@libero.it>
Message-ID: <88e286470710200224m6e799d73jc6f72d6c93e072ef@mail.gmail.com>

FWIW, I have seen people want to use (mod_python didn't support it
though), the description associated with a status so they could use
different values for a 200 response as part of some strange web
application testing framework.

Graham

On 20/10/2007, Manlio Perillo <manlio_perillo at libero.it> wrote:
> Robert Brewer wrote:
> > Manlio Perillo wrote:
> >> Is a WSGI gateway allowed to ignore the Reason-Phrase part of the
> >> status line returned by the WSGI application, and to use a server
> >> defined phrase?
> >
> > I would be sad if a WSGI gateway did that to me.
> > Why deny a web
> > application developer the right to control that part of the output?
> >
>
> The WSGI spec requires a full status line as a simplification for the
> WSGI Gateway and not to give more control to WSGI applications.
>
>
> Regards  Manlio Perillo

From wilk at flibuste.net  Mon Oct 22 18:09:55 2007
From: wilk at flibuste.net (William Dode)
Date: Mon, 22 Oct 2007 16:09:55 +0000 (UTC)
Subject: [Web-SIG] WebOb
References: <46C3BCB9.6010708@colorstudy.com>
Message-ID: <ffii0j$mnj$1@ger.gmane.org>

Hi,

Since Ian's announcement of WebOb, I have done two things with it.

First, I included it in my personal web framework. It was very easy: I 
just had to remove all my crappy equivalent functions. It makes my 
framework a little cleaner, and I can inherit new features.

Second, and most important, I wanted to start a little project without 
any framework, to minimize the dependencies. So I started from scratch 
with only WebOb, the wsgiref server and part of the example in the 
routing_args specification. I did it very quickly, and the result should 
be compatible with any WSGI-compliant pieces.

So, don't you think web-sig should officially support such a library?
Include it in the standard library or in a wsgiorg library?

Waiting for your view...

-- 
William Dodé  -  http://flibuste.net
Informaticien indépendant

I find it hard to write in English... please don't hesitate to give 
me some advice in private!


From manlio_perillo at libero.it  Mon Oct 22 18:47:52 2007
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Mon, 22 Oct 2007 18:47:52 +0200
Subject: [Web-SIG] WebOb
In-Reply-To: <ffii0j$mnj$1@ger.gmane.org>
References: <46C3BCB9.6010708@colorstudy.com> <ffii0j$mnj$1@ger.gmane.org>
Message-ID: <471CD438.6080208@libero.it>

William Dode wrote:
> Hi,
> 
> Since the announce of ian about webob, i did two things with it.
> 
> First i include it in my personal web framework, it was very easy, i had 
> just to remove all my crappy equivalent functions. It make my framework 
> a little bit more clean and i can inherit new features.
> 
> Second, most important, i wanted to start a little project without any 
> framework to minimize the dependencies. So i started from scratch only 
> with WebOb, the wsgiref server and a part of the example in routing_args 
> specifications. It did it very quickly and the result should be 
> compatible with any wsgi compliant pieces.
> 
> So, don't you think web-sig should officialy support such library ?
> Include it in the lib stantard or in a wsgiorg library ?
> 
> Waiting for your view...
> 

I think that, first of all, we should standardize the utility functions 
for headers handling (parsing and serializing).


Regards  Manlio Perillo

From fdrake at gmail.com  Mon Oct 22 18:58:44 2007
From: fdrake at gmail.com (Fred Drake)
Date: Mon, 22 Oct 2007 12:58:44 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <ffii0j$mnj$1@ger.gmane.org>
References: <46C3BCB9.6010708@colorstudy.com> <ffii0j$mnj$1@ger.gmane.org>
Message-ID: <9cee7ab80710220958k79e0d77do34c55e8218a40889@mail.gmail.com>

On 10/22/07, William Dode <wilk at flibuste.net> wrote:
> So, don't you think web-sig should officialy support such library ?
> Include it in the lib stantard or in a wsgiorg library ?

I'm strongly against adding more non-Python-runtime batteries to the
standard library.  The plethora of packages already there makes
updating individual libraries to get bug fixes or features quite
painful.

This has nothing to do with WebOb in particular; I've not had a chance
to look at that yet.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From guido at python.org  Mon Oct 22 19:01:52 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 22 Oct 2007 10:01:52 -0700
Subject: [Web-SIG] WebOb
In-Reply-To: <46C3BCB9.6010708@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com>
Message-ID: <ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>

2007/8/15, Ian Bicking <ianb at colorstudy.com>:
> Lately I got on a kick and extracted/refined/reimplemented a bunch of
> stuff from Paste.  The result is the not-quite-released WebOb (I don't
> want to do a release until I think people should use it instead of
> Paste, to the degree the two overlap -- and it's not *quite* ready for
> that).

Cool. I already heard through the grapevine about webob.py.

> Anyway, I'd be interested in feedback.  We've talked a little about a
> shared request object -- only a little, and I don't know if it is really
> a realistic goal to even try.  But I think this request object is a
> considerably higher quality than any other request objects out there.
> The response object provides a nice symmetry, as well as facilitating
> testing.  And it's also a very nice response object.

I may be totally behind the times here, but I've always found it odd
to have separate request and response objects -- the functionalities
or APIs don't really overlap, so why not have a single object? I'm
really asking to be educated; I severely hope there's a better reason
than "Java did it this way". :-)

> They are both fairly reasonable to subclass, if there are minor naming
> issues (if there's really missing features, I'd like to add them
> directly -- though for the response object in particular it's likely
> you'll want to subclass to give application defaults, like a default
> content type).
>
> It's based strictly on WSGI, with the request object an almost-stateless
> wrapper around a WSGI environment, and the response object a WSGI
> application that contains mutable status/headers/app_iter.
>
> Almost all the defined HTTP headers are available as attributes on the
> request and/or response.  I try to parse these in as sensible a way as
> possible, e.g., req.if_modified_since is a datetime object (of course
> unparsed access is also available).  Several objects like
> response.cache_control are a bit more complex, since there's no data
> structure that exactly represents them.  I've tried to make them as easy
> to use as possible for realistic web tasks.

I'm interested in something that's as lightweight as possible. Are
there things that take a reasonable time to parse that could be put
off until first use? Perhaps using properties to keep the simplest
possible API (or perhaps not to emphasize the cost of first use)?
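
The deferred-parsing idea could look roughly like the sketch below. 
LazyRequest is a hypothetical class written for illustration and is not 
WebOb's implementation; it relies on functools.cached_property (Python 
3.8 or later) so the header is parsed only on first access:

```python
from email.utils import parsedate_to_datetime
from functools import cached_property


class LazyRequest:
    """Thin wrapper around a WSGI environ that parses headers on demand."""

    def __init__(self, environ):
        self.environ = environ

    @cached_property
    def if_modified_since(self):
        """The If-Modified-Since header as a datetime, or None."""
        raw = self.environ.get("HTTP_IF_MODIFIED_SINCE")
        if not raw:
            return None
        try:
            return parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            # An unparseable date is treated like a missing header.
            return None
```

Creating the object costs nothing beyond storing the environ; the parse 
happens once, the first time the attribute is read.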

> I'm very interested to get any feedback, especially right now when there
> are no backward compatibility concerns.  Right now no critique is too
> large or small.
>
> It's in svn at:
>    http://svn.pythonpaste.org/Paste/WebOb/trunk
>
> And there are fairly complete docs at:
>    http://pythonpaste.org/webob/

I briefly looked at the tutorial and was put off a little by the
interactive prompt style of the examples; that seems so unrealistic
that I wonder if it wouldn't be better to just say "put this in a file
and run it like this"?

> A quick summary of differences in the API and some other
> request/response objects out there:
>    http://pythonpaste.org/webob/differences.html
> I'd include more frameworks, if you can point me to their
> request/response API documentation (e.g., I looked but couldn't find any
> for Zope 3).

I'm not too familiar with other frameworks (having always hacked my
own, as it's so easy :-). Any chance of a summary that's not a
tutorial nor a reference?

> WebOb has a lot more methods and attributes than other libraries, but
> this document points out only things where there are differing names or
> things not in WebOb.  Most other such objects also don't have the same
> WSGI-oriented scope (with the exception of Yaro and paste.wsgiwrappers).
>
> The Request and Response API (extracted docs):
>    http://pythonpaste.org/webob/class-webob.Request.html
>    http://pythonpaste.org/webob/class-webob.Response.html

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From adam at atlas.st  Mon Oct 22 19:54:51 2007
From: adam at atlas.st (Adam Atlas)
Date: Mon, 22 Oct 2007 13:54:51 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <ffii0j$mnj$1@ger.gmane.org>
References: <46C3BCB9.6010708@colorstudy.com> <ffii0j$mnj$1@ger.gmane.org>
Message-ID: <83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st>

On 22 Oct 2007, at 12:09, William Dode wrote:

> So, don't you think web-sig should officialy support such library ?
> Include it in the lib stantard or in a wsgiorg library ?
>

I don't really like the idea of having something like this be part of  
the standard library; it's sort of neither here nor there between low- 
level WSGI and framework territory. I don't see people using  
something like WebOb to write their applications directly (nor does  
that seem to be the intention); just like Paste, it seems more like  
something that full frameworks would incorporate and provide access to.

Given the principle of "there should be one, and preferably only one,  
obvious way to do it", it seems like putting this in the standard  
library would be an endorsement of it as the obvious/best way, and  
although I like the WebOb approach, I don't think there's enough of a  
consensus to bless it thus. For now, the multitude of web frameworks  
and their various philosophies is a good thing.


From tseaver at palladion.com  Mon Oct 22 19:29:17 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 22 Oct 2007 13:29:17 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
Message-ID: <471CDDED.9010208@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Guido van Rossum wrote:

> Cool. I already heard in the grapevibe about webob.py.
> 
>> Anyway, I'd be interested in feedback.  We've talked a little about a
>> shared request object -- only a little, and I don't know if it is really
>> a realistic goal to even try.  But I think this request object is a
>> considerably higher quality than any other request objects out there.
>> The response object provides a nice symmetry, as well as facilitating
>> testing.  And it's also a very nice response object.
> 
> I may be totally behind the times here, but I've always found it odd
> to have separate request and response objects -- the functionalities
> or APIs don't really overlap, so why not have a single object? I'm
> really asking to be educated; I severely hope there's a better reason
> than "Java did it this way". :-)

HTTP has both headers and payload supplied by the client and returned by
the server:  not mixing them up is probably the driving reason for
keeping separate objects.  Of course, you could make one object with
'request' and 'response' attributes, but that wouldn't really be a
simplification.

>> They are both fairly reasonable to subclass, if there are minor naming
>> issues (if there's really missing features, I'd like to add them
>> directly -- though for the response object in particular it's likely
>> you'll want to subclass to give application defaults, like a default
>> content type).
>>
>> It's based strictly on WSGI, with the request object an almost-stateless
>> wrapper around a WSGI environment, and the response object a WSGI
>> application that contains mutable status/headers/app_iter.
>>
>> Almost all the defined HTTP headers are available as attributes on the
>> request and/or response.  I try to parse these in as sensible a way as
>> possible, e.g., req.if_modified_since is a datetime object (of course
>> unparsed access is also available).  Several objects like
>> response.cache_control are a bit more complex, since there's no data
>> structure that exactly represents them.  I've tried to make them as easy
>> to use as possible for realistic web tasks.
> 
> I'm interesting in something that's as lightweight as possible. Are
> there things that take a reasonable time to parse that could be put
> off until first use? Perhaps using properties to keep the simplest
> possible API (or perhaps not to emphasize the cost of first use)?

The only big parsing load is going to be the request payload;
processing top-level request headers is normally trivial,
performance-wise.

I read Ian's concern as being about an API for setting / updating
cache-control response headers[1], because he found no natural mapping
for them as Python primitives.

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
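
To illustrate why Cache-Control has no natural mapping to a single Python
primitive, here is a minimal sketch that maps the header onto a plain dict.
The function names are hypothetical and this is not WebOb's implementation;
it also assumes no directive carries a quoted value containing commas.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict (illustrative only).

    Bare directives such as ``no-cache`` map to True; directives with a
    purely numeric argument (``max-age=3600``) map to an int; any other
    argument is kept as a string.
    """
    directives = {}
    for part in value.split(','):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition('=')
        name = name.strip().lower()
        if not sep:
            directives[name] = True
        else:
            arg = arg.strip().strip('"')
            directives[name] = int(arg) if arg.isdigit() else arg
    return directives


def serialize_cache_control(directives):
    """Turn the dict back into a header value, sorted for stable output."""
    parts = []
    for name, arg in sorted(directives.items()):
        if arg is True:
            parts.append(name)
        elif arg is not None and arg is not False:
            parts.append('%s=%s' % (name, arg))
    return ', '.join(parts)
```

Even this small mapping has to mix booleans, integers and strings in one
container, which is the awkwardness described above.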


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHHN3s+gerLs4ltQ4RAjytAKCNejjJahOz2Q3seKpE4pcRiZ4TCQCgu+J2
FFeSFhO84s9n25M2p3d0VWQ=
=szPr
-----END PGP SIGNATURE-----


From guido at python.org  Mon Oct 22 21:10:54 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 22 Oct 2007 12:10:54 -0700
Subject: [Web-SIG] WebOb
In-Reply-To: <20071022190527.GA15050@smullyan.org>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
	<20071022190527.GA15050@smullyan.org>
Message-ID: <ca471dc20710221210o221b1b4hefda1645685d3515@mail.gmail.com>

Thanks! I stand educated.

2007/10/22, Jacob Smullyan <smulloni at smullyan.org>:
> On Mon, Oct 22, 2007 at 10:01:52AM -0700, Guido van Rossum wrote:
> > 2007/8/15, Ian Bicking <ianb at colorstudy.com>:
> > I may be totally behind the times here, but I've always found it odd
> > to have separate request and response objects -- the functionalities
> > or APIs don't really overlap, so why not have a single object? I'm
> > really asking to be educated; I severely hope there's a better reason
> > than "Java did it this way". :-)
>
> I'm hardly in a position to educate you, but here are my two cents.
>
> The aging but pleasant framework I've used for years, SkunkWeb (which you
> are free to think of as the amiable old drunk of the Python web development
> world) has always had a single Connection object for that reason. However,
> in skunkweb 4, I tossed it away and switched to using WebOb, because,
> although I somewhat prefer the aesthetic elegance of having one object
> rather than two, that preference is very slight, whereas Webob has many
> other advantages -- to my mind it is superbly done and it would be pointless
> to rewrite it -- and in fact I made request and response attributes of a
> single context object, which I suspect many framework authors would do, so
> instead of
>
>    CONNECTION.requestHeaders # SkunkWeb 3
>
> I now have
>
>    Context.request.headers # SkunkWeb 4
>
> which is fine by me.
>
> And there are cases when you might want a request or response without really
> needing the other.  For instance, what would be the point of having WebOb's
> HTTPException classes, which are response subclasses, also be requests?  And
> middleware might not be interested at all in the response -- so why should
> they deal with an object larded with response-specific attributes, and
> possibly requiring those attributes to undergo initialization?  (Well, there
> isn't much initialization necessary, I suppose.) Not having to refer to
> things at times you don't care about them is an architectural good which
> offsets to some degree the clumsiness of having two closely related things
> rather than one when you care about them both.
>
>
> Cheers,
>
> js
>
> --
> Jacob Smullyan
>


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ianb at colorstudy.com  Mon Oct 22 21:26:53 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Oct 2007 14:26:53 -0500
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
Message-ID: <471CF97D.6070808@colorstudy.com>

Guido van Rossum wrote:
>> Anyway, I'd be interested in feedback.  We've talked a little about a
>> shared request object -- only a little, and I don't know if it is really
>> a realistic goal to even try.  But I think this request object is a
>> considerably higher quality than any other request objects out there.
>> The response object provides a nice symmetry, as well as facilitating
>> testing.  And it's also a very nice response object.
> 
> I may be totally behind the times here, but I've always found it odd
> to have separate request and response objects -- the functionalities
> or APIs don't really overlap, so why not have a single object? I'm
> really asking to be educated; I severely hope there's a better reason
> than "Java did it this way". :-)

There are several headers that exist in both the request and the 
response.  For instance, Content-Type, Content-Length, and 
Cache-Control.  Additionally, a lot of headers aren't immediately 
obvious -- is Location a request or response header?  Well, response, 
but if all the headers are mixed together it takes a bit of thought to 
realize that.

The WebOb request and response are mostly representations of the HTTP 
messages, and there's two distinct messages which look very similar, 
which makes them hard to mix into one object.

>> They are both fairly reasonable to subclass, if there are minor naming
>> issues (if there's really missing features, I'd like to add them
>> directly -- though for the response object in particular it's likely
>> you'll want to subclass to give application defaults, like a default
>> content type).
>>
>> It's based strictly on WSGI, with the request object an almost-stateless
>> wrapper around a WSGI environment, and the response object a WSGI
>> application that contains mutable status/headers/app_iter.
>>
>> Almost all the defined HTTP headers are available as attributes on the
>> request and/or response.  I try to parse these in as sensible a way as
>> possible, e.g., req.if_modified_since is a datetime object (of course
>> unparsed access is also available).  Several objects like
>> response.cache_control are a bit more complex, since there's no data
>> structure that exactly represents them.  I've tried to make them as easy
>> to use as possible for realistic web tasks.
> 
> I'm interested in something that's as lightweight as possible. Are
> there things that take a reasonable time to parse that could be put
> off until first use? Perhaps using properties to keep the simplest
> possible API (or perhaps not to emphasize the cost of first use)?

Almost everything is a property.  This is in part because state is kept 
in the native WSGI forms (environ, status, headers, app_iter), so 
everything is calculated off of these.  It also makes instantiation 
relatively light.  Even the request body is left alone until 
request.POST is accessed.
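
A minimal sketch of that approach, assuming a hypothetical LazyRequest
class (this is not WebOb's code, and the 'sketch.post_vars' environ key is
made up for the example): the object holds only the WSGI environ, and each
attribute is a property computed from it on access.

```python
from urllib.parse import parse_qs


class LazyRequest:
    """Hypothetical thin wrapper around a WSGI environ dict."""

    def __init__(self, environ):
        # Instantiation does no parsing at all; it only stores the environ.
        self.environ = environ

    @property
    def method(self):
        return self.environ['REQUEST_METHOD']

    @property
    def content_type(self):
        return self.environ.get('CONTENT_TYPE', '')

    @property
    def GET(self):
        # Parsed from the query string every time it is accessed.
        return parse_qs(self.environ.get('QUERY_STRING', ''))

    @property
    def POST(self):
        # The body is read only on first access; the parsed result is
        # cached in the environ so that state stays in the WSGI dict.
        cached = self.environ.get('sketch.post_vars')
        if cached is None:
            length = int(self.environ.get('CONTENT_LENGTH') or 0)
            body = self.environ['wsgi.input'].read(length)
            cached = parse_qs(body.decode('utf-8'))
            self.environ['sketch.post_vars'] = cached
        return cached
```

Because nothing is parsed in __init__, wrapping an environ costs one
attribute assignment, and wsgi.input is untouched until POST is read.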

>> I'm very interested to get any feedback, especially right now when there
>> are no backward compatibility concerns.  Right now no critique is too
>> large or small.
>>
>> It's in svn at:
>>    http://svn.pythonpaste.org/Paste/WebOb/trunk
>>
>> And there are fairly complete docs at:
>>    http://pythonpaste.org/webob/
> 
> I briefly looked at the tutorial and was put off a little by the
> interactive prompt style of the examples; that seems so unrealistic
> that I wonder if it wouldn't be better to just say "put this in a file
> and run it like this"?

The side effect of doctesting is that docs sometimes look weird :-/

I'm not sure what form the docs should take.  I'm open to suggestions. 
The extracted docs are actually reasonable as a reference, I think:

http://pythonpaste.org/webob/class-webob.Request.html
http://pythonpaste.org/webob/class-webob.Response.html

For realistic use cases, some kind of infrastructure is necessary.  I 
suppose a simple example using the wsgiref server and a plain WSGI app 
would suffice.  Even a very small framework (e.g., 
http://svn.pythonpaste.org/Paste/apps/FlatAtomPub/trunk/flatatompub/dec.py) 
improves that considerably, but probably isn't worth introducing.

>> A quick summary of differences in the API and some other
>> request/response objects out there:
>>    http://pythonpaste.org/webob/differences.html
>> I'd include more frameworks, if you can point me to their
>> request/response API documentation (e.g., I looked but couldn't find any
>> for Zope 3).
> 
> I'm not too familiar with other frameworks (having always hacked my
> own, as it's so easy :-). Any chance of a summary that's not a
> tutorial nor a reference?

Did you look at the file serving example? 
http://pythonpaste.org/webob/file-example.html

I suppose a quick summary would also be possible, covering just the most 
important attributes and with a quick listing of others (like all the 
properties for the individual HTTP headers).


-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers

From smulloni at smullyan.org  Mon Oct 22 21:05:27 2007
From: smulloni at smullyan.org (Jacob Smullyan)
Date: Mon, 22 Oct 2007 15:05:27 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
Message-ID: <20071022190527.GA15050@smullyan.org>

On Mon, Oct 22, 2007 at 10:01:52AM -0700, Guido van Rossum wrote:
> 2007/8/15, Ian Bicking <ianb at colorstudy.com>:
> I may be totally behind the times here, but I've always found it odd
> to have separate request and response objects -- the functionalities
> or APIs don't really overlap, so why not have a single object? I'm
> really asking to be educated; I severely hope there's a better reason
> than "Java did it this way". :-)

I'm hardly in a position to educate you, but here are my two cents.  

The aging but pleasant framework I've used for years, SkunkWeb (which you
are free to think of as the amiable old drunk of the Python web development
world) has always had a single Connection object for that reason. However,
in skunkweb 4, I tossed it away and switched to using WebOb, because,
although I somewhat prefer the aesthetic elegance of having one object
rather than two, that preference is very slight, whereas Webob has many
other advantages -- to my mind it is superbly done and it would be pointless
to rewrite it -- and in fact I made request and response attributes of a
single context object, which I suspect many framework authors would do, so
instead of 
 
   CONNECTION.requestHeaders # SkunkWeb 3
   
I now have

   Context.request.headers # SkunkWeb 4
   
which is fine by me.  
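
As a rough sketch of that arrangement (the class bodies here are
hypothetical stand-ins, not SkunkWeb's or WebOb's actual code), a context
object can simply hold one request and one response:

```python
class Request:
    """Stand-in request: a read-only view over the WSGI environ."""

    def __init__(self, environ):
        self.environ = environ

    @property
    def headers(self):
        # CGI-style HTTP_* keys become conventional header names.
        return {
            key[5:].replace('_', '-').title(): value
            for key, value in self.environ.items()
            if key.startswith('HTTP_')
        }


class Response:
    """Stand-in response: mutable status and header list."""

    def __init__(self, status='200 OK'):
        self.status = status
        self.headers = []


class Context:
    """One object per request cycle, exposing both messages."""

    def __init__(self, environ):
        self.request = Request(environ)
        self.response = Response()
```

Code that only cares about the request can be handed context.request and
never see the response attributes.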

And there are cases when you might want a request or response without really
needing the other.  For instance, what would be the point of having WebOb's
HTTPException classes, which are response subclasses, also be requests?  And
middleware might not be interested at all in the response -- so why should
they deal with an object larded with response-specific attributes, and
possibly requiring those attributes to undergo initialization?  (Well, there
isn't much initialization necessary, I suppose.) Not having to refer to
things at times you don't care about them is an architectural good which
offsets to some degree the clumsiness of having two closely related things
rather than one when you care about them both.


Cheers, 

js

-- 
Jacob Smullyan

From guido at python.org  Mon Oct 22 21:40:18 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 22 Oct 2007 12:40:18 -0700
Subject: [Web-SIG] WebOb
In-Reply-To: <471CF97D.6070808@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
	<471CF97D.6070808@colorstudy.com>
Message-ID: <ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>

2007/10/22, Ian Bicking <ianb at colorstudy.com>:
> > I briefly looked at the tutorial and was put off a little by the
> > interactive prompt style of the examples; that seems so unrealistic
> > that I wonder if it wouldn't be better to just say "put this in a file
> > and run it like this"?
>
> The side effect of doctesting is that docs sometimes look weird :-/

Personally, I find doctest a great tool for writing tests in certain
situations; not so great for writing docs though.

> I'm not sure what form the docs should take.  I'm open to suggestions.
> The extracted docs are actually reasonable as a reference, I think:
>
> http://pythonpaste.org/webob/class-webob.Request.html
> http://pythonpaste.org/webob/class-webob.Response.html

Hm, these are mostly alphabetical listings of individual methods and
properties. I'm still hoping for something that I can read from top to
bottom in 10 minutes and get an idea of what this is and how to use
it.

> For realistic use cases, some kind of infrastructure is necessary.

How realistic are we talking? I'm thinking of something that I can
test by pointing my browser to localhost:8080 or similar. For CGI
scripts, the standard library's CGIHTTPServer would suffice. How hard
is it to create something similar for WSGI or for webob?

> I suppose a simple example using the wsgiref server and a plain WSGI app
> would suffice.  Even a very small framework (e.g.,
> http://svn.pythonpaste.org/Paste/apps/FlatAtomPub/trunk/flatatompub/dec.py)
> improves that considerably, but probably isn't worth introducing.

It's hard to judge that code since it has zero documentation. I was
more looking for something that has a main() which is called when
invoked as a script.

> >> A quick summary of differences in the API and some other
> >> request/response objects out there:
> >>    http://pythonpaste.org/webob/differences.html
> >> I'd include more frameworks, if you can point me to their
> >> request/response API documentation (e.g., I looked but couldn't find any
> >> for Zope 3).
> >
> > I'm not too familiar with other frameworks (having always hacked my
> > own, as it's so easy :-). Any chance of a summary that's not a
> > tutorial nor a reference?
>
> Did you look at the file serving example?
> http://pythonpaste.org/webob/file-example.html

That's the first thing I looked at, and that prompted my comments above. :-)

> I suppose a quick summary would also be possible, covering just the most
> important attributes and with a quick listing of others (like all the
> properties for the individual HTTP headers).

Yes please.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ianb at colorstudy.com  Mon Oct 22 23:39:54 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Mon, 22 Oct 2007 16:39:54 -0500
Subject: [Web-SIG] WebOb
In-Reply-To: <83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st>
References: <46C3BCB9.6010708@colorstudy.com> <ffii0j$mnj$1@ger.gmane.org>
	<83E2A065-9529-4E13-8ED6-13C7725879AC@atlas.st>
Message-ID: <471D18AA.8030702@colorstudy.com>

Adam Atlas wrote:
> On 22 Oct 2007, at 12:09, William Dode wrote:
> 
>> So, don't you think web-sig should officially support such a library?
>> Include it in the standard library or in a wsgiorg library?
>>
> 
> I don't really like the idea of having something like this be part of  
> the standard library; it's sort of neither here nor there between low- 
> level WSGI and framework territory. I don't see people using  
> something like WebOb to write their applications directly (nor does  
> that seem to be the intention); just like Paste, it seems more like  
> something that full frameworks would incorporate and provide access to.

I am certainly not representative of a normal developer, but I have been 
using it quite successfully without any framework.  It also provides 
most of the functionality of WebTest, a framework-neutral functional 
testing tool, as another example.

> Given the principle of "there should be one, and preferably only one,  
> obvious way to do it", it seems like putting this in the standard  
> library would be an endorsement of it as the obvious/best way, and  
> although I like the WebOb approach, I don't think there's enough of a  
> consensus to bless it thus. For now, the multitude of web frameworks  
> and their various philosophies is a good thing.

After actually reading the APIs of the different request objects and 
summarizing the differences, I feel much less like this.  All the major 
frameworks (and almost all the minor frameworks) have request and 
response objects with a subset of the same properties, and some slightly 
different names.  The only really substantial exceptions are Zope and 
CherryPy that have a bunch of traversal-related properties and methods; 
but even these have some parallels in WebOb.

I've also tried to avoid gratuitous incompatibilities with other 
frameworks, and to allow backward compatibility through subclassing when 
there are API differences.  There's still some tricky details -- for 
instance, Django uses a different multi-value dictionary API than WebOb 
uses.  Which is the kind of thing that makes me wish *some* multi-value 
dictionary API existed in the standard library that could serve as a 
reasonable model.  But so it goes.  Even there I switched around WebOb 
some to be closer to Django (to prefer the last value over the first 
value, when getting a single value when multiple values are available).
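
A minimal sketch of such a multi-value dictionary, assuming a hypothetical
MultiDict class (neither WebOb's nor Django's implementation): single-value
access returns the last value supplied, while getall() returns every value
in order.

```python
class MultiDict:
    """Ordered multi-value mapping; d[key] prefers the LAST value."""

    def __init__(self, pairs=()):
        self._items = list(pairs)

    def add(self, key, value):
        # Append without replacing earlier values for the same key.
        self._items.append((key, value))

    def __getitem__(self, key):
        # Scan from the end so the most recent value wins.
        for k, v in reversed(self._items):
            if k == key:
                return v
        raise KeyError(key)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default

    def getall(self, key):
        # All values for the key, in the order they were supplied.
        return [v for k, v in self._items if k == key]
```

For the query string a=1&b=2&a=3, d['a'] gives '3' and d.getall('a') gives
['1', '3'].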

As for actual consensus, Pylons is committed to using it and TurboGears 
by association.  Jacob Kaplan-Moss and Simon Willison have expressed 
specific interest in the idea for Django, though I don't think they've 
had the time to analyze what that would mean specifically.  Jacob 
Smullyan is also using it as we've heard, and I've heard of some other 
smaller/internal frameworks using it.  That's not consensus, but I think 
it points to the possibility of consensus.

As to the standard library, I don't know, there's a lot of issues with 
its development model.  WebOb, unlike a framework, actually *could* 
match the kind of slow and steady progress that the standard library 
has.  But the stdlib might be a bad target even so.


-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org
             : Write code, do good : http://topp.openplans.org/careers

From MDiPierro at cti.depaul.edu  Tue Oct 23 06:42:48 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Mon, 22 Oct 2007 23:42:48 -0500
Subject: [Web-SIG] Gluon again
Message-ID: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu>

I posted a Gluon tutorial here

     http://mdp.cti.depaul.edu/examples/static/cookbook.pdf

it shows step by step how to build a web app to store recipes and  
group them by category.
It is a first draft, so there may be some English errors and typos. Sorry.

Massimo

P.S. I'll never stress it enough. Gluon is GPL2, it is not a  
commercial product. The reason I am emailing you about this is  
because I know I can find experts here and I hope you can help me  
find bugs so that I can fix them and improve Gluon. If there is  
functionality that you need and you think is not there, just let me  
know and I will see what I can do. I would also love to see an ajax  
enthusiast take the challenge to write the first ajax app using  
Gluon, scriptaculous and json. I do provide some free email support  
if you sign up on the Gluon google group.

From std3rr at gmail.com  Tue Oct 23 06:57:24 2007
From: std3rr at gmail.com (Joshua Simpson)
Date: Mon, 22 Oct 2007 21:57:24 -0700
Subject: [Web-SIG] Gluon again
In-Reply-To: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu>
References: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu>
Message-ID: <3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com>

On 10/22/07, Massimo Di Pierro <MDiPierro at cti.depaul.edu> wrote:

> it shows step by step how to build a web app to store recipes and
> group them by category.
> It is a first draft, so there may be some English errors and typos. Sorry.


I'm going to check this out.  Are you from a primarily C background?  Your
builtin functions look, at least in naming convention, suspiciously like
macros.  Your controller design seems to borrow heavily from Django, but I
suppose that's a good thing.

Cheers, I always like to look at new frameworks.

Josh

-- 
-
http://stderr.ws/

From ianb at colorstudy.com  Tue Oct 23 07:45:35 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Oct 2007 00:45:35 -0500
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>	
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>	
	<471CF97D.6070808@colorstudy.com>
	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>
Message-ID: <471D8A7F.2050405@colorstudy.com>

Guido van Rossum wrote:
> 2007/10/22, Ian Bicking <ianb at colorstudy.com>:
>>> I briefly looked at the tutorial and was put off a little by the
>>> interactive prompt style of the examples; that seems so unrealistic
>>> that I wonder if it wouldn't be better to just say "put this in a file
>>> and run it like this"?
>> The side effect of doctesting is that docs sometimes look weird :-/
> 
> Personally, I find doctest a great tool for writing tests in certain
> situations; not so great for writing docs though.

Yeah... I really like it in a lot of ways, but I'm not quite sure what 
the right balance is.  Untested documentation is also very unfortunate; 
too much potential for drift.

>> I'm not sure what form the docs should take.  I'm open to suggestions.
>> The extracted docs are actually reasonable as a reference, I think:
>>
>> http://pythonpaste.org/webob/class-webob.Request.html
>> http://pythonpaste.org/webob/class-webob.Response.html
> 
> Hm, these are mostly alphabetical listings of individual methods and
> properties. I'm still hoping for something that I can read from top to
> bottom in 10 minutes and get an idea of what this is and how to use
> it.

I redid the front page to make it more brief: http://pythonpaste.org/webob/

I stopped with the example, because I couldn't think of a good example. 
  Maybe a different evening.  Suggestions of course welcome.

>> For realistic use cases, some kind of infrastructure is necessary.
> 
> How realistic are we talking? I'm thinking of something that I can
> test by pointing my browser to localhost:8080 or similar. For CGI
> scripts, the standard library's CGIHTTPServer would suffice. How hard
> is it to create something similar for WSGI or for webob?

Well, some kind of WSGI adapter; the wsgiref one is fine.  The file 
example I guess is boring, because without some kind of dispatch you can 
only serve up one file.  A most boring server.

Wiki is a common example, but a little too common at this point.  WebOb 
doesn't offer anything for HTML either, so it would be a somewhat 
unsatisfying example anyway I suspect.

-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

From jim at zope.com  Tue Oct 23 13:11:53 2007
From: jim at zope.com (Jim Fulton)
Date: Tue, 23 Oct 2007 07:11:53 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <471D8A7F.2050405@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com>	
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>	
	<471CF97D.6070808@colorstudy.com>
	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>
	<471D8A7F.2050405@colorstudy.com>
Message-ID: <ACDE20C3-C8C8-476F-AC15-17B39A44CE69@zope.com>

> I redid the front page to make it more brief:
> http://pythonpaste.org/webob/

I suggest a paragraph saying what WebOb is, including what problem it  
is trying to solve.  I'd find this interesting as it is not at all  
clear to me.

Jim

--
Jim Fulton
Zope Corporation


From MDiPierro at cti.depaul.edu  Tue Oct 23 15:32:56 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Tue, 23 Oct 2007 08:32:56 -0500
Subject: [Web-SIG] Gluon again
In-Reply-To: <3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com>
References: <68C644CF-0429-46ED-8762-CB2C57E0C582@cti.depaul.edu>
	<3ed9caa10710222157j3451ad4eg37ac9b53dcf4036e@mail.gmail.com>
Message-ID: <62AB0C0A-64D0-4E5A-BD69-A689774E9B85@cti.depaul.edu>


You are probably referring to the fact that validators and helpers are
upper case. That is because they are not functions but objects. In fact,
validators have an internal state (the parameters for the validation,
the translated error messages, etc.) and helpers have an internal
state (because they are aware of the form they may contain, their
variables and their errors). Example:

a=FORM(TABLE(TR(TD(INPUT(_name='field',requires=IS_NOT_EMPTY()))),
             TR(TD(INPUT(_type='submit')))))
if a.accepts(request.vars,session): ....
if a.errors:...
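
A rough sketch of what "validators are objects with internal state" can
look like in general. The class names and the (value, error) return
convention below are assumptions made for illustration; they are not taken
from Gluon's source.

```python
class IsNotEmpty:
    """Hypothetical validator: state is the error message to report."""

    def __init__(self, error_message='cannot be empty'):
        self.error_message = error_message

    def __call__(self, value):
        # Returns (value, error); error is None when the value is valid.
        if value is None or str(value).strip() == '':
            return value, self.error_message
        return value, None


class IsIntInRange:
    """Hypothetical validator: state is the bounds plus the message."""

    def __init__(self, minimum, maximum, error_message=None):
        self.minimum = minimum
        self.maximum = maximum
        self.error_message = error_message or (
            'enter an integer between %d and %d' % (minimum, maximum))

    def __call__(self, value):
        try:
            number = int(value)
        except (TypeError, ValueError):
            return value, self.error_message
        if not self.minimum <= number <= self.maximum:
            return value, self.error_message
        # On success the converted value is handed back to the caller.
        return number, None
```

Each instance is configured once and then called like a function, which is
why a class reads more naturally here than a bare function.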

At its fundamental level I tried to make Gluon similar to Django, for
two reasons: I know Django (I taught a class on Django here at
DePaul) and I liked it. But I found it has too many functions and too
many modules to remember, so I decided to follow a "convention over
configuration" approach a la RoR.
In Gluon you do not need to import Gluon's modules in your code, nor
do you need to explicitly call the template renderer, for example. Same
logic as Django but simpler to use, I believe.

Massimo

To answer your first question: I teach computer science, mostly  
numerical applications to science and finance, occasionally  
networking stuff and security.
You could say I come from a C++ background. My most important work is
fermiqcd, a C++ library for parallel lattice quantum chromodynamics.


On Oct 22, 2007, at 11:57 PM, Joshua Simpson wrote:

>
>
> On 10/22/07, Massimo Di Pierro <MDiPierro at cti.depaul.edu> wrote:
>
> it shows step by step how to build a web app to store recipes and
> group them by category.
> It is a first draft, so there may be some English errors and typos.
> Sorry.
>
> I'm going to check this out.  Are you from a primarily C  
> background?  Your builtin functions look, at least in naming  
> convention, suspiciously like macros.  Your controller design seems  
> to borrow heavily from Django, but I suppose that's a good thing.
>
> Cheers, I always like to look at new frameworks.
>
> Josh
>
> -- 
> -
> http://stderr.ws/


From guido at python.org  Tue Oct 23 16:01:46 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 23 Oct 2007 07:01:46 -0700
Subject: [Web-SIG] WebOb
In-Reply-To: <471D8A7F.2050405@colorstudy.com>
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
	<471CF97D.6070808@colorstudy.com>
	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>
	<471D8A7F.2050405@colorstudy.com>
Message-ID: <ca471dc20710230701j526b0d7bj337c1bf81692d102@mail.gmail.com>

2007/10/22, Ian Bicking <ianb at colorstudy.com>:
> I redid the front page to make it more brief: http://pythonpaste.org/webob/

Much better; I'll try to review it in more detail later. Right now a
few details jump off the page to me: GET and POST are verbs and IMO
poor names for what they represent; params is usually called query
(isn't it?); and what's the advantage of using Request.blank() instead
of simply Request()?

> I stopped with the example, because I couldn't think of a good example.
>   Maybe a different evening.  Suggestions of course welcome.
>
> >> For realistic use cases, some kind of infrastructure is necessary.
> >
> > How realistic are we talking? I'm thinking of something that I can
> > test by pointing my browser to localhost:8080 or similar. For CGI
> > scripts, the standard library's CGIHTTPServer would suffice. How hard
> > is it to create something similar for WSGI or for webob?
>
> Well, some kind of WSGI adapter; the wsgiref one is fine.  The file
> example I guess is boring, because without some kind of dispatch you can
> only serve up one file.  A most boring server.
>
> Wiki is a common example, but a little too common at this point.  WebOb
> doesn't offer anything for HTML either, so it would be a somewhat
> unsatisfying example anyway I suspect.

The file-serving example has several shortcomings: the presentation
order seems odd, some things are introduced without explanation of
what or why. (Why is UTC imported? Why is mimetypes imported twice?
Why bother with calculating the mime-type at all in the first
example?) Towards the end it seems to go into too many details of
serving up conditional responses and file ranges, which seem better
suited for an advanced manual.

I suggest the wiki-in-one-page would be a better example, even if you
consider it too common (serving static files isn't common? :-).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tseaver at palladion.com  Tue Oct 23 16:14:47 2007
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 23 Oct 2007 10:14:47 -0400
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710230701j526b0d7bj337c1bf81692d102@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>	<471CF97D.6070808@colorstudy.com>	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>	<471D8A7F.2050405@colorstudy.com>
	<ca471dc20710230701j526b0d7bj337c1bf81692d102@mail.gmail.com>
Message-ID: <471E01D7.2050605@palladion.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Guido van Rossum wrote:
> 2007/10/22, Ian Bicking <ianb at colorstudy.com>:
>> I redid the front page to make it more brief: http://pythonpaste.org/webob/
> 
> Much better; I'll try to review it in more detail later. Right now a
> few details jump off the page to me: GET and POST are verbs and IMO
> poor names for what they represent;

Just MHO:  I don't find them that confusing.  Would names like
'GET_data' and 'POST_data' be clearer?  Coming from Zope land, I'm not
used to caring about the distinction between GET and POST (for purposes
of examining the parameters passed in the request), so I'd probably use
'params' instead.

> params is usually called query (isn't it?);

Depends on what you mean by "usually":  in Zope, this is called 'form',
and it represents either the parsed query string (for GET requests) or
the parsed form data from the payload (for POST requests).

> and what's the advantage of using Request.blank() instead
> of simply Request()?

'blank' represents an unusual case:  fabricating a request object
without having a WSGI-compliant environment dict already in hand.  I
kind of like simplifying the "mainline" case (__init__ doesn't have to
sniff whether you passed an environment or not:  you get a TypeError if
you try).
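
A minimal sketch of that design, using a hypothetical stand-in Request
class (not WebOb's code; the default environ values are assumptions):
__init__ takes exactly one environ, and a separate classmethod fabricates
one for the unusual case.

```python
import io


class Request:
    """Stand-in request class showing the alternate-constructor pattern."""

    def __init__(self, environ):
        # Mainline case: always wrap an existing WSGI environ, no sniffing.
        self.environ = environ

    @classmethod
    def blank(cls, path, method='GET'):
        """Build a request from scratch, e.g. for tests."""
        path_info, _, query = path.partition('?')
        environ = {
            'REQUEST_METHOD': method,
            'SCRIPT_NAME': '',
            'PATH_INFO': path_info,
            'QUERY_STRING': query,
            'SERVER_NAME': 'localhost',
            'SERVER_PORT': '80',
            'SERVER_PROTOCOL': 'HTTP/1.0',
            'wsgi.version': (1, 0),
            'wsgi.url_scheme': 'http',
            'wsgi.input': io.BytesIO(b''),
            'wsgi.errors': io.StringIO(),
            'wsgi.multithread': False,
            'wsgi.multiprocess': False,
            'wsgi.run_once': False,
        }
        return cls(environ)
```

Calling Request() with no argument raises TypeError, while
Request.blank('/wiki?page=Home') yields a usable object without a server.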


- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHHgHX+gerLs4ltQ4RAgJtAKCR4s2LFi/Nb4aYgF/aLilwa+PvnwCaAxpI
BsTZMtcoY+NJpI3EQ/RkBKg=
=RQSZ
-----END PGP SIGNATURE-----


From wilk at flibuste.net  Tue Oct 23 16:45:55 2007
From: wilk at flibuste.net (William Dode)
Date: Tue, 23 Oct 2007 14:45:55 +0000 (UTC)
Subject: [Web-SIG] WebOb
References: <46C3BCB9.6010708@colorstudy.com>
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>
	<471CF97D.6070808@colorstudy.com>
	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>
	<471D8A7F.2050405@colorstudy.com>
Message-ID: <ffl1f3$dbh$1@ger.gmane.org>

On 23-10-2007, Ian Bicking wrote:
> I redid the front page to make it more brief: 
> http://pythonpaste.org/webob/

Fine.  

I had to use it to understand what the benefit of webob is; the examples
were not very clear on first read.

The yaro page was clearer to me, for example.

>
> I stopped with the example, because I couldn't think of a good example. 
>   Maybe a different evening.  Suggestions of course welcome.

The difficulty will be to be practical without looking like 'yet another
framework'!

I liked your do-it-yourself-framework. Maybe a webob-only version?

Each example should run alone with copy-paste and wsgiref as server.

Without webob:
--------------

import wsgiref.simple_server

def app(environ, start_response):
    start_response('200 OK', [('content-type', 'text/html')])
    # a WSGI response body is an iterable of byte strings
    return [b'Hello world!']

wsgiref.simple_server.make_server('', 8080, app).serve_forever()


With webob:
-----------

import wsgiref.simple_server
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    res.body = b'Hello world!'  # body is a byte string
    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

With form:
----------

import wsgiref.simple_server
from html import escape
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    you = req.params.get('you')
    if you:
        # escape user input before writing it into the HTML page
        res.body_file.write('Hello %s' % escape(you))
    res.body_file.write('''<form>
    Who are you ? <input name='you'>
    <input type='submit'>
    </form>''')

    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()


With form and cookies:
----------------------

import wsgiref.simple_server
from html import escape
from webob import Response, Request

def app(environ, start_response):
    req = Request(environ)
    res = Response(content_type='text/html')
    you_cookie = req.cookies.get('you')
    if you_cookie:
        # escape client-supplied values before writing them into HTML
        res.body_file.write('I recognize you %s<br>' % escape(you_cookie))
    you = req.params.get('you', you_cookie)
    if you:
        res.body_file.write('Hello %s' % escape(you))
        res.set_cookie('you', you)
    res.body_file.write('''<form>
    Who are you ? <input name='you'>
    <input type='submit'>
    </form>''')

    return res(environ, start_response)

wsgiref.simple_server.make_server('', 8080, app).serve_forever()

-- 
William Dodé  -  http://flibuste.net
Informaticien indépendant


From ianb at colorstudy.com  Tue Oct 23 19:33:26 2007
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 23 Oct 2007 12:33:26 -0500
Subject: [Web-SIG] WebOb
In-Reply-To: <ca471dc20710230701j526b0d7bj337c1bf81692d102@mail.gmail.com>
References: <46C3BCB9.6010708@colorstudy.com>	
	<ca471dc20710221001p140eb439q1036b1d6003609d@mail.gmail.com>	
	<471CF97D.6070808@colorstudy.com>	
	<ca471dc20710221240m3b41f09bo839101d7cc18539c@mail.gmail.com>	
	<471D8A7F.2050405@colorstudy.com>
	<ca471dc20710230701j526b0d7bj337c1bf81692d102@mail.gmail.com>
Message-ID: <471E3066.6050200@colorstudy.com>

Guido van Rossum wrote:
> 2007/10/22, Ian Bicking <ianb at colorstudy.com>:
>> I redid the front page to make it more brief: http://pythonpaste.org/webob/
> 
> Much better; I'll try to review it in more detail later. Right now a
> few details jump off the page to me: GET and POST are verbs and IMO
> poor names for what they represent; 

I generally agree, and initially they were named queryvars and postvars.
But I provided GET and POST aliases for compatibility with both Pylons
and Django, and then I kind of decided that, though they are technically
incorrect (e.g., GET variables are really query string variables, and
can be present in POST requests), it wasn't worth the ambiguity of
aliases, and I didn't want to just change the names.
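
To make the distinction concrete, here is a stdlib-only sketch (not how
WebOb does it; the helper name split_vars and the sample values are made
up for illustration). The query string is parsed whatever the method is,
so a single POST request can carry both kinds of variables:

```python
import io
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

FORM_TYPE = 'application/x-www-form-urlencoded'


def split_vars(environ):
    """Return (query_vars, body_vars) for a WSGI environ (hypothetical helper)."""
    # "GET" variables: always taken from the query string.
    query_vars = parse_qs(environ.get('QUERY_STRING', ''))
    body_vars = {}
    # "POST" variables: only present in a form-encoded request body.
    if (environ.get('REQUEST_METHOD') == 'POST'
            and environ.get('CONTENT_TYPE', '').startswith(FORM_TYPE)):
        length = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(length).decode('ascii')
        body_vars = parse_qs(body)
    return query_vars, body_vars


# A POST request that also has a query string.
payload = b'comment=hello'
environ = {
    'REQUEST_METHOD': 'POST',
    'QUERY_STRING': 'page=2',
    'CONTENT_TYPE': FORM_TYPE,
    'CONTENT_LENGTH': str(len(payload)),
    'wsgi.input': io.BytesIO(payload),
}
setup_testing_defaults(environ)  # fills in the remaining required keys
query_vars, body_vars = split_vars(environ)
```

Here query_vars holds 'page' and body_vars holds 'comment', even though
the request method is POST.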

> params is usually called query (isn't it?);

I'm not aware of any particular convention for this.  In Django it's 
request.REQUEST, in Werkzeug it is req.values, in Webware it was 
accessed with request.value(name), and I believe CherryPy uses 
request.params.  So there isn't any convention that I know of.

> and what's the advantage of using Request.blank() instead
> of simply Request()?

As Tres said, it creates a request from scratch, building the WSGI 
dictionary.  I use it for testing and potentially for artificial 
requests or subrequests (though subrequests usually work better with 
request.copy_get()).  When you are serving an application the WSGI 
environment will always come from the WSGI server.
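
A rough illustration of the testing use, written against plain WSGI with
only the standard library rather than WebOb. The helper name call_app is
hypothetical. It fabricates an environ, calls the application directly,
and captures what would otherwise go to the server:

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!']


def call_app(app, path='/'):
    """Call a WSGI app with a fabricated environ (hypothetical test helper)."""
    environ = {'PATH_INFO': path}
    setup_testing_defaults(environ)  # no server involved
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # WSGI asks callers to close the iterable if it supports it.
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], body


status, headers, body = call_app(app)
```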

>> I stopped with the example, because I couldn't think of a good example.
>>   Maybe a different evening.  Suggestions of course welcome.
>>
>>>> For realistic use cases, some kind of infrastructure is necessary.
>>> How realistic are we talking? I'm thinking of something that I can
>>> test by pointing my browser to localhost:8080 or similar. For CGI
>>> scripts, the standard library's CGIHTTPServer would suffice. How hard
>>> is it to create something similar for WSGI or for webob?
>> Well, some kind of WSGI adapter; the wsgiref one is fine.  The file
>> example I guess is boring, because without some kind of dispatch you can
>> only serve up one file.  A most boring server.
>>
>> Wiki is a common example, but a little too common at this point.  WebOb
>> doesn't offer anything for HTML either, so it would be a somewhat
>> unsatisfying example anyway I suspect.
> 
> The file-serving example has several shortcomings: the presentation
> order seems odd, some things are introduced without explanation of
> what or why. (Why is UTC imported? Why is mimetypes imported twice?
> Why bother with calculating the mime-type at all in the first
> example?) Towards the end it seems to go into too many details of
> serving up conditional responses and file ranges, which seem better
> suited for an advanced manual.
> 
> I suggest the wiki-in-one-page would be a better example, even if you
> consider it too common (serving static files isn't common? :-).

But I love static files!  I wonder if there's an interesting piece of 
middleware I could do -- WebOb makes middleware much easier IMHO.  Of 
course, it's only interesting if you have something on the other end of 
your middleware.
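
For readers who have not written middleware: a minimal plain-WSGI sketch
of the shape involved, using only the standard library (so it does not
show what WebOb adds). The names backend and UppercaseMiddleware are
invented for the example, and the sketch ignores the legacy write()
callable that start_response returns:

```python
from wsgiref.util import setup_testing_defaults


def backend(environ, start_response):
    # The "something on the other end": a trivial application.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


class UppercaseMiddleware(object):
    """Hypothetical middleware: upper-cases the wrapped app's body."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        try:
            # Upper-casing ASCII bytes keeps the length unchanged, so any
            # Content-Length set by the wrapped app stays valid.
            return [chunk.upper() for chunk in result]
        finally:
            if hasattr(result, 'close'):
                result.close()


wrapped = UppercaseMiddleware(backend)

environ = {}
setup_testing_defaults(environ)
statuses = []
chunks = wrapped(environ, lambda status, headers, exc_info=None:
                 statuses.append(status))
```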

Maybe a backend app that serves files and knows GET and PUT, and then 
middleware that turns it into a wiki?  Or is that too clever? 
Authentication middleware with a login page?  Maybe too meta.


-- 
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org

From MDiPierro at cti.depaul.edu  Tue Oct 30 06:26:00 2007
From: MDiPierro at cti.depaul.edu (Massimo Di Pierro)
Date: Tue, 30 Oct 2007 00:26:00 -0500
Subject: [Web-SIG] wsgi?
Message-ID: <C29BFA50-83EC-4C4A-8310-9250B24FFB78@cti.depaul.edu>

I am trying to use Gluon with Apache and mod_wsgi.

This is how Gluon starts now, using Paste's httpserver (serve):

from paste.httpserver import serve

def main(ip='127.0.0.1', port=8000):
    serve(wsgibase, server_version="Something", host=ip, port=str(port))

I am not looking for an explanation; I can figure it out myself, but I
lack the time.
I am looking for a wsgi expert who is interested in Gluon and is
willing to try setting it up with wsgi and to submit one page of
documentation on how to do it, in exchange for a lousy acknowledgment
on the Gluon web site.
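
For context, a minimal sketch of the general shape such a setup takes,
under stated assumptions. mod_wsgi loads a script file and looks for a
module-level callable named "application". The file name gluon.wsgi, the
path in the comment, and the stand-in wsgibase below are all hypothetical;
a real script would import Gluon's own wsgibase instead of defining one:

```python
# Contents of a script file such as gluon.wsgi (hypothetical name), which
# Apache would reference with a directive along the lines of:
#     WSGIScriptAlias / /path/to/gluon.wsgi


def wsgibase(environ, start_response):
    # Stand-in for Gluon's real WSGI callable, defined here only so the
    # sketch is self-contained.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'placeholder for the Gluon application']


# mod_wsgi looks up this module-level name; no serve() call is needed
# because Apache itself acts as the server.
application = wsgibase
```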

Massimo