From ptspts at gmail.com Sun Jan 3 17:18:33 2010
From: ptspts at gmail.com (Péter Szabó)
Date: Sun, 3 Jan 2010 17:18:33 +0100
Subject: [Web-SIG] WSGI (PEP 333) error handling questions
In-Reply-To: <4fa38b911001030756m73ba990q230d291480cead03@mail.gmail.com>
References: <4fa38b911001030756m73ba990q230d291480cead03@mail.gmail.com>
Message-ID: <4fa38b911001030818u4fbd20a6yaca4538e6869389f@mail.gmail.com>

Hi,

After reading http://www.python.org/dev/peps/pep-0333/ it is not clear to
me what a WSGI server should do in the following error situations:

S1. An I/O error (such as Connection reset by peer) occurs when reading
    environ['wsgi.input'].

S2. An I/O error occurs when writing environ['wsgi.errors'].

S3. An I/O error occurs when start_response is sending the HTTP response
    headers.

S4. An I/O error (such as Connection reset by peer or Broken pipe)
    occurs when writing to the write() callable returned by
    start_response().

S5. An I/O error (such as Connection reset by peer or Broken pipe)
    occurs when yielding some HTTP response strings.

My guesses would be:

S1.  Raise an IOError. If the application doesn't catch it, then abort
     the call to the application, and close the HTTP connection.

S2a. Same as S1.

S2b. Ignore the error silently.

S3a. Same as S1.

S3b. Close the HTTP connection, let the application call continue, and
     ignore the error silently.

S4a. Same as S1.

S4b. Close the HTTP connection, let the application call continue, and
     ignore the error silently.

S5a. Abort the application call (without giving the application a chance
     to catch the exception) and close the HTTP connection.

S5b. Close the HTTP connection, let the application call continue, and
     ignore the error silently.

Could you please clarify what should happen in these situations?

Thanks,

Péter

From arw1961 at yahoo.com Sun Jan 3 19:04:10 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Sun, 3 Jan 2010 10:04:10 -0800 (PST)
Subject: [Web-SIG] wsgi write=start_response() and iterable return?
In-Reply-To: <4fa38b911001030818u4fbd20a6yaca4538e6869389f@mail.gmail.com>
Message-ID: <855401.415.qm@web32002.mail.mud.yahoo.com>

> S4. An I/O error (such as Connection reset by peer or Broken pipe)
> occurs when writing to the write() callable returned by
> start_response().

Interesting. I had totally missed the write() callable return value
required by start_response. If an application returns an iterable
response and *also* calls write()... what is supposed to happen?
Yikes. This may require some careful adjustments to WHIFF. I had run
into this and hacked around it on an ad hoc basis, assuming it was a
mistake.

-- Aaron Watters

===
less is more

From bchesneau at gmail.com Sun Jan 3 23:03:51 2010
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Sun, 3 Jan 2010 23:03:51 +0100
Subject: [Web-SIG] gunicorn 0.1 - new WSGI HTTP Server
Message-ID:

Hi,

A quick mail to announce gunicorn 'Green Unicorn' 0.1. It is a WSGI
HTTP server for UNIX, for fast clients and nothing else. This is a port
of Unicorn (http://unicorn.bogomips.org/) to Python.

You can find it here:

http://pypi.python.org/pypi/gunicorn/0.1

Current features are limited to choosing the number of workers/cores
you want to use and the ip/port. There is one command, gunicorn_django,
that allows you to launch any Django project.

Any feedback is appreciated,

- Benoit

From arw1961 at yahoo.com Mon Jan 4 17:42:00 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Mon, 4 Jan 2010 08:42:00 -0800 (PST)
Subject: [Web-SIG] wsgi write=start_response() and iterable return?
In-Reply-To: <855401.415.qm@web32002.mail.mud.yahoo.com>
Message-ID: <763604.20162.qm@web32007.mail.mud.yahoo.com>

> From: Aaron Watters
> ....
> If an application returns an iterable response and *also*
> calls the write()...
> what is supposed to happen?

After carefully considering all the responses on this issue ;c)
I came up with the following strategy for dealing with calls to
write() in combination with an iterable response: see

  http://listtree.appspot.com/listtreeNotes/qFxCJOYB2xkf2vyQS5L$AA

This wrapper implementation diverts calls to write() into the iterable
response so the rest of the system can ignore the write() function.
I'd be very happy if some of you would take a quick look and see if
this makes sense to you.

Thanks in advance,

-- Aaron Watters

===
less is more

From pje at telecommunity.com Mon Jan 4 22:38:15 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 04 Jan 2010 16:38:15 -0500
Subject: [Web-SIG] wsgi write=start_response() and iterable return?
In-Reply-To: <763604.20162.qm@web32007.mail.mud.yahoo.com>
References: <855401.415.qm@web32002.mail.mud.yahoo.com>
 <763604.20162.qm@web32007.mail.mud.yahoo.com>
Message-ID: <20100104213820.2FAD33A4100@sparrow.telecommunity.com>

At 08:42 AM 1/4/2010 -0800, Aaron Watters wrote:
> > From: Aaron Watters
> > ....
> > If an application returns an iterable response and *also*
> > calls the write()... what is supposed to happen?
>
>After carefully considering all the responses on this issue ;c)
>I came up with the following strategy for dealing with calls to
>write() in combination with an iterable response: see
>
>  http://listtree.appspot.com/listtreeNotes/qFxCJOYB2xkf2vyQS5L$AA
>
>This wrapper implementation diverts calls to write() into the iterable
>response so the rest of the system can ignore the write()
>function(). I'd be very happy if some of you would take a quick
>look and see if this makes sense to you.
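
The strategy described above (diverting write() calls into the iterable
response) could be sketched roughly as follows. This is only an
illustration under stated assumptions, not the WHIFF implementation
behind the link: the name divert_write, the list-based buffer, and the
demo application are all hypothetical.

```python
def divert_write(app):
    """Wrap a WSGI app so that data passed to write() is emitted
    through the returned iterable instead (hypothetical helper)."""
    def wrapped(environ, start_response):
        written = []  # chunks the application passed to write()

        def diverting_start_response(status, headers, exc_info=None):
            # Forward to the server, but ignore the server's write()
            # and hand the application a buffer-appending one instead.
            start_response(status, headers, exc_info)
            return written.append

        def drain():
            # Yield buffered chunks in the order they were written.
            while written:
                yield written.pop(0)

        result = app(environ, diverting_start_response)
        try:
            yield from drain()          # writes made before returning
            for chunk in result:
                yield from drain()      # writes made while producing chunk
                yield chunk
            yield from drain()
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()
    return wrapped


# A small demo application that uses both idioms.
def demo_app(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"written, ")
    return [b"then yielded"]

statuses = []

def fake_server_start_response(status, headers, exc_info=None):
    statuses.append(status)
    return lambda data: None  # a server-side write() that is never used

body = b"".join(divert_write(demo_app)({}, fake_server_start_response))
```

With a wrapper like this, everything downstream only ever sees an
iterable, so the order of written and yielded data is decided in one
place.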

Do note that an application which calls write() from an iterator body
is *not* WSGI compliant, as described under:

http://www.python.org/dev/peps/pep-0333/#the-write-callable

"""Applications MUST NOT invoke write() from within their return
iterable, and therefore any strings yielded by the iterable are
transmitted after all strings passed to write() have been sent to the
client."""

In practice, however, wsgiref.handlers treats write() and yield as
interchangeable, and wsgiref.validate doesn't complain if an
application calls write() from within an iteration. :-(

From me at gustavonarea.net Tue Jan 5 00:31:24 2010
From: me at gustavonarea.net (Gustavo Narea)
Date: Mon, 4 Jan 2010 23:31:24 +0000
Subject: [Web-SIG] wsgiorg.routing_path addition to the
 wsgiorg.routing_args Specification
Message-ID: <201001042331.24729.me@gustavonarea.net>

Hello everybody.

The current wsgiorg.routing_args specification requires that "Portions
of the path that have been parsed should still be moved to SCRIPT_NAME
(and removed from PATH_INFO)", but:

 1.- That's against semantics. According to PEP 333 and the CGI spec,
     SCRIPT_NAME and PATH_INFO must represent the path where the (WSGI)
     application is "mounted" and the location of the request's target,
     respectively.
 2.- It's not possible to reconstruct URLs reliably. After these
     variables have been modified, any attempt to reconstruct the home
     page's URL will be erroneous, for example.
 3.- PATH_INFO will end up useless in many requests. For example, if a
     request matches the pattern "/posts/{article_title}/", these
     variables would have the following values:
         SCRIPT_NAME = "/blog/posts/hello-world"
         PATH_INFO = "/"

I understand the reasoning behind a "cleaner" path, but I think taking
data out of the PATH_INFO is not the best approach. Even if we only
remove the matches alone, retaining the characters in between (instead
of taking everything up to the last position of the match), we'd only
be solving the third problem.
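
Points 2 and 3 above can be made concrete with a small sketch. It
assumes an application mounted at "/blog" on a made-up host;
consume_matched_path is a hypothetical stand-in for a dispatcher that
follows the quoted rule, while wsgiref.util.application_uri is the
standard library function that rebuilds the application's base URL
from SCRIPT_NAME.

```python
from wsgiref.util import application_uri

def consume_matched_path(environ, matched):
    """Move the parsed prefix from PATH_INFO to SCRIPT_NAME
    (hypothetical helper mimicking the quoted rule)."""
    assert environ["PATH_INFO"].startswith(matched)
    environ["SCRIPT_NAME"] += matched
    environ["PATH_INFO"] = environ["PATH_INFO"][len(matched):]

# Assumed request: the application is mounted at /blog.
environ = {
    "wsgi.url_scheme": "http",
    "HTTP_HOST": "example.org",
    "SCRIPT_NAME": "/blog",
    "PATH_INFO": "/posts/hello-world/",
}

home_before = application_uri(environ)   # http://example.org/blog
consume_matched_path(environ, "/posts/hello-world")
home_after = application_uri(environ)    # no longer the home page
```

After the dispatcher has run, code that rebuilds the home page URL from
SCRIPT_NAME gets the article's URL instead, and PATH_INFO has shrunk to
"/".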

So I'd like to propose the introduction of a new variable in the WSGI
environment, wsgiorg.routing_path, which would be the PATH_INFO with
all the arguments removed.

Dispatchers would not have to modify SCRIPT_NAME or PATH_INFO. Instead,
they should:

 1.- Take the arguments from PATH_INFO and put them into
     wsgiorg.routing_args (as they do now).
 2.- Store the PATH_INFO without arguments in wsgiorg.routing_path.

Example 1
---------
Pattern = "/posts/{article_title}/"
PATH_INFO = "/posts/hello-world/"
wsgiorg.routing_args = ((), {'article_title': "hello-world"})
wsgiorg.routing_path = "/posts/"

Example 2
---------
Pattern = "/posts/{article_title}/edit"
PATH_INFO = "/posts/hello-world/edit"
wsgiorg.routing_args = ((), {'article_title': "hello-world"})
wsgiorg.routing_path = "/posts/edit"

This information would be useful in a number of situations, such as:

 1.- An authorization framework could allow developers to write access
     controls based on the arguments-free path (i.e.,
     wsgiorg.routing_path) and then use the arguments (in
     wsgiorg.routing_args) for more specific controls (if any).
 2.- Templates can change automatically depending on the
     arguments-free path.

... which are not possible at present.

What do you think about this?

Cheers.
--
Gustavo Narea .
| Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about |

From arw1961 at yahoo.com Tue Jan 5 15:01:06 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Tue, 5 Jan 2010 06:01:06 -0800 (PST)
Subject: [Web-SIG] wsgi write=start_response() and iterable return?
In-Reply-To: <20100104213820.2FAD33A4100@sparrow.telecommunity.com>
Message-ID: <39143.90569.qm@web32004.mail.mud.yahoo.com>

--- On Mon, 1/4/10, P.J. Eby wrote:

> From: P.J. Eby
> Subject: Re: [Web-SIG] wsgi write=start_response() and iterable return?
> To: "Aaron Watters" , web-sig at python.org
> Date: Monday, January 4, 2010, 4:38 PM
> At 08:42 AM 1/4/2010 -0800, Aaron Watters wrote:
> > > From: Aaron Watters
> > > ....
> > > If an application returns an iterable response and *also*
> > > calls the write()... what is supposed to happen?
> >
> >After carefully considering all the responses on this issue ;c)
> >I came up with the following strategy for dealing with calls to
> >write() in combination with an iterable response: see
> >
> >  http://listtree.appspot.com/listtreeNotes/qFxCJOYB2xkf2vyQS5L$AA
> >
> >This wrapper implementation diverts calls to write() into the iterable
> >response so the rest of the system can ignore the write()
> >function(). I'd be very happy if some of you would take a quick
> >look and see if this makes sense to you.
>
> Do note that an application which calls write() from an iterator body
> is *not* WSGI compliant, as described under:
> ....
> In practice, however, wsgiref.handlers treats write() and yield as
> interchangeable, and wsgiref.validate doesn't complain if an
> application calls write() from within an iteration.. :-(

... And I'm not sure that a complicated program might not do this,
even if it was not intended, if it uses both idioms. In fact, since
WHIFF is designed to combine external components, you can probably
confuse WHIFF (or other infrastructure tools) into mixing the modes
even if the tool doesn't mix modes directly.

I'd like to see the write() callable removed from future versions of
WSGI; wrappers like the one I referenced could provide backwards
compatibility for old-style apps.

-- Aaron Watters

===
less is more

From tseaver at palladion.com Wed Jan 6 02:12:39 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 05 Jan 2010 20:12:39 -0500
Subject: [Web-SIG] gunicorn 0.1 - new WSGI HTTP Server
In-Reply-To:
References:
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Benoit Chesneau wrote:
> Hi,
>
> Quick mail to announce the gunicorn 'Green Unicorn' 0.1. it is a WSGI
> HTTP Server for UNIX, fast clients and nothing else. This is a port of
> Unicorn (http://unicorn.bogomips.org/) in Python.
>
> You can find it here :
>
> http://pypi.python.org/pypi/gunicorn/0.1
>
> Current features are limited to the choice of number of workerts/cores
> you want to use and the ip/port. There are one command gunicorn_django
> that allows you to launch any django project.
>
> Any feedback is appreciated,

Interesting: how are you detecting slow clients in production, given
that the WSGI server itself is only supposed to be used for fast ones?
I'm assuming that there must be some kind of heuristic-applying proxy
you run in front of unicorn/gunicorn. Or do you just not bother, and
let slow clients see failed responses?


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktD44cACgkQ+gerLs4ltQ5FcQCdHB2GL4mnB1I7Pr+e6iT/86Ct
N68An2S9HfB9Fp4bFd+oeoJzboiXQVfF
=1HOW
-----END PGP SIGNATURE-----

From me at gustavonarea.net Wed Jan 6 11:42:17 2010
From: me at gustavonarea.net (Gustavo Narea)
Date: Wed, 6 Jan 2010 10:42:17 +0000
Subject: [Web-SIG] wsgiorg.routing_path addition to the
 wsgiorg.routing_args Specification
In-Reply-To: <201001042331.24729.me@gustavonarea.net>
References: <201001042331.24729.me@gustavonarea.net>
Message-ID: <7022d0871001060242q5e70b567uebb86bd5d019e2b1@mail.gmail.com>

Is it a really bad suggestion? :(

 - G.

From arw1961 at yahoo.com Thu Jan 7 17:03:20 2010
From: arw1961 at yahoo.com (Aaron Watters)
Date: Thu, 7 Jan 2010 08:03:20 -0800 (PST)
Subject: [Web-SIG] wsgiorg.routing_path addition to the
 wsgiorg.routing_args Specification
In-Reply-To: <7022d0871001060242q5e70b567uebb86bd5d019e2b1@mail.gmail.com>
Message-ID: <384555.32480.qm@web32008.mail.mud.yahoo.com>

I had to implement something like this for WHIFF.

I think path dispatch considerations do not belong at the level
of the WSGI spec. Higher level layers should worry about
exactly how the URL gets dispatched within the application.

The higher layers can add environment entries as needed,
like "whiff.entry_point" and "whiff.template_path" etc.

Or maybe I misunderstand something.

-- Aaron Watters

From me at gustavonarea.net Thu Jan 7 20:49:09 2010
From: me at gustavonarea.net (Gustavo Narea)
Date: Thu, 7 Jan 2010 19:49:09 +0000
Subject: [Web-SIG] wsgiorg.routing_path addition to the
 wsgiorg.routing_args Specification
In-Reply-To: <384555.32480.qm@web32008.mail.mud.yahoo.com>
References: <384555.32480.qm@web32008.mail.mud.yahoo.com>
Message-ID: <201001071949.09150.me@gustavonarea.net>

Hello, Aaron.

Aaron said:
> I think path dispatch considerations do not belong at the level
> of the WSGI spec.
> Higher level layers should worry about
> exactly how the URL gets dispatched within the application.

I agree -- that's why I believe the wsgiorg.routing_args specification
is the right place.

> The higher layers can add environment entries as needed,
> like "whiff.entry_point" and "whiff.template_path" etc.

Right, but then why not make it standard, since it could be used by
3rd party libraries or your own application?

What I am suggesting is a path which represents, hierarchically, the
location of the current request within the application. PATH_INFO can't
be used because it's potentially polluted with arguments.

Having this location can be useful in many situations. In addition to
the examples I mentioned earlier, imagine a navigation system library:
it would be able to tell where in the "navigation tree" you are right
now, so you can generate breadcrumbs or just highlight the current
link, among other things.

This doesn't have to be framework-specific. And the dispatcher is the
best component in the application to handle it.

We could be a bit less strict about where to get the path from. Instead
of taking the PATH_INFO and removing the arguments, Routes, for
example, could merge the `controller' and `action' variables into a
path string (e.g., "/controller1/subcontroller/action").

Cheers.
--
Gustavo Narea .
| Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about |

From bchesneau at gmail.com Thu Jan 21 16:13:38 2010
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Thu, 21 Jan 2010 16:13:38 +0100
Subject: [Web-SIG] Fwd: gunicorn 0.1 - new WSGI HTTP Server
In-Reply-To:
References:
Message-ID:

---------- Forwarded message ----------
From: Benoit Chesneau
Date: Thu, Jan 21, 2010 at 4:12 PM
Subject: Re: [Web-SIG] gunicorn 0.1 - new WSGI HTTP Server
To: Tres Seaver

On Wed, Jan 6, 2010 at 2:12 AM, Tres Seaver wrote:
> Interesting: how are you detecting slow clients in production, given
> that the WSGI server itself is only supposed to be used for fast ones?
> I'm assuming that there must be some kind of heuristic-applying proxy
> you run in front of unicorn/gunicorn. Or do you just not bother, and
> let slow clients see failed responses?
>
>
> Tres.

Sorry for the delay in answering. Gunicorn is intended to be used
behind a caching upstream proxy like nginx, so it is the responsibility
of nginx (or a similar proxy) to manage slow connections, which it does
perfectly. In this sense it works like its alter ego, Unicorn, on Ruby.

Hope it helps.

- benoît

From manlio_perillo at libero.it Tue Jan 26 13:22:01 2010
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 26 Jan 2010 13:22:01 +0100
Subject: [Web-SIG] host_name and request_uri_path
Message-ID: <4B5EDE69.1010504@libero.it>

Hi.

Recently I have implemented these two functions:
http://paste.pocoo.org/show/170198/

I would like to know whether it is worth having them as separate
functions, or whether there is a better method to get the host name and
the request URI path.

About the host_name function, what is the reason why it is not included
in wsgiref?


Thanks

Manlio
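
The pasted functions are not reproduced in this archive. As a point of
comparison only, here is a minimal sketch of what two helpers with
these names might look like if they follow the URL reconstruction
recipe in PEP 333. The bodies below are assumptions, not the code from
the paste, and the sample environ values are made up.

```python
from urllib.parse import quote

def host_name(environ):
    """Return the host the client addressed (assumed behaviour):
    the Host header if present, else SERVER_NAME plus a
    non-default SERVER_PORT, as in PEP 333's recipe."""
    host = environ.get("HTTP_HOST")
    if host:
        return host
    host = environ["SERVER_NAME"]
    port = environ["SERVER_PORT"]
    default_port = "443" if environ["wsgi.url_scheme"] == "https" else "80"
    if port == default_port:
        return host
    return "%s:%s" % (host, port)

def request_uri_path(environ):
    """Return the quoted path of the request URI (assumed behaviour):
    SCRIPT_NAME followed by PATH_INFO, without the query string."""
    return (quote(environ.get("SCRIPT_NAME", ""))
            + quote(environ.get("PATH_INFO", "")))

# Made-up environ for illustration.
sample_environ = {
    "wsgi.url_scheme": "http",
    "SERVER_NAME": "example.org",
    "SERVER_PORT": "8080",
    "SCRIPT_NAME": "/app",
    "PATH_INFO": "/a b",
}
sample_host = host_name(sample_environ)
sample_path = request_uri_path(sample_environ)
```

For full URLs, wsgiref.util already offers request_uri() and
application_uri(); the sketch only isolates the two pieces asked about
here.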