From faassen at startifact.com Wed Sep 3 12:37:15 2014
From: faassen at startifact.com (Martijn Faassen)
Date: Wed, 3 Sep 2014 12:37:15 +0200
Subject: [Web-SIG] Morepath 0.5.1 released
Message-ID:

Hi there,

I thought I'd also send an announcement about Morepath to the Python web sig.

Morepath is a Python web framework geared for creating modern rich-client web applications. It uses routing to models, which allows for easy link generation and greater code reuse. It also has features for inheriting and composing applications. Morepath is a micro-framework in that it's not a lot of lines of code, but it packs a lot of power in a small package.

http://blog.startifact.com/posts/morepath-051-and-friends-released.html

Morepath is extensively documented here:

http://morepath.readthedocs.org

Regards,

Martijn

From iwan at reahl.org Thu Sep 11 09:14:13 2014
From: iwan at reahl.org (Iwan Vosloo)
Date: Thu, 11 Sep 2014 09:14:13 +0200
Subject: [Web-SIG] Reahl 3.0.0 released
Message-ID: <54114BC5.3030605@reahl.org>

Hello,

We have released Reahl 3.0.0.

 * Features: http://www.reahl.org
 * Installation: http://www.reahl.org/docs/3.0/tutorial/gettingstarted.d.html

This release supports Python 3. In order to do that, we had to move off Elixir and now use Declarative instead. Elixir is still supported by using a mixture of 3.0.0 and 2.1.2 packages if need be. A number of other small changes also slipped in.

Please see all the details (and upgrade and migration instructions) at: http://www.reahl.org/docs/3.0/whatchanged.d.html

Reahl is a web application framework for Python programmers.

With Reahl, programming is done purely in Python, using concepts familiar from GUI programming - like reusable Widgets and Events. There's no need for a programmer to know several different languages (HTML, JavaScript, template languages, etc.) or to keep up with the tricks of these trades.
The abstractions presented by Reahl relieve the programmer from the burden of dealing with the annoying problems of the web: security, accessibility, progressive enhancement (or graceful degradation) and browser quirks.

Although a Reahl program benefits from having JavaScript available, it functions in the absence of JavaScript too. Search engines can crawl a Reahl program, and its pages can be bookmarked by browsers.

Regards
- Iwan

--
Reahl, the Python only web framework: http://www.reahl.org

From cmawebsite at gmail.com Wed Sep 10 20:41:45 2014
From: cmawebsite at gmail.com (Collin Anderson)
Date: Wed, 10 Sep 2014 14:41:45 -0400
Subject: [Web-SIG] REMOTE_ADDR and proxys
Message-ID:

Hi All,

The CGI spec says:

    Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST meta-variables (see sections 4.1.8 and 4.1.9) may not identify the ultimate source of the request. They identify the client for the immediate request to the server; that client may be a proxy, gateway, or other intermediary acting on behalf of the actual source client.

However, if there is a reverse proxy on the server side (such as nginx), it seems to me that the IP address of the "immediate request to the server" will be "127.0.0.1" and the actual address will be in an "X-Forwarded-For" header.

It seems to me that it is the role of the server/gateway, not the application/framework, to determine the "correct" client IP address and to correctly account for the situation of being behind a known proxy.

Also, I am aware of the security issues of improperly handling X-Forwarded-For, but that's an issue no matter where it's being handled.

So, in the case of a reverse proxy, is it OK if the WSGI server reports a REMOTE_ADDR that isn't 127.0.0.1, even if the immediate connection to the WSGI server is local? Basically, can we interpret the "server" above to be the machine rather than the program?
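
To make the question concrete, here is a minimal sketch of what resolving the client address in a layer in front of the application might look like. It is an illustration only, not something proposed in this thread or mandated by PEP 3333: the class name `TrustedProxyMiddleware`, the `trusted_proxies` parameter and the `example.original_remote_addr` key are all hypothetical. It rewrites REMOTE_ADDR from X-Forwarded-For only when the immediate peer is an explicitly configured proxy, which is one way of addressing the security concern mentioned above.

```python
import ipaddress


class TrustedProxyMiddleware:
    """Hypothetical WSGI middleware: trust X-Forwarded-For only from known proxies."""

    def __init__(self, app, trusted_proxies=("127.0.0.1", "::1")):
        self.app = app
        # Assumption: the deployer lists proxy addresses or networks explicitly.
        self.trusted = [ipaddress.ip_network(p) for p in trusted_proxies]

    @staticmethod
    def _parse(addr):
        """Return an ip_address object, or None if addr is not a valid IP."""
        try:
            return ipaddress.ip_address(addr)
        except ValueError:
            return None

    def _is_trusted(self, addr):
        ip = self._parse(addr)
        return ip is not None and any(ip in net for net in self.trusted)

    def __call__(self, environ, start_response):
        peer = environ.get("REMOTE_ADDR", "")
        forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
        if forwarded and self._is_trusted(peer):
            client = peer
            # Walk the hops from the nearest proxy outwards and stop at the
            # first address that is not one of our own proxies.
            for hop in reversed([h.strip() for h in forwarded.split(",")]):
                if self._parse(hop) is None:
                    break  # malformed entry: keep the last good value
                client = hop
                if not self._is_trusted(hop):
                    break
            environ = dict(environ)  # work on a copy of the server's dict
            environ["example.original_remote_addr"] = peer
            environ["REMOTE_ADDR"] = client
        return self.app(environ, start_response)
```

Wrapping an application as `TrustedProxyMiddleware(app)` leaves REMOTE_ADDR untouched for direct connections, so the header cannot be used to spoof an address unless the request really arrived through one of the listed proxies. Whether this logic belongs in middleware or in the server itself is exactly the open question.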

Thanks,
Collin

From robertc at robertcollins.net Sat Sep 13 20:40:47 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Sun, 14 Sep 2014 06:40:47 +1200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
Message-ID:

So HTTP/2.0 (http://http2.github.io/http2-spec/index.html) is far advanced, and my puny google-fu cannot find any upstream work on a) updating and/or replacing WSGI to support HTTP/2's new capabilities or b) an HTTP/2 capable SimplerServer or similar reference server in the standard library. Huge apologies if I'm wrong and pointers accepted!

I did find things like https://evonove.it/blog/2012/django-jetty-spdy-blazing-fast/ which uses Jython and the Jetty web server to do SPDY (the Google experiment that has formed much of the basis of HTTP/2) or https://github.com/tatsuhiro-t/nghttp2/blob/master/python/wsgi.py which doesn't expose any of the new HTTP/2 features.

So, I'd like to kick off such work. I think the spec is sufficiently stable now that we can design APIs in Python for it with confidence; even though we may need to tweak things, it won't be disruptive.

Specific things that I think we need to cater for:
 - the streaming and multiplexing facilities (http://http2.github.io/http2-spec/index.html#rfc.section.5 and http://http2.github.io/http2-spec/index.html#FrameTypes) - this is a fairly fundamental departure from HTTP/1.x's strict 'request-response' model and exposing it should offer very nice capabilities to site authors.
   HTTP/1.x requests look like a half-closed stream on an HTTP/2 connection, but it's entirely possible via the extension mechanism to run bidirectional data on a stream initiated by either end (while the client has a single open stream the server can push a new associated stream at any point)
 - flow control (http://http2.github.io/http2-spec/index.html#fc-principles) - for file uploads for instance, we can now rate limit single clients directly within the protocol
 - the resource tree (http://http2.github.io/http2-spec/index.html#pri-depend) - if we have concurrent requests being handled for one client it is now possible to explicitly model which ones should be processed and put on the wire first, and this should flow up into the application to a degree
 - GOAWAY (http://http2.github.io/http2-spec/index.html#ConnectionErrorHandler)
 - backwards compat - making sure that straight PEP-3333 apps still work well when the server connection is HTTP/2

Is anyone interested in collaborating on an update to WSGI to support HTTP/2's new features?

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From guido at python.org Mon Sep 15 19:20:15 2014
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Sep 2014 10:20:15 -0700
Subject: [Web-SIG] Flurry of old posts appearing
Message-ID:

Today (Sept 15, 2014) I just received several posts on web-sig that were apparently first posted in May. An example is a post by mouad ben with subject "Connection close when response is ready".

Did some kind of moderator queue just get unplugged? Or should I assume these were held up by GMail?

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Mon Sep 15 20:12:32 2014
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Sep 2014 11:12:32 -0700
Subject: [Web-SIG] Flurry of old posts appearing
In-Reply-To: <32522.1410803255@parc.com>
References: <32522.1410803255@parc.com>
Message-ID:

Wow. May I suggest asking for some new moderators? I understand the need to moderate posts (to prevent spam) but this isn't exactly encouraging to new contributors to the community...

On Mon, Sep 15, 2014 at 10:47 AM, Bill Janssen wrote:

> Guido van Rossum wrote:
>
> > Did some kind of moderator queue just get unplugged?
>
> Yep. First-time posts by new members.
>
> Bill
>

--
--Guido van Rossum (python.org/~guido)

From bill at janssen.org Mon Sep 15 20:56:19 2014
From: bill at janssen.org (bill at janssen.org)
Date: Mon, 15 Sep 2014 11:56:19 -0700
Subject: [Web-SIG] more moderators for Web-SIG list?
Message-ID: <2451.1410807379@parc.com>

Anyone want to help moderate the Web-SIG list? Very low activity level. I'd just like to get some more people who can manipulate the levers.

Bill

From sven at berkvens.net Tue Sep 16 07:16:00 2014
From: sven at berkvens.net (Sven Berkvens-Matthijsse)
Date: Tue, 16 Sep 2014 07:16:00 +0200
Subject: [Web-SIG] more moderators for Web-SIG list?
In-Reply-To: <2451.1410807379@parc.com>
References: <2451.1410807379@parc.com>
Message-ID: <5417C790.5010209@berkvens.net>

Hi Bill,

Bill Janssen wrote on 2014-09-15 20:56:
> Anyone want to help moderate the Web-SIG list? Very low activity level.
> I'd just like to get some more people who can manipulate the levers.

Sure, why not...

> Bill

Sven

From cory at lukasa.co.uk Tue Sep 16 11:43:54 2014
From: cory at lukasa.co.uk (Cory Benfield)
Date: Tue, 16 Sep 2014 10:43:54 +0100
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References:
Message-ID:

On 13 September 2014 19:40, Robert Collins wrote:
> Is anyone interested in collaborating on an update to WSGI to support
> HTTP/2's new features?

I'd be happy to help. I know extremely little about WSGI (though I'm sure I can read up on it), but I'm pretty heavily involved with HTTP/2 (as you know!) so I think I can provide some value.

Cory

From graffatcolmingov at gmail.com Tue Sep 16 16:33:53 2014
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Tue, 16 Sep 2014 09:33:53 -0500
Subject: [Web-SIG] more moderators for Web-SIG list?
Message-ID:

Hey Bill,

I already moderate code-quality at python.org and would be happy to lend a hand as well.

Cheers,
Ian

From graffatcolmingov at gmail.com Tue Sep 16 16:34:41 2014
From: graffatcolmingov at gmail.com (Ian Cordasco)
Date: Tue, 16 Sep 2014 09:34:41 -0500
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
Message-ID:

I'm also up to help with this. I haven't been very involved in HTTP/2 or WSGI, but I'm happy to help with both.

From roberto at unbit.it Sat Sep 20 07:49:38 2014
From: roberto at unbit.it (Roberto De Ioris)
Date: Sat, 20 Sep 2014 07:49:38 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References:
Message-ID: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it>

> So HTTP/2.0 (http://http2.github.io/http2-spec/index.html) is far
> advanced, and my puny google-fu cannot find any upstream work on
> making a) updating and or replacing WSGI to support HTTP/2's new
> capabilities or b) an HTTP/2 capable SimplerServer or similar
> reference server in the standard library . Huge apologies if I'm wrong
> and pointers accepted!
>
> I did find things like
> https://evonove.it/blog/2012/django-jetty-spdy-blazing-fast/ which
> uses Jython and the Jetty web server to do SPDY (the Google experiment
> that has formed much of the basis of HTTP/2) or
> https://github.com/tatsuhiro-t/nghttp2/blob/master/python/wsgi.py
> which doesn't expose any of the new HTTP/2 features.
>
> So, I'd like to kick off such work, I think the spec is sufficiently
> stable now that we can design APIs in Python for it with confidence,
> even though we may need to tweak things it won't be disruptive.
>
> Specific things that I think we need to cater for:
> - the streaming and multiplexing facilities
> (http://http2.github.io/http2-spec/index.html#rfc.section.5 and
> http://http2.github.io/http2-spec/index.html#FrameTypes) - this is a
> fairly fundamental departure from HTTP/1.x's strict 'request-response'
> model and exposing it should offer very nice capabilities to site
> authors. HTTP1.x requests look like a half-closed stream on an HTTP/2
> connection, but it's entirely possible via the extension mechanism to
> run bidirectional data on a stream initiated by either end (while the
> client has a single open stream the server can push a new associated
> stream at any point)
> - flow control
> (http://http2.github.io/http2-spec/index.html#fc-principles) - for
> file uploads for instance, we can now rate limit single clients
> directly within the protocol
> - the resource tree
> (http://http2.github.io/http2-spec/index.html#pri-depend) - if we have
> concurrent requests being handled for one client it is now possible to
> explicitly model which ones should be processed and put on the wire
> first, and this should flow up into the application to a degree
> - GOAWAY
> (http://http2.github.io/http2-spec/index.html#ConnectionErrorHandler)
> - backwards compat - making sure that straight PEP-3333 apps still
> work well when the server connection is HTTP/2
>
> Is anyone interested in collaborating on an update to WSGI to support
> HTTP/2's new features?
>
> -Rob
>

I can help a bit (I am the uWSGI lead developer and an nginx and Cherokee contributor, and I already implemented a spdy3 server last year).

I honestly think that WSGI by itself needs a complete rewrite/rethink to be adapted to modern (ok, someone could say 'fashionable') patterns (that are somewhat more 'urgent' than HTTP/2), but I agree that starting to think about HTTP/2 could be a good thing.

--
Roberto De Ioris
http://unbit.it

From graham.dumpleton at gmail.com Sat Sep 20 08:31:21 2014
From: graham.dumpleton at gmail.com (Graham Dumpleton)
Date: Sat, 20 Sep 2014 16:31:21 +1000
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it>
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it>
Message-ID: <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>

On 20/09/2014, at 3:49 PM, Roberto De Ioris wrote:

> I can help a bit (i am the uWSGI lead developer and a nginx and Cherokee
> contributor, and i have already implemented a spdy3 server last year)
>
> I honestly think that WSGI by itself needs a complete rewrite/rethink to
> be adapted to modern (ok someone could say 'fashioned') patterns (that are
> somewhat more 'urgent' than HTTP/2), but i agree that starting thinking
> about HTTP/2 could be a good thing.

I agree.

Overhauling WSGI has more relevance because an underlying web server updating itself to support HTTP 2.0 will in the main have little relevance at the application layer, as the web server is more than likely to have an adapter layer which makes things look the same to existing modules/protocol adapters.

In other words, Apache adding support for HTTP 2.0 isn't going to result in some sort of wholesale change of the Apache module interface; it would stay the same whether or not HTTP 2.0 is used, especially just as an alternate way of doing the same thing as HTTP 1.1.
In that respect, since no HTTP 2.0 specific functionality is going to be made visible through existing interfaces, Apache modules or adapters for FASTCGI/SCGI etc or even mod_wsgi are simply not going to change.

So, overhaul WSGI as the primary aim, but within that factor in things to allow for HTTP 2.0 functionality.

The problem with trying to overhaul WSGI is that if it is done in an open forum like the Web-SIG it will die of a thousand cuts, as past efforts to update it in even minor ways have suffered.

The only way that WSGI itself will ever see an overhaul is through the strong willed determination of a few people off list, out of sight, to allow it to be fully fleshed out, with input coming from direct consultation with and review by other related parties who have a vested interest or significant experience in the area.

I may be up for such an off list effort, but be warned I may want to run roughshod over it and exert quite a lot of influence over the process and outcome. :-)

Graham

From bchesneau at gmail.com Sat Sep 20 09:14:13 2014
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Sat, 20 Sep 2014 09:14:13 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To: <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

Hi,

I would prefer to have this work done transparently. If we do it rationally it could work imo.

Anyway, before thinking about changing the protocol or criticizing it, maybe we could first collect the requirements of HTTP 2 (streams and such) so we can think about possible implementations, and see what is missing in WSGI.

I am thinking we could adopt the same path that is used on the client side to decide between HTTP 1.x and HTTP 2, i.e. keep WSGI and PEP 3333 for HTTP 1.1 applications and go for a new interface for HTTP 2.
But such a decision should be made once we have a clear view of what HTTP 2 requires and how it can be handled on the Python side.

Thoughts?

- benoit

On Sat, Sep 20, 2014 at 8:31 AM, Graham Dumpleton <graham.dumpleton at gmail.com> wrote:

>
> On 20/09/2014, at 3:49 PM, Roberto De Ioris wrote:
>
> > I can help a bit (i am the uWSGI lead developer and a nginx and Cherokee
> > contributor, and i have already implemented a spdy3 server last year)
> >
> > I honestly think that WSGI by itself needs a complete rewrite/rethink to
> > be adapted to modern (ok someone could say 'fashioned') patterns (that are
> > somewhat more 'urgent' than HTTP/2), but i agree that starting thinking
> > about HTTP/2 could be a good thing.
>
> I agree.
>
> Overhauling WSGI has more relevance because an underlying web server
> updating itself to support HTTP 2.0 will in the main have little relevance
> at the application layer as the web server is more than likely to have an
> adapter layer which makes things look the same to existing modules/protocol
> adapters.
>
> In other words, Apache adding support for HTTP 2.0 isn't going to result
> in some sort of wholesale change of the Apache module interface, it would
> stay the same whether or not HTTP 2.0 is used, especially just as an alternate
> way of doing the same thing as HTTP 1.1. In that respect, since no HTTP 2.0
> specific functionality is going to be made visible through existing
> interfaces, then Apache modules or adapters for FASTCGI/SCGI etc or even
> mod_wsgi are simply not going to change.
>
> So, overhaul WSGI as the primary aim, but within that factor in things to
> allow for HTTP 2.0 functionality.
>
> The problem with trying to overhaul WSGI is that if it is done in an open
> forum like the Web-SIG it will die of a thousand cuts, as past efforts to
> update it in even minor ways have suffered.
>
> The only way that WSGI itself will ever see an overhaul is through the
> strong willed determination of a few people off list, out of sight, to
> allow it to be fully fleshed out, with input coming from direct
> consultation with and review by other related parties who have a vested
> interest or significant experience in the area.
>
> I may be up for such an off list effort, but be warned I may want to run
> roughshod over it and exert quite a lot of influence over the process and
> outcome. :-)
>
> Graham
>

From robertc at robertcollins.net Sat Sep 20 09:23:40 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Sat, 20 Sep 2014 19:23:40 +1200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To: <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

On 20 September 2014 18:31, Graham Dumpleton wrote:
>
> On 20/09/2014, at 3:49 PM, Roberto De Ioris wrote:
>
>> I can help a bit (i am the uWSGI lead developer and a nginx and Cherokee
>> contributor, and i have already implemented a spdy3 server last year)
>>
>> I honestly think that WSGI by itself needs a complete rewrite/rethink to
>> be adapted to modern (ok someone could say 'fashioned') patterns (that are
>> somewhat more 'urgent' than HTTP/2), but i agree that starting thinking
>> about HTTP/2 could be a good thing.
>
> I agree.
>
> Overhauling WSGI has more relevance because an underlying web server updating itself to support HTTP 2.0 will in the main have little relevance at the application layer as the web server is more than likely to have an adapter layer which makes things look the same to existing modules/protocol adapters.

I don't particularly care what we call it - extending/overhauling/replacing. Let's call the new API WSGI2 as a shorthand for 'WSGI extended or overhauled to support things that are way overdue'.

I think the key things we need are:
 - support HTTP/2
 - still support HTTP/1.x
   - This is reasonable because many environments will want a single server accepting both HTTP/1.x and 2 at once with graceful degradation
 - support websockets
 - be backportable to Python 2.7+
 - be adoptable by uWSGI/mod_wsgi/Werkzeug/gunicorn/Twisted's thread-pool adapter etc
 - be approximately as easy to write middleware and glue for as WSGI has been
 - have a sane migration path for servers, middleware and apps: a big-bang approach isn't going to fly with the huge install base of WSGI
   - be able to write both forward and backwards compatibility shims: that is,
     a) be able to write an adapter that exposes WSGI2 on the top and pure, compatible WSGI on the bottom
     b) be able to write an adapter that exposes WSGI on the top and a restricted WSGI2 on the bottom.
     (Where graceful degradation means behaving the same as if an HTTP/1.x connection was being handled).

> In other words, Apache adding support for HTTP 2.0 isn't going to result in some sort of wholesale change of the Apache module interface, it would stay the same whether or not HTTP 2.0 is used, especially just as an alternate way of doing the same thing as HTTP 1.1. In that respect, since no HTTP 2.0 specific functionality is going to be made visible through existing interfaces, then Apache modules or adapters for FASTCGI/SCGI etc or even mod_wsgi are simply not going to change.

The Apache module interface is a lot wider than WSGI today, so I'm glad to hear you believe it won't need changing! WSGI has no equivalent for the CONNECTION filter, for instance (and arguably shouldn't because that's very much a server responsibility) - but we need something to hack websockets and push responses on.

> So, overhaul WSGI as the primary aim, but within that factor in things to allow for HTTP 2.0 functionality.

I consider HTTP/2 one of potentially several primary aims: I'm delighted to collaborate on overhauling WSGI, but if we overhaul it and don't *deliver* HTTP/2, then I think that's going to be bad - there is every opportunity for us to miss a fine detail and end up having to rev our spec as soon as implementors find issues.

> The problem with trying to overhaul WSGI is that if it is done in an open forum like the Web-SIG it will die of a thousand cuts, as past efforts to update it in even minor ways have suffered.

Well, that's certainly a challenge :). What's the governance model here? Is a PEP appropriate, and if so - that gives us a BDFL or BDFL PEP-delegate to decide between bikeshed issues; and if it's not a bikeshed issue then resolving it is actually necessary.

> The only way that WSGI itself will ever see an overhaul is through the strong willed determination of a few people off list, out of sight, to allow it to be fully fleshed out, with input coming from direct consultation with and review by other related parties who have a vested interest or significant experience in the area.
>
> I may be up for such an off list effort, but be warned I may want to run roughshod over it and exert quite a lot of influence over the process and outcome.
> :-)

I will happily discuss stuff with you off-list, but I'm not particularly interested in having the primary effort be cabal style - HTTP/2 has managed to go through a much harder rev with very strong personalities and much the same sort of death possible as you're concerned about here with great transparency.

I don't think that's incompatible with your needs though - for instance, if you want to stay offlist and debate privately to avoid 1000-cut-pain: that's fine, but I reserve the right to summarise and discuss things here (and equally to ignore kibitzing here that isn't being productive).

As far as qualifications, I've a long history with HTTP (about 14 years now - I joined the Squid project circa 2000), applications that will benefit from it (horizon, the OpenStack APIs and lmirror), varied experience deploying WSGI and WSGI derived things, and writing code both inside frameworks like Django and Zope3, as well as straight to WSGI.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From robertc at robertcollins.net Sat Sep 20 09:53:25 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Sat, 20 Sep 2014 19:53:25 +1200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

On 20 September 2014 19:14, Benoit Chesneau wrote:
> Hi,
>
> I would prefer to have this work being done transparently. If we do it
> rationally it could work imo.
>
> Anyway before thinking to change the protocol or criticizing it maybe we
> could first collect the requirements in HTTP 2 (stream and such) so we can
> think about possible implementations. And see what it misses in WSGI.
>
> I am thinking we could adopt the same path used to decided to go for HTTP
> 1.x or HTTP 2 on the client part. Ie keeping WSGI and PEP 3333 for HTTP 1.1
> applications and go for a new interface in HTTP2.
> But such decision should
> be done once we have a clear view of what requires HTTP 2 and how it can be
> handled on the python side.
>
> Thoughts?

+1 on transparency.

Agree that before we consider what we need to change we need to set our goals up - that's basically the charter for a PEP. Here is a straw man in prose form:

We want to create a clean common API for applications and middleware written in a post HTTP/2 world - where single servers may accept up to all three of HTTP/1.x, HTTP/2 and Websocket connections, and applications and middleware want to be able to take advantage of HTTP/2 and websockets when available, but also degrade gracefully. We also want to ensure that there is a graceful incremental path to adoption of the new API, including Python 2.7 support, and shims to enable existing WSGI apps/middleware/servers to respectively be contained, contain-or-be-contained and contain, things written to this new API. We want a clean, fast and approachable API, and we want to ensure that it's no less friendly to work with than WSGI, for all that it will expose much more functionality.

-Rob

From cory at lukasa.co.uk Sat Sep 20 12:54:41 2014
From: cory at lukasa.co.uk (Cory Benfield)
Date: Sat, 20 Sep 2014 11:54:41 +0100
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

On 20 September 2014 08:23, Robert Collins wrote:
> I will happily discuss stuff with you off-list, but I'm not
> particularly interested in having the primary effort be cabal style -
> HTTP/2 has managed to go through a much harder rev with very strong
> personalities and much the same sort of death possible as you're
> concerned about here with great transparency.

That's true...but it has been extremely painful for all concerned.
There is minimal appetite left in the WG to continue with the work, and a number of people quite want to put a pin in things to just give themselves a break. I suspect this is what Graham is worried about, and as I recall he is speaking from bitter experience.

> I don't think that's incompatible with your needs though - for
> instance, if you want to stay offlist and debate privately to avoid
> 1000-cut-pain : that's fine, but I reserve the right to summarise and
> discuss things here (and equally to ignore kibitzing here that isn't
> being productive).

I'd like to propose instead something of a third way, inspired by the HTTPBis. A small cabal, off-list, comes up with an initial proposal that is essentially complete (a la SPDY). That proposal should ideally have running code behind it so that it can be played with (also like SPDY). That proposal is then brought to the list for further refinement.

This allows Graham to input early, and ensures open review of the proposal while those who don't want to participate in the fully-open forum can absent themselves from the discussion.

Of course, I'd like to help regardless of the actual procedure we use.

Cory

From dirkjan at ochtman.nl Sat Sep 20 15:22:22 2014
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Sat, 20 Sep 2014 15:22:22 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

On Sat, Sep 20, 2014 at 9:23 AM, Robert Collins wrote:
> Well, that's certainly a challenge :). What's the governance model here?
> Is a PEP appropriate, and if so - that gives us a BDFL or BDFL
> PEP-delegate to decide between bikeshed issues; and if it's not a
> bikeshed issue then resolving it is actually necessary.

Yes, I think a good way forward would be to have a small cabal write a PEP and then announce it here for further feedback and then pronouncement by a BDFL or -delegate.
If you want to be lead/editor, that sounds great. It also seems like you should definitely involve Graham and give credence to his thoughts.

I'd be excited about this and happy to give feedback a little later once you've got some initial draft, as someone who likes to implement his applications directly on top of WSGI for now (but I've also implemented a couple of WebSocket servers).

Cheers,

Dirkjan

From bchesneau at gmail.com Sat Sep 20 16:17:32 2014
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Sat, 20 Sep 2014 16:17:32 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

On Sat, Sep 20, 2014 at 3:22 PM, Dirkjan Ochtman wrote:

> On Sat, Sep 20, 2014 at 9:23 AM, Robert Collins wrote:
> > Well, that's certainly a challenge :). What's the governance model here?
> > Is a PEP appropriate, and if so - that gives us a BDFL or BDFL
> > PEP-delegate to decide between bikeshed issues; and if it's not a
> > bikeshed issue then resolving it is actually necessary.
>
> Yes, I think a good way forward would be to have a small cabal write a
> PEP and then announce it here for further feedback and then
> pronouncement by a BDFL or -delegate. If you want to be lead/editor,
> that sounds great. It also seems like you should definitely involve
> Graham and give credence to his thoughts.
>
> I'd be excited about this and happy to give feedback a little later
> once you've got some initial draft, as someone who likes to implement
> his applications directly on top of WSGI for now (but I've also
> implemented a couple of WebSocket servers).
>
> Cheers,
>
> Dirkjan
>

Last time, it was more that everyone wanted to discuss the changes with their own requirement list, and those lists already conflicted. Discussing whether it should be done outside the medium designed for it is already off topic imo, so I won't discuss it further.

Instead I wonder what the appropriate medium is to collect requirements and other material about it. A wiki? Anything else?

For a start I see these different topics:

1) HTTP 1.1 vs HTTP 2:

- HTTP 1.1 and HTTP 2 have much the same high-level syntax (methods, URIs, headers, ...) but the way the data is transported differs (data is sent in frames in HTTP 2).
- In HTTP 2, data can be encrypted and compressed.
- In HTTP 2, data can be pushed from the server to the clients. More data can be sent to the client.
- In HTTP 2, streams are multiplexed. We have the concept of data channels, and these are more like message passing. Multiplexing existed in HTTP 1.1 with pipelines but is barely supported right now by WSGI servers.

The concept of data channels and the PUSH feature will require more concurrency at the server level.

At the application level, things don't change that much. Everything can appear like before. The only change is the PUSH feature.

2) Websockets, SSE and other similar protocols are completely asynchronous. This part is not really handled by WSGI. The way it is generally implemented right now is awkward: the server generally extends the WSGI protocol so that the application gets the socket, and then a specific library handles the rest.

I actually wonder if websockets or other asynchronous protocols should be handled by the new WSGI spec. Shouldn't we just standardize the way the socket is given to another library?

Anyway, I think we should collect all requirements at the application and server level, then start to compare the current WSGI spec against them, and iterate. Thoughts? Any other topics?

- benoit

From bchesneau at gmail.com Sat Sep 20 16:31:12 2014
From: bchesneau at gmail.com (Benoit Chesneau)
Date: Sat, 20 Sep 2014 16:31:12 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To: References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> Message-ID: got an idea. What about having a page collecting feedback from anyone in the python community about this topic. So we can have true data from different perspectives: developer, library/framework author, server author. I'm OK to collect the data from it and make a summary of it once it's done. The form it could take should be discussed first but imo that a good way to engage the community. What do you think? On Sat, Sep 20, 2014 at 4:17 PM, Benoit Chesneau wrote: > > > On Sat, Sep 20, 2014 at 3:22 PM, Dirkjan Ochtman > wrote: > >> On Sat, Sep 20, 2014 at 9:23 AM, Robert Collins >> wrote: >> > Well, thats certainly a challenge :). Whats the governance model here? >> > Is a PEP appropriate, and if so - that gives us a BFDL or BFDL >> > PEP-delegate to decide between bikeshed issues; and if its not a >> > bikeshed issue then resolving it is actually necessary. >> >> Yes, I think a good way forward would be to have a small cabal write a >> PEP and then announce it here for further feedback and then >> pronouncement by a BDFL or -delegate. If you want to be lead/editor, >> that sounds great. It also seems like you should definitely involve >> Graham and give credence to his thoughts. >> >> I'd be excited about this and happy to give feedback a little later >> once you've got some initial draft, as someone who likes to implement >> his applications directly on top of WSGI for now (but I've also >> implemented a couple of WebSocket servers). >> >> Cheers, >> >> Dirkjan >> >> > Last time was more about everyone wanted to discuss about the changes with > its own requirement list. Which is already conflicting. Discussing if it > should be done outside the media designed for it is already out of topic > imo. So I won't discuss about it further. 
> > Instead I wonder what is the appropriate medium to collect requirements > and others stuffs about it. Wiki ? Anything else? > > For a start I see these different topics > > > 1) HTTP 1.1 vs HTTP 2: > > - HTTP 1.1 and HTTP2 have quite the same high level syntax (methods, uri, > headers, ...) but the way the data is transported differs. (data are sent > by frames in HTTP 2). > - in HTTP 2, data can be encrypted and compressed. > - in HTTP2 data can pushed from the server to the clients. More data can > be sent to the client > - in HTTP2 streams are multiplexed We have the concept of data channels > and these are more like message passings. Multiplexing existed in HTTP 1.1 > with pipelines but is barely supported right now by WSGI servers. > > > The concept of data channels and the PUSH features will requires more > concurrency at the server level. > > At the application, things doesn't change that much. Everything can appear > like before. The only change is the PUSH feature. > > > 2) Websockets, SSE and other similar protocoles are completely > asynchronous. All this part is not really handled by WSGI. The way it is > generally implemented right now is awkward. The server generally extend the > WSGI protocol so the application get the socket. Then a specific library > handle the rest. > > > I actually wonder if websockets or other asynchronous protocols should be > handled by the new WSGI SPEC. Shouldn't we just standardize the way the > socket is given to another library? > > > Anyway I think we should collect all requirements at application and > server level and then start to confront the current WSGI spec to them. And > iterate. Thoughts? Any other topic? > > - benoit > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cory at lukasa.co.uk Sat Sep 20 17:17:30 2014 From: cory at lukasa.co.uk (Cory Benfield) Date: Sat, 20 Sep 2014 16:17:30 +0100 Subject: [Web-SIG] WSGI for HTTP/2.0 ? 
In-Reply-To: References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> Message-ID: On 20 September 2014 15:17, Benoit Chesneau wrote: > 1) HTTP 1.1 vs HTTP 2: > > - HTTP 1.1 and HTTP2 have quite the same high level syntax (methods, uri, > headers, ...) but the way the data is transported differs. (data are sent by > frames in HTTP 2). Yes, this is correct. *In principle*, much of the way WSGI transmits information can remain exactly the same, at least from the perspective of working with HTTP/2. > - in HTTP 2, data can be encrypted and compressed. Not by default. HTTP/2 DATA frame compression got removed from the draft spec in draft-13. It's currently available in a draft extension, but it'll only ever be an extension to HTTP/2. HTTP/2 encryption is just the same as for HTTP/1.1: TLS. > At the application, things doesn't change that much. Everything can appear > like before. The only change is the PUSH feature. Server Push is important, but I think you've missed some really key points in HTTP/2 that are potentially valuable to expose at the application level. Firstly, HPACK provides special provision for marking some headers as 'never index'[0]. This is for security reasons, and is intended to signal that no-one should keep that header value in their header tables. We may well want to expose this functionality. Secondly, HTTP/2 DATA frames can be padded. Assuming padding remains in the spec (not guaranteed), this is another security feature that we may want to expose to the application. (Exposing this to application is kinda stupid, but we can't leave it to the server because it won't know what to pad and what not to pad.) Thirdly, we need to remember that HTTP/2 streams are flow controlled. This requires the design to very carefully consider how a response blocked by flow control behaves. Fourthly, the multiplexing is *prioritised*. 
This priority information may need to be accessible to the application in order to make decisions based on it.

Fifthly, while HTTP/2 is *able* to handle the standard HTTP/1.1 request-response cycle, it needn't be *limited* to it. In particular, long-polling works a whole lot better in HTTP/2 because stream lifetime is potentially unlimited. Similarly, because streams are bidirectional it may become popular to use HTTP/2 streams as ad-hoc websocket connections. These all suggest that we shouldn't necessarily cleave too closely to the current WSGI paradigm.

I'm sure I've missed some other things as well. What I wanted to highlight is that HTTP/2 is a subtle, complex protocol that is much more powerful than the one it replaces. We should very carefully consider how we approach a new WSGI specification, because we're going to be stuck with it for the next few years.

I do think the idea of collating feedback is a good one, however.

[0]: https://tools.ietf.org/html/draft-ietf-httpbis-header-compression-09#section-7.2.3

From sh at defuze.org Sat Sep 20 18:27:51 2014
From: sh at defuze.org (Sylvain Hellegouarch)
Date: Sat, 20 Sep 2014 18:27:51 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID:

Hi Benoit,

> I actually wonder if websockets or other asynchronous protocols should be
> handled by the new WSGI SPEC. Shouldn't we just standardize the way the
> socket is given to another library?
>

Considering the websocket connection is initiated via an HTTP request, it would be a good idea for the newer WSGI interface to keep it in mind even if it doesn't address the protocol itself. As you say, the current state is awkward at best.

--
- Sylvain
http://www.defuze.org
http://twitter.com/lawouach

From randy at thesyrings.us Sat Sep 20 20:15:23 2014
From: randy at thesyrings.us (Randy Syring)
Date: Sat, 20 Sep 2014 14:15:23 -0400
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com>
Message-ID: <541DC43B.7060208@thesyrings.us>

On 09/20/2014 02:31 AM, Graham Dumpleton wrote:

> The problem with trying to overhaul WSGI is that if it is done in an open forum like the Web-SIG it will die of a thousand cuts, as past efforts to update it in even minor ways have suffered.
>
> The only way that WSGI itself will ever see an overhaul is through the strong willed determination of a few people off list, out of sight, to allow it to be fully fleshed out, with input coming from direct consultation with and review by other related parties who have a vested interest or significant experience in the area.
>
> I may be up for such an off list effort, but be warned I may want to run roughshod over it and exert quite a lot of influence over the process and outcome.

I'm no one important in the Python world, but, FWIW, I agree with you. I've followed your work over the years and believe you have a penchant for details and accuracy as evidenced by your comments here on the list and your work on mod_wsgi and wrapt. I'd be very interested in seeing what you could come up with.

IMO, if you are up for it, you should feel free to grab a few people that you would like to work with and hammer out a PEP (or its precursor). Then, let the PEP process work as it's intended to. Hopefully, this method results in a trend towards more concrete and specific arguments and is less likely to "die of a thousand cuts." I'm going to refer to this group as the "draft team."

On 20 September 2014 19:14, Benoit Chesneau wrote:

> I would prefer to have this work being done transparently. If we do it
> rationally it could work imo.

I don't think anyone is arguing against transparency.
But momentum matters and, in the history of changes to the WSGI spec, momentum has died pretty easily even when there were clearly changes that needed to be made. If Graham, or anyone else for that matter, has the gumption to go at this thing hard and get something written down, I think that should be encouraged. Even if the initial phases of that process are behind closed doors, transparency will come eventually and there will be opportunity for comment. But if you make the process too transparent too early, the energy used to keep up with everyone and all the different needs can take away from doing the actual work of defining the spec.

> got an idea. What about having a page collecting feedback from anyone
> in the python community about this topic. So we can have true data
> from different perspectives: developer, library/framework author,
> server author. I'm OK to collect the data from it and make a summary
> of it once it's done.

This seems reasonable. That way, interested parties could get their comments "on record" without the draft team needing to feel like they have to satisfy or even have a discussion about every comment.

>
> The form it could take should be discussed first but imo that a good
> way to engage the community. What do you think?

I'd suggest a "wsgi comments" github repo. Workflow:

  * Submit a document to the repo with your comments on the future version of WSGI
      o use any readable format you want (Markdown, RST, plain text, etc.).
      o include name, contact information, background. Make sure to give enough info about your background so the draft team has some context for the proposals and comments you are making.
  * Any desired discussion by interested parties can be had on the pull request page (or here I guess, but that might be noisy)
  * The author can update the pull request if desired based on discussion
  * pull requests are automatically accepted after some time period (1 week?) of no further comments
      o the delay in acceptance is to give time for discussion and updates to the PR
      o a PR merge does not indicate that the idea will be accepted into the WSGI PEP, it's just being merged into the comments repo
  * an individual should only update their own document, no PRs against someone else's document.
      o comments/discussions should go on the PR

Just my $0.02.

*Randy Syring*
Husband | Father | Redeemed Sinner

/"For what does it profit a man to gain the whole world and forfeit his soul?" (Mark 8:36 ESV)/

From robertc at robertcollins.net Sun Sep 21 00:19:49 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Sun, 21 Sep 2014 10:19:49 +1200
Subject: [Web-SIG] web-sig mailing list moderating every post?
Message-ID:

I'm not sure of the right place to bring this up - I tried to on the web-sig list itself, but the moderator rejected the post.

What I tried to post there was

"""Looks like *every* post to web-sig gets manually moderated. That seems like it will make discussion rather hard: can we get that changed (or is there some historical need for it - if so, perhaps we should use python-dev or some other list) ?"""

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From robertc at robertcollins.net Sun Sep 21 00:21:47 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Sun, 21 Sep 2014 10:21:47 +1200
Subject: [Web-SIG] web-sig mailing list moderating every post?
In-Reply-To:
References:
Message-ID:

Ugh - this was in my mailbox shortly after the moderator action email from mailman:

"No, this looks like the spam filter. Don't know what triggered it. Or why it went to you. But the list moderation is turned off (except for non-members posting to the list), and you yourself are not moderated, so... Bill"

- nothing to see here, move right along, and sorry for the noise.
-Rob On 21 September 2014 10:19, Robert Collins wrote: > I'm not sure of the right place to bring this up - I tried to on the > web-sig list itself, but the moderator rejected the post. > > What I tried to post there was > > """Looks like *every* post to web-sig gets manually moderated. That seems > like it will make discussion rather hard: can we get that changed (or > is there some historical need for it - if so, perhaps we should use > python-dev or some other list) ?""" > > -Rob > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Sun Sep 21 00:43:42 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sun, 21 Sep 2014 10:43:42 +1200 Subject: [Web-SIG] WSGI for HTTP/2.0 ? In-Reply-To: <541DC43B.7060208@thesyrings.us> References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> <541DC43B.7060208@thesyrings.us> Message-ID: On 21 September 2014 06:15, Randy Syring wrote: > > I'd suggest a "wsgi comments" github repo. So in the interests of getting things done and the spirit of EAFP I've set up https://github.com/python-web-sig/wsgi-ng. Since I have no deep history in web-sig, I'll happily hand out 'organisation admin' to someone (e.g. Bill) with such history - I'm not trying to land-grab the name, just to use something sensibly named. That said I'd like to keep the direct committers to that specific repository limited to whomever manages to end up collaborating well: I have a better understanding of the burnout issue thanks to the responses in this thread. > Workflow: > > Submit a document to the repo with your comments on the future version of > WSGI > > use any readable format you want (Markdown, RST, plain text, etc.). > include name, contact information, background. 
Make sure to give enough > info about your background so the draft team has some context for the > proposals and comments you are making. I've proposed using github issues instead of documents; we can synthesis the issues into prose in the draft docs and reference code itself. I think this will be easier to manage than having a dozen different comment-documents in the repo. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Sun Sep 21 00:50:31 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sun, 21 Sep 2014 10:50:31 +1200 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: On 11 September 2014 06:41, Collin Anderson wrote: > Hi All, > > The CGI spec says: > > Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST > meta-variables (see sections 4.1.8 and 4.1.9) may not identify the > ultimate source of the request. They identify the client for the > immediate request to the server; that client may be a proxy, gateway, > or other intermediary acting on behalf of the actual source client. > > However, if the there is a revere proxy on the server side (such as > nginx), it seems to me, the ip address of the "immediate request to > the server" will be "127.0.0.1" and the actual address will be in an > "X-Forwarded-For" header. > > It seems to me, it is the role of the server/gateway, not the > application/framework to determine the "correct" client ip address and > correctly account for the situation of being behind a known proxy. > > Also, I am aware of the security issues of improperly handling > X-Forwarded-For, but that's an issue no matter where it's being > handled. > > So, in the case of a reverse proxy, is it ok if the WSGI server sends > back a REMOTE_ADDR that isn't 127.0.0.1, even if it's the immediate > connection to the WSGI server is local? > > Basically can we interpret the "server" above to be the machine rather > than the program? 
FWIW I think in the specific situation of a front-end proxy such as squid/nginx/varnish etc talking to a backend server that that server could set REMOTE_ADDR based on a mutually agreed header (such as X-Forwarded-For) without that having larger implications for WSGI in general. I'd also support having wsgiref support that as a basic deployment feature since it would be useful for microservices deploying within PAAS environments where a front-end LB of some sort is a given. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From randy at thesyrings.us Sun Sep 21 01:15:17 2014 From: randy at thesyrings.us (Randy Syring) Date: Sat, 20 Sep 2014 19:15:17 -0400 Subject: [Web-SIG] WSGI for HTTP/2.0 ? In-Reply-To: References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> <541DC43B.7060208@thesyrings.us> Message-ID: <541E0A85.4030609@thesyrings.us> On 09/20/2014 06:43 PM, Robert Collins wrote: > On 21 September 2014 06:15, Randy Syring wrote: > >> I'd suggest a "wsgi comments" github repo. > So in the interests of getting things done and the spirit of EAFP I've > set up https://github.com/python-web-sig/wsgi-ng. Thanks for taking the initiative. > >> Workflow: >> >> Submit a document to the repo with your comments on the future version of >> WSGI >> >> use any readable format you want (Markdown, RST, plain text, etc.). >> include name, contact information, background. Make sure to give enough >> info about your background so the draft team has some context for the >> proposals and comments you are making. > I've proposed using github issues instead of documents; we can > synthesis the issues into prose in the draft docs and reference code > itself. I think this will be easier to manage than having a dozen > different comment-documents in the repo. The only disadvantage here is that you have to synthesize. 
With a repo only for the comments, the only moderation that has to be done is automatic approval of a pull request after the waiting/discussion period. No synthesis is required and issues won't "hang" open and have to be managed. I was thinking the actual repo for the draft could be different from the comments repo.

So, again, just my $.02. I think, unless the "draft team" develops, this will be a moot point. If it does develop, their preference for how the comments will be handled should probably take precedence.

Regardless, thanks for taking the initiative...I'd personally rather see initiative and iteration as needed than see the momentum lag.

*Randy Syring*
Husband | Father | Redeemed Sinner

/"For what does it profit a man to gain the whole world and forfeit his soul?" (Mark 8:36 ESV)/

From roberto at unbit.it Sun Sep 21 06:43:35 2014
From: roberto at unbit.it (Roberto De Ioris)
Date: Sun, 21 Sep 2014 06:43:35 +0200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To:
References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> <541DC43B.7060208@thesyrings.us>
Message-ID: <2f776b2db06dcdb819e19166d33a7d44.squirrel@manage.unbit.it>

>
> I've proposed using github issues instead of documents; we can
> synthesis the issues into prose in the draft docs and reference code
> itself. I think this will be easier to manage than having a dozen
> different comment-documents in the repo.
>
> -Rob
>

I completely agree and I have already opened two 'issues'. If we change our minds on how to work on it, feel free to delete them :)

--
Roberto De Ioris
http://unbit.it

From robertc at robertcollins.net Sun Sep 21 14:08:34 2014
From: robertc at robertcollins.net (Robert Collins)
Date: Mon, 22 Sep 2014 00:08:34 +1200
Subject: [Web-SIG] WSGI for HTTP/2.0 ?
In-Reply-To: <2f776b2db06dcdb819e19166d33a7d44.squirrel@manage.unbit.it> References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> <541DC43B.7060208@thesyrings.us> <2f776b2db06dcdb819e19166d33a7d44.squirrel@manage.unbit.it> Message-ID: On 21 September 2014 16:43, Roberto De Ioris wrote: > >> >> I've proposed using github issues instead of documents; we can >> synthesis the issues into prose in the draft docs and reference code >> itself. I think this will be easier to manage than having a dozen >> different comment-documents in the repo. >> >> -Rob >> > > I completely agree and i have already opened two 'issues'. If we change > idea on how to work on it feel free to delete them :) Cool, thank you! I've put my thoughts up in them, and pulled out what I think are clearly sane requirements from them into a nascent requirements.rst file. I haven't closed the issues, since the actual spec covering those requirements doesn't exist. And that leads to what is I think a fairly key question. Do we: - incorporate PEP-3333 by reference [e.g. by saying 'any HTTP/1.{0,1} request will be processed as per PEP-3333'] or - do we want to alter how HTTP/1.{0,1} requests are presented (e.g. tackling encoding of headers etc) If the former, I think we have a new spec which will overlap a lot with WSGI but we can avoid talking about the war; if its the latter, I think we need a spec which is a copy-plus-adjust version of 3333. If we represent headers etc differently for HTTP/2, folk with ordinary needs will be affected - I can imagine them needing to have different codepaths for HTTP/1.1 vs HTTP/2, which would hurt. 
So I'd like to propose that we:

- focus on great HTTP/2 semantics
- have a full spec covering HTTP/1.{0,1}, HTTP/2 and websockets
- and reference PEP-3333 only in the context of compatibility / shims and the like

This isn't to say that I think we should make spurious changes to WSGI - I don't think we should; but I think we'll deliver a poor result if we have two different models that folk have to know about for common case 'I'm just answering a web request' scenarios.

-Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From alan at xhaus.com Wed Sep 24 21:16:09 2014
From: alan at xhaus.com (Alan Kennedy)
Date: Wed, 24 Sep 2014 20:16:09 +0100
Subject: [Web-SIG] REMOTE_ADDR and proxys
In-Reply-To:
References:
Message-ID:

[Collin]
> It seems to me, it is the role of the server/gateway, not the
> application/framework to determine the "correct" client ip address and
> correctly account for the situation of being behind a known proxy.

I disagree. I think it is the role of the server/gateway to represent the actual incoming HTTP request as accurately as possible.

If the application knows about remote proxies and local reverse proxies, then it can take action accordingly.

But the server should not attempt any magic: it is up to the application to interpret the request in whatever way it sees fit.

[Collin]
> Also, I am aware of the security issues of improperly handling
> X-Forwarded-For, but that's an issue no matter where it's being
> handled.

This is exactly why the server/gateway should refuse the temptation to guess. It should leave it to the application to be smart enough to handle all scenarios appropriately, knowing that it has access to the original unmodified request.

If you want the magic rewriting functionality to be isolated from the application, then it could easily be implemented as middleware.

Alan.
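
A minimal sketch of the middleware approach mentioned above, assuming the only trusted proxy is one on 127.0.0.1 that appends the address of its peer to X-Forwarded-For. The names `TrustedProxyMiddleware`, `trusted_proxies` and the `proxy.original_remote_addr` environ key are hypothetical, chosen for illustration; they are not part of WSGI or of any particular server.

```python
class TrustedProxyMiddleware:
    """Hypothetical WSGI middleware: rewrite REMOTE_ADDR from
    X-Forwarded-For, but only when the immediate peer is a trusted proxy."""

    def __init__(self, app, trusted_proxies=("127.0.0.1",)):
        self.app = app
        self.trusted = frozenset(trusted_proxies)

    def __call__(self, environ, start_response):
        peer = environ.get("REMOTE_ADDR", "")
        forwarded = environ.get("HTTP_X_FORWARDED_FOR", "")
        if peer in self.trusted and forwarded:
            # Keep the unmodified value so the application can still see it.
            environ["proxy.original_remote_addr"] = peer
            hops = [h.strip() for h in forwarded.split(",") if h.strip()]
            client = peer
            # Walk from the nearest hop outwards and stop at the first
            # address that is not one of our own proxies; anything further
            # left was supplied by the client and cannot be trusted.
            for hop in reversed(hops):
                client = hop
                if hop not in self.trusted:
                    break
            environ["REMOTE_ADDR"] = client
        return self.app(environ, start_response)
```

Wrapping an application as `TrustedProxyMiddleware(app)` leaves requests from untrusted peers untouched, so the server itself still reports the immediate connection and the rewriting stays an explicit deployment choice.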
On Wed, Sep 10, 2014 at 7:41 PM, Collin Anderson wrote: > Hi All, > > The CGI spec says: > > Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST > meta-variables (see sections 4.1.8 and 4.1.9) may not identify the > ultimate source of the request. They identify the client for the > immediate request to the server; that client may be a proxy, gateway, > or other intermediary acting on behalf of the actual source client. > > However, if the there is a revere proxy on the server side (such as > nginx), it seems to me, the ip address of the "immediate request to > the server" will be "127.0.0.1" and the actual address will be in an > "X-Forwarded-For" header. > > It seems to me, it is the role of the server/gateway, not the > application/framework to determine the "correct" client ip address and > correctly account for the situation of being behind a known proxy. > > Also, I am aware of the security issues of improperly handling > X-Forwarded-For, but that's an issue no matter where it's being > handled. > > So, in the case of a reverse proxy, is it ok if the WSGI server sends > back a REMOTE_ADDR that isn't 127.0.0.1, even if it's the immediate > connection to the WSGI server is local? > > Basically can we interpret the "server" above to be the machine rather > than the program? > > Thanks, > Collin > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: > https://mail.python.org/mailman/options/web-sig/alan%40xhaus.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robertc at robertcollins.net Wed Sep 24 22:51:28 2014 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 25 Sep 2014 08:51:28 +1200 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: On 25 September 2014 07:16, Alan Kennedy wrote: > [Collin] >> It seems to me, it is the role of the server/gateway, not the >> application/framework to determine the "correct" client ip address and >> correctly account for the situation of being behind a known proxy. > > I disagreee. I think it is the role of the server/gateway to represent the > actual incoming HTTP request as accurately as possible. So I agree with you, but in a multi-tier deployment architecture: Client -> LB -> Front-end-cache -> HTTPd ->WSGI -> application, which 'request' do app developers need represented? They want the client request, which is 3 network hops away: its entirely reasonable (and supported by RFC2616 and RFC7230 etc) for the internal structure of such a deployment to extend things in such a way that normal guarantees are suspended (e.g. caching, source addresses etc). > If the application knows about remote proxies and local reverse proxies, > then it can take action accordingly. > > But the server should not attempt any magic: it is up to the application to > interpret the request in whatever way it sees fit. ... > If want to the magic rewriting functionality to be isolated from the > application, then it could easily be implemented as middleware. So middleware is an application to the layer above and a server to the layer below: how then is that not the server taking care of the rewriting? Perhaps we're stuck on a definitional thing where by server you are thinking only the code implied by e.g. serve_forever ? -Rob From robertc at robertcollins.net Thu Sep 25 05:54:19 2014 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 25 Sep 2014 15:54:19 +1200 Subject: [Web-SIG] WSGI for HTTP/2.0 ? 
In-Reply-To: References: <01fee57de2c7a2a40d19097b52360587.squirrel@manage.unbit.it> <82184701-EA0C-4A68-BF46-E7EE4D1D0BB0@gmail.com> <541DC43B.7060208@thesyrings.us> <2f776b2db06dcdb819e19166d33a7d44.squirrel@manage.unbit.it> Message-ID: On 22 September 2014 00:08, Robert Collins wrote: > On 21 September 2014 16:43, Roberto De Ioris wrote: >> >>> >>> I've proposed using github issues instead of documents; we can >>> synthesis the issues into prose in the draft docs and reference code >>> itself. I think this will be easier to manage than having a dozen >>> different comment-documents in the repo. >>> >>> -Rob >>> >> >> I completely agree and i have already opened two 'issues'. If we change >> idea on how to work on it feel free to delete them :) > > Cool, thank you! > > I've put my thoughts up in them, and pulled out what I think are > clearly sane requirements from them into a nascent requirements.rst > file. > > I haven't closed the issues, since the actual spec covering those > requirements doesn't exist. And that leads to what is I think a fairly > key question. > > Do we: > - incorporate PEP-3333 by reference [e.g. by saying 'any HTTP/1.{0,1} > request will be processed as per PEP-3333'] > or > - do we want to alter how HTTP/1.{0,1} requests are presented (e.g. > tackling encoding of headers etc) Timing the question out: I'm going with the latter case: a clean new spec with consistent handling of feature that are common to all the supported protocols, and folk that want existing things to keep running wrap them with an adapter we'll provide. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Thu Sep 25 20:52:05 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 06:52:05 +1200 Subject: [Web-SIG] WSGI server handling absolute URI In-Reply-To: References: Message-ID: I think this makes sense to address - have replied to the ticket. 
but tl;dr: inconsistency in this space is likely to provoke bugs, so lets make things consistent. -Rob On 12 May 2014 00:45, mouad ben wrote: > Hello, > > My name is Mouad and this is my first time writing to this mailing list. > > I hope this is the right mailing list to let interested party to know about > a "minor" bug that i found in some WSGI server that are out there, that > doesn't support absolute URI in an raw http request i.e. > > GET http://domain.com/path HTTP/1.1 > > I have created an issue for WGIREF http://bugs.python.org/issue21472, and i > am waiting for feedback from cPython core developers, and most > **importantly** this gist https://gist.github.com/mouadino/7930974 that show > who support this feature and who doesn't. > > HTH, > > -- > Mouad Benchchaoui > > _______________________________________________ > Web-SIG mailing list > Web-SIG at python.org > Web SIG: http://www.python.org/sigs/web-sig > Unsubscribe: > https://mail.python.org/mailman/options/web-sig/robertc%40robertcollins.net > -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Thu Sep 25 22:44:30 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 08:44:30 +1200 Subject: [Web-SIG] Nodejs cluster In-Reply-To: <634914A010D0B943A035D226786325D44473AC4860@EXVMBX020-12.exch020.serverdata.net> References: <20140318123710.6218.1600151976.divmod.xquotient.314@top> <53284424.5090708@simplistix.co.uk> <634914A010D0B943A035D226786325D44473AC4860@EXVMBX020-12.exch020.serverdata.net> Message-ID: On 19 March 2014 22:02, Tobias Oberstein wrote: > We are working on a system (on top of Autobahn) which provides builtin scale-up (multi-core) and scale-out (multi-node) capabilities: > > https://github.com/crossbario/crossbar > > This is work in progress and relies on WAMPv2. Here are a couple of links ... Interesting. That looks very similar to Mongrel2. 
-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Fri Sep 26 05:32:24 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 15:32:24 +1200 Subject: [Web-SIG] WSGI2: write callable? Message-ID: Is the write callable still needed? Its documented as a undesirable thunk in PEP-3333; is there a good reason to keep it, or can we make start_response return None and require the use of a generator to supply content for the body? (Remembering that for backwards compatibility we're going to write an adapter, and a generator adapter is straightforward (if tedious) using a threading.Queue). I haven't done a survey, but I don't recall seeing anything except bespoke WSGI code that used the write interface - all the frameworks I've seen in some time use the iterator protocol. So I propose we drop the write callable, and include a queue based implementation in the adapter for PEP-3333 code. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Fri Sep 26 05:48:16 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 15:48:16 +1200 Subject: [Web-SIG] WSGI: start_response buffers headers Message-ID: https://github.com/python-web-sig/wsgi-ng/issues/4 So doing bidirectional streaming is somewhat incompatible with requiring that headers be buffered until actual server side content is available (what if the client is meant to write first... but is itself waiting for the stream to be established). I think making an empty bytestream flush the headers would be sufficient, and preserve much of the niceness. 
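For the queue-based adapter mentioned in the write-callable proposal above, a minimal sketch might look like the following. It assumes the stricter interface under discussion (the server's start_response returns None and the body comes only from iteration); `without_write` is an illustrative name, not part of any spec, and the standard-library `queue.Queue` plays the role of the queue. The legacy app runs in a worker thread so that its write() calls and its iterated chunks arrive through a single ordered queue.

```python
import queue
import threading


def without_write(app):
    """Wrap a PEP 3333 app so its output is exposed only through iteration.

    Hypothetical helper: the legacy app may still call write(), but the
    server-facing side never needs to provide a write callable.
    """

    def adapted(environ, start_response):
        events = queue.Queue()  # ordered stream of (kind, payload) pairs

        def legacy_start_response(status, headers, exc_info=None):
            events.put(("start", (status, headers, exc_info)))
            # The legacy write() callable simply enqueues body data.
            return lambda data: events.put(("data", data))

        def run():
            try:
                result = app(environ, legacy_start_response)
                try:
                    for chunk in result:
                        events.put(("data", chunk))
                finally:
                    close = getattr(result, "close", None)
                    if close is not None:
                        close()
            except BaseException as exc:
                events.put(("error", exc))
            finally:
                events.put(("done", None))

        threading.Thread(target=run, daemon=True).start()

        while True:
            kind, payload = events.get()
            if kind == "start":
                # Replayed in the consuming thread, before any body is yielded.
                start_response(*payload)
            elif kind == "data":
                yield payload
            elif kind == "error":
                raise payload
            else:
                return

    return adapted
```

This sketch uses one thread per request and an unbounded queue, so the legacy app can run ahead of the consumer; a real adapter would probably want a bounded queue for backpressure.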
-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Fri Sep 26 06:14:10 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 16:14:10 +1200 Subject: [Web-SIG] WSGI: allowing short reads Message-ID: https://github.com/python-web-sig/wsgi-ng/issues/5 tl;dr - we don't specify whether read(size) has to return size bytes or just not more than size, today. the IO library is clear that read(n) returns up to n, and also offers read1 that guarantees only one read call. I don't think we need read1 (perhaps I'm wrong) but making read consistent with the io library would be good, I think - particularly for websockets. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From dirkjan at ochtman.nl Fri Sep 26 08:14:29 2014 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Fri, 26 Sep 2014 08:14:29 +0200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 5:32 AM, Robert Collins wrote: > So I propose we drop the write callable, and include a queue based > implementation in the adapter for PEP-3333 code. +1. Cheers, Dirkjan From dirkjan at ochtman.nl Fri Sep 26 08:16:33 2014 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Fri, 26 Sep 2014 08:16:33 +0200 Subject: [Web-SIG] WSGI: allowing short reads In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 6:14 AM, Robert Collins wrote: > I don't think we need read1 (perhaps I'm wrong) but making read > consistent with the io library would be good, I think - particularly > for websockets. I would agree, but for websockets, I'd really want a per-frame generator or something. I've always used JSON messages that fit in a WebSockets frame, so I don't actually need to look for message/object boundaries. 
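To illustrate the short-read semantics raised in this thread: if read(size) may return fewer than size bytes, as the io library allows, a caller that needs an exact count has to loop until it has enough or hits end of stream. A minimal sketch, where `read_exactly` and `TrickleStream` are hypothetical names used only for illustration (neither is part of WSGI or io):

```python
import io


class TrickleStream:
    """Toy input stream that never returns more than `limit` bytes per read."""

    def __init__(self, data, limit=3):
        self._buffer = io.BytesIO(data)
        self._limit = limit

    def read(self, size):
        # A short read: up to `size` bytes, possibly fewer.
        return self._buffer.read(min(size, self._limit))


def read_exactly(stream, size):
    """Keep reading until `size` bytes are collected or the stream ends."""
    parts = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # empty bytes means end of stream
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

With short reads allowed, the loop lives in the application (or a helper library) rather than in every server.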
Cheers, Dirkjan From robertc at robertcollins.net Fri Sep 26 10:06:58 2014 From: robertc at robertcollins.net (Robert Collins) Date: Fri, 26 Sep 2014 20:06:58 +1200 Subject: [Web-SIG] WSGI: allowing short reads In-Reply-To: References: Message-ID: On 26 September 2014 18:16, Dirkjan Ochtman wrote: > On Fri, Sep 26, 2014 at 6:14 AM, Robert Collins > wrote: >> I don't think we need read1 (perhaps I'm wrong) but making read >> consistent with the io library would be good, I think - particularly >> for websockets. > > I would agree, but for websockets, I'd really want a per-frame > generator or something. I've always used JSON messages that fit in a > WebSockets frame, so I don't actually need to look for message/object > boundaries. Ok, we can drill down into that more deeply in the websocket context. I guess a bunch of context around here will depend on whether websockets is implemented as a primitive or if we want to make it possible to implement websockets on top of a more basic primitive. mod_wsgi (and other containers) may not be able to expose a raw 'socket' at any point, so I'd lean towards exposing a structured thing. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From pje at telecommunity.com Fri Sep 26 22:07:03 2014 From: pje at telecommunity.com (PJ Eby) Date: Fri, 26 Sep 2014 16:07:03 -0400 Subject: [Web-SIG] WSGI: start_response buffers headers In-Reply-To: References: Message-ID: On Thu, Sep 25, 2014 at 11:48 PM, Robert Collins wrote: > I think making an empty bytestream flush the headers would be > sufficient, and preserve much of the niceness. FWIW, switching to the `app(environ) -> status, headers, body` calling signature gets rid of this issue as well, since there is no longer a start_response. ;-) From pje at telecommunity.com Fri Sep 26 21:58:13 2014 From: pje at telecommunity.com (PJ Eby) Date: Fri, 26 Sep 2014 15:58:13 -0400 Subject: [Web-SIG] WSGI2: write callable? 
In-Reply-To: References: Message-ID: On Thu, Sep 25, 2014 at 11:32 PM, Robert Collins wrote: > So I propose we drop the write callable, and include a queue based > implementation in the adapter for PEP-3333 code. If you're dropping write(), then you might as well drop start_response() altogether, and replace it with returning a (status, headers, body-iterator) tuple, as in wsgi_lite ( https://github.com/pjeby/wsgi_lite ) or as found in other languages' versions of WSGI. (start_response+write was only ever needed in order to support legacy apps, so other languages never bothered.) wsgi_lite has a couple of other protocol extensions, namely the 'wsgi_lite.closing' environment key, flagging callables' supported WSGI version (for transparent interop), and the argument binding protocol, but for the most part these are orthogonal to the calling schema. I would suggest, however, that the calling protocol be flagged in some way to allow easier interop. From bchesneau at gmail.com Fri Sep 26 22:21:57 2014 From: bchesneau at gmail.com (Benoit Chesneau) Date: Fri, 26 Sep 2014 22:21:57 +0200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 5:32 AM, Robert Collins wrote: > Is the write callable still needed? Its documented as a undesirable > thunk in PEP-3333; is there a good reason to keep it, or can we make > start_response return None and require the use of a generator to > supply content for the body? > > (Remembering that for backwards compatibility we're going to write an > adapter, and a generator adapter is straightforward (if tedious) using > a threading.Queue). > > I haven't done a survey, but I don't recall seeing anything except > bespoke WSGI code that used the write interface - all the frameworks > I've seen in some time use the iterator protocol. > > So I propose we drop the write callable, and include a queue based > implementation in the adapter for PEP-3333 code. 
> > -Rob > > What would be the advantage of using a queue compared to simply write to the server? Internally the server can use queue, but why the client should know it? What is the reasoning behind it? - benoit > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud From bchesneau at gmail.com Fri Sep 26 22:26:04 2014 From: bchesneau at gmail.com (Benoit Chesneau) Date: Fri, 26 Sep 2014 22:26:04 +0200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 9:58 PM, PJ Eby wrote: > On Thu, Sep 25, 2014 at 11:32 PM, Robert Collins > wrote: > > So I propose we drop the write callable, and include a queue based > > implementation in the adapter for PEP-3333 code. > > If you're dropping write(), then you might as well drop > start_response() altogether, and replace it with returning a (status, > headers, body-iterator) tuple, as in wsgi_lite ( > https://github.com/pjeby/wsgi_lite ) or as found in other languages' > versions of WSGI. (start_response+write was only ever needed in order > to support legacy apps, so other languages never bothered.) > > wsgi_lite has a couple of other protocol extensions, namely the > 'wsgi_lite.closing' environment key, flagging callables' supported > WSGI version (for transparent interop), and the argument binding > protocol, but for the most part these are orthogonal to the calling > schema. I would suggest, however, that the calling protocol be > flagged in some way to allow easier interop. > I quite like the idea of always returning an iterator for the body it would simplify the code a lot...
About returning the status and other things, I quite agree, but imo we also need to return an extra parameter where the application or the middleware could maintain a state or something like it. Thoughts? - benoit From robertc at robertcollins.net Fri Sep 26 23:02:17 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 27 Sep 2014 09:02:17 +1200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On 27 September 2014 07:58, PJ Eby wrote: > On Thu, Sep 25, 2014 at 11:32 PM, Robert Collins > wrote: >> So I propose we drop the write callable, and include a queue based >> implementation in the adapter for PEP-3333 code. > > If you're dropping write(), then you might as well drop > start_response() altogether, and replace it with returning a (status, > headers, body-iterator) tuple, as in wsgi_lite ( > https://github.com/pjeby/wsgi_lite ) or as found in other languages' > versions of WSGI. (start_response+write was only ever needed in order > to support legacy apps, so other languages never bothered.) Ahha! useful history. That would save a load of complexity on server and middleware authors' behalf. HTTP/2 has moved status to a pseudo header and dropped the status reason, so we could also phrase this as: (headers, body-iterator) Also, (filing an issue now) we also need to support Trailers, which HTTP/2 has preserved. So we need a final header block facility as well.
That structure would lead to (headers, body-iterator, trailers-callback). But perhaps it would be nicer to say: iterator of headers_dict_or_body_bytes, with the first item yielded having to be headers (or an error is thrown), and the last item yielded optionally being a dict to emit trailers. So:

    def app(environ):
        yield {':status': '200'}
        yield b'hello world'
        yield {'Foo': 'Bar'}

is an entirely valid, if trivial, app. What do you think? > wsgi_lite has a couple of other protocol extensions, namely the > 'wsgi_lite.closing' environment key, flagging callables' supported > WSGI version (for transparent interop), and the argument binding > protocol, but for the most part these are orthogonal to the calling > schema. I would suggest, however, that the calling protocol be > flagged in some way to allow easier interop. We're bumping the WSGI version, will that serve as a sufficient flag? The closing thing is nice - it's basically unittest.TestCase.addCleanup for WSGI, allowing apps to not have to write a deep nested finally. Let's start a new thread about the design for that specifically. You note that exception management isn't defined yet - perhaps we can tackle that as a group? -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Fri Sep 26 23:05:54 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 27 Sep 2014 09:05:54 +1200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On 27 September 2014 08:21, Benoit Chesneau wrote: > > > On Fri, Sep 26, 2014 at 5:32 AM, Robert Collins > wrote: ... >> So I propose we drop the write callable, and include a queue based >> implementation in the adapter for PEP-3333 code. >> >> -Rob >> > > What would be the advantage of using a queue compared to simply write to the > server? Internally the server can use queue, but why the client should know > it? What is the reasoning behind it?
The point is to remove the complexity of having both an iterator over content *and* a write method. Thats really complex for server [and middleware] writers. So the interface to send bytes to the container would just be 'yield them'. (Or return a fully populated list). So the point about the Queue is that to support PEP-3333 we either need to retain the write() callable, or we need an adapter that can expose on its upper side the iterator we want, and on the lower side accept *either* an iterator *or* use of write() method - I think you'll find thats quite hard to write without a Queue or similar construct. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From pje at telecommunity.com Sat Sep 27 00:31:33 2014 From: pje at telecommunity.com (PJ Eby) Date: Fri, 26 Sep 2014 18:31:33 -0400 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 5:02 PM, Robert Collins wrote: > But perhaps it would be nicer to say: > iterator of headers_dict_or_body_bytes > With the first item yielded having to be headers (or error thrown),and > the last item yielded may be a dict to emit trailers. > > So: > def app(environ): > yield {':status': '200'} > yield b'hello world' > yield {'Foo': 'Bar'} > > is an entirely valid, if trivial, app. > > What do you think? I think this would make it harder to write middleware, actually, and for the same reason that I dislike folding status into the headers. It's a case of "flat is better than nested", I think, in both cases. That is, if the status is always required, it's easier to validate its presence in a 3-tuple than nested inside another data structure. As far as trailers go, I'm not sure what those are used for or how they'd be used in practice, but my initial thought is that they should be attached to the response body, analagous to how FileWrapper works. 
The other alternative is to use a dict as the response object (analagous to environ as the request object), with named keys for status, headers, trailers, body, etc. It would then be extensible to handle things like the "Associated content" concept. In this way, middleware that is simply passing things through unchanged can do so, while middleware that is creating a new response can discard the old object. >> wsgi_lite has a couple of other protocol extensions, namely the >> 'wsgi_lite.closing' environment key, flagging callables' supported >> WSGI version (for transparent interop), and the argument binding >> protocol, but for the most part these are orthogonal to the calling >> schema. I would suggest, however, that the calling protocol be >> flagged in some way to allow easier interop. > > We're bumping the WSGI version, will that serve as a sufficient flag? I mean, flagged on the app end. For example, wsgi_lite marks apps that support wsgi_lite with a true-valued `__wsgi_lite__` attribute. In this way, a container invoking the app knows it can be called with just an environ (and no start_response). So, I'm saying that an app callable would opt in to this new WSGI version, so that servers and middleware don't need to grow new APIs for registering apps -- they can auto-detect. Also, having auto-detection means you can write a decorator (e.g. in wsgiref), to wrap and convert WSGI 1 apps to WSGI 2, without needing to know if you're passing something already wrapped. It means that a WSGI 2 server or middleware can just wrap whatever apps it sees, and get back a WSGI 2 app, whether the thing it got was WSGI 1 or WSGI 2. > The closing thing is nice - its basically unittest.TestCase.addCleanup > for WSGI, allowing apps to not have to write a deep nested finally. > Lets start a new thread about the design for that specifically. You > note that exception management isn't defined yet - perhaps we can > tackle that as a group? Sure. 
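The opt-in flag and auto-detecting wrapper described in this message could be sketched as below. The `__wsgi2__` attribute name and the `wsgi2` / `ensure_wsgi2` helpers are hypothetical, modelled on the `__wsgi_lite__` attribute mentioned above; the calling convention assumed is `app(environ) -> (status, headers, body)`. The sketch does not support the legacy write() callable and omits close() handling for brevity.

```python
import itertools


def wsgi2(app):
    """Mark a callable as speaking the new protocol (hypothetical flag name)."""
    app.__wsgi2__ = True
    return app


def ensure_wsgi2(app):
    """Return a new-style app, wrapping PEP 3333 apps and passing others through."""
    if getattr(app, "__wsgi2__", False):
        return app  # already new-style (or already wrapped): nothing to do

    @wsgi2
    def adapted(environ):
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)

            def write(data):
                raise NotImplementedError("write() is not supported in this sketch")

            return write

        result = iter(app(environ, start_response))
        # A PEP 3333 app may defer start_response until its first iteration,
        # so pull one chunk before reading the captured status and headers.
        first = next(result, None)
        body = result if first is None else itertools.chain([first], result)
        return captured["status"], captured["headers"], body

    return adapted
```

Because the wrapper marks its own output, a server or middleware can call `ensure_wsgi2` on whatever it is handed without knowing whether it was already wrapped.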
From robertc at robertcollins.net Sat Sep 27 01:41:59 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 27 Sep 2014 11:41:59 +1200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On 27 September 2014 10:31, PJ Eby wrote: > On Fri, Sep 26, 2014 at 5:02 PM, Robert Collins > wrote: >> But perhaps it would be nicer to say: >> iterator of headers_dict_or_body_bytes >> With the first item yielded having to be headers (or error thrown),and >> the last item yielded may be a dict to emit trailers. >> >> So: >> def app(environ): >> yield {':status': '200'} >> yield b'hello world' >> yield {'Foo': 'Bar'} >> >> is an entirely valid, if trivial, app. >> >> What do you think? > > I think this would make it harder to write middleware, actually, and > for the same reason that I dislike folding status into the headers. > It's a case of "flat is better than nested", I think, in both cases. > That is, if the status is always required, it's easier to validate its > presence in a 3-tuple than nested inside another data structure. I'm intrigued here - validation of the status code is tied into into the details of the headers. For instance, 301/302 need a Location header to be valid. So I don't understand how its any easier with status split out. I'd be delighted to whip up a few constrasting middleware samples to let us compare and contrast. Note too that folk can still return bad status codes with a different layout (status, headers, body, trailers) return None, {}, [], {} One thing we could do with the status code in the headers dict is to default to 200 - the vastly common case (in the same way that throwing an error generates a 500). Then status wouldn't be required at all for trivial uses. That would make things easier, no? 
> As > far as trailers go, I'm not sure what those are used for or how they'd > be used in practice, but my initial thought is that they should be > attached to the response body, analagous to how FileWrapper works. So a classic example for Trailers is digitally signing streamed content. Using the same strawman API as above:

    import hashlib

    def app(environ):
        yield {':status': '200'}
        md5sum = hashlib.md5()
        # block_reader and sign_bytes are helpers assumed to exist elsewhere
        for chunk in block_reader(open('foo', 'rb'), 65536):
            md5sum.update(chunk)
            yield chunk
        digest = md5sum.hexdigest()
        signature = sign_bytes(digest.encode('utf8'))
        yield {'Content-MD5Sum': digest, 'X-Signature': signature}

Note that this doesn't need to buffer or use a closure. Writing that with a callback for trailers (which is the only alternative - it's either a callback or a generator - because until the body is fully handled the content of the trailers cannot be determined):

    def app(environ):
        md5sum = hashlib.md5()
        def body():
            for chunk in block_reader(open('foo', 'rb'), 65536):
                md5sum.update(chunk)
                yield chunk
        def trailers():
            # only meaningful once body() has been fully consumed
            digest = md5sum.hexdigest()
            signature = sign_bytes(digest.encode('utf8'))
            return {'Content-MD5Sum': digest, 'X-Signature': signature}
        return '200', {}, body(), trailers

> The other alternative is to use a dict as the response object > (analagous to environ as the request object), with named keys for > status, headers, trailers, body, etc. It would then be extensible to > handle things like the "Associated content" concept. That might work, though it will force more closures. One of the things I like about the generator style is the clarity in code that we can achieve. > In this way, middleware that is simply passing things through > unchanged can do so, while middleware that is creating a new response > can discard the old object. That seems to apply either way, right?
Here's a body-size logging middleware:

    import logging

    def logger(app):
        def middleware(environ):
            wrapped = app(environ)
            yield next(wrapped)
            body_bytes = 0
            for maybe_body in wrapped:
                if type(maybe_body) is bytes:
                    body_bytes += len(maybe_body)
                yield maybe_body
            logging.info("Saw %d bytes for %s" % (body_bytes, environ['PATH_INFO']))
        return middleware

.. >> We're bumping the WSGI version, will that serve as a sufficient flag? > > I mean, flagged on the app end. For example, wsgi_lite marks apps > that support wsgi_lite with a true-valued `__wsgi_lite__` attribute. > In this way, a container invoking the app knows it can be called with > just an environ (and no start_response). OK, so we'd use the absence of such a mark to trigger the WSGI1 adapter automagically? I'm curious if that will work well enough when we are given wsgi_lite or other extensions to wsgi. Perhaps we should refuse to guess and just supply the adapters and instructions? > So, I'm saying that an app callable would opt in to this new WSGI > version, so that servers and middleware don't need to grow new APIs > for registering apps -- they can auto-detect. Also, having > auto-detection means you can write a decorator (e.g. in wsgiref), to > wrap and convert WSGI 1 apps to WSGI 2, without needing to know if > you're passing something already wrapped. It means that a WSGI 2 > server or middleware can just wrap whatever apps it sees, and get back > a WSGI 2 app, whether the thing it got was WSGI 1 or WSGI 2. That's certainly a desirable property. If we've changed things too much to infer by the basic structure then we'll need some metadata for it. Works for me - I'd like to have a decorator for that:

    def logger(app):
        @wsgi2
        def middleware(environ):
            ...
        return middleware

-Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From pje at telecommunity.com Sat Sep 27 05:44:56 2014 From: pje at telecommunity.com (PJ Eby) Date: Fri, 26 Sep 2014 23:44:56 -0400 Subject: [Web-SIG] WSGI2: write callable?
In-Reply-To: References: Message-ID: On Fri, Sep 26, 2014 at 7:41 PM, Robert Collins wrote: > One thing we could do with the status code in the headers dict is to > default to 200 - the vastly common case (in the same way that throwing > an error generates a 500). Then status wouldn't be required at all for > trivial uses. That would make things easier, no? At the cost of variation. A core design principle of WSGI is that variations make things *harder*, not easier, because it means more alternatives that apps, servers, and middleware have to support, with more code paths and fewer of them properly tested. Every variation that is part of the spec (as opposed to an extension), creates a LOT of complexity in the field. (Which is one reason it'll be nice to get rid of start_response(), and all its convoluted sequencing logics.) > So a classic example for Trailers is digitally signing streamed > content. Using the same strawman API as above: > > def app(environ): > yield {':status': '200} > md5sum = md5.new() > for bytes in block_reader(open('foo', 'rb'), 65536): > md5sum.update(bytes) > yield bytes > digest = md5sum.hexdigest() > signature = sign_bytes(digest.encode('utf8')) > yield {'Content-MD5Sum': digest, 'X-Signature': signature} > > Note that this doesn't need to buffer or use a closure. Please bear in mind that another core WSGI design principle is that we don't make apps easier to write by making servers and middleware harder to write. That kills adoption and growth, because the audience that *needs* to adopt WSGI (or any successor standard) is the audience of people who write servers and middleware. If a feature is sinfully ugly for the app writer, but a thing of beauty for a middleware author, we *want* that feature. 
Conversely, if a feature means that *every* piece of middleware now has to add an extra "if" statement to support the feature in order to make it pretty for the app writer, then we do NOT want that feature, and it should be taken out and shot *at once*. It's not a fair tradeoff, because only server authors and middleware authors *have to* deal with WSGI directly. App authors can use libraries to pretty it up, so we don't need to pretty it for them in advance -- especially since we don't know what their *personal* idea of pretty is going to be. ;-) The above API is cute and clean for the app writer, but for a middleware writer it's a barrel of misery. *Every* piece of middleware that even wants to *read* anything from the response (let alone modify it), now needs to check types of yielded values, accumulate headers, and maybe buffer content. And there are many ways to write that middleware that will be wrong, but *appear* right because the author didn't think of all the ways that an app could violate the middleware author's assumptions. On the other hand, if somebody wants to make a library implementing a similar API to your proposal *on top* of WSGI, then sure, why not? That's fine: it only adds overhead at a *single point*": the library that implements the pretty API on top of WSGI. > Writing that with a callback for trailers (which is the only > alternative - its either a callback or a generator - because until the > body is fully handled the content of the trailers cannot be > determined): Doesn't look bad to me. It'd also be fine as a method on the response body, and that would let us stick to (status, headers, body) as a return value. >> The other alternative is to use a dict as the response object >> (analagous to environ as the request object), with named keys for >> status, headers, trailers, body, etc. It would then be extensible to >> handle things like the "Associated content" concept. > > That might work, though it will force more closures. 
One of the things > I like about the generator style is the clarity in code that we can > achieve. Please try to think instead of how you could implement those things in a "make it nice" API for app authors. WSGI wasn't made ugly on a whim; it's the direct result of some very important design principles. While the need for start_response() is gone, many of the other reasons for its ugliness remain. (In any case, you can still implement a generator-based API for writing WSGI apps, without needing to make WSGI *itself* be implemented that way.) > Here's a body-size logging middleware: > > def logger(app): > def middleware(environ): > wrapped = app(environ) > yield next(wrapped) > body_bytes = 0 > for maybe_body in wrapped: > if type(maybe_body) is bytes: > body_bytes += len(maybe_body) > yield maybe_body > logging.info("Saw %d bytes for %s" % (body_bytes, environ['PATH_INFO'])) > return middleware Perhaps you meant this as a sketch, but note that you're not calling close() on the underlying iterator. At minimum, you need a try/finally to do that, or else you need to use the wsgi_lite closing extension -- and you need to assume that your parent middleware or server is calling the closing extension on your response as well. (Just another issue with implementing the core API based on generators, since a generator function doesn't have access to its own return result -- i.e., generator instance.) After your original proposal, I actually gave some thought to the benefits of implementing a pure-generator-based WSGI. In theory, there are some good things you can do with it. In practice, though, you pay a fairly high price in complexity for everything but the already-complicated cases. To put it another way, the common case for WSGI always was -- and mostly still is -- to return an entire HTTP response in one go, without any streaming or buffering or anything of that sort. And simple things should be simple, with complex things still being possible. 
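The close() point above can be made concrete. Below is a sketch of the body-size logger written against the same strawman generator API (first item a headers dict, then bytes chunks, optionally a final trailers dict), with a try/finally so that the wrapped iterator is closed even when the consumer stops early. The strawman API itself is only a proposal in this thread, not an agreed interface.

```python
import logging


def logger(app):
    def middleware(environ):
        wrapped = app(environ)
        body_bytes = 0
        try:
            yield next(wrapped)  # the headers dict comes first
            for item in wrapped:
                if isinstance(item, bytes):  # skip a trailing trailers dict
                    body_bytes += len(item)
                yield item
        finally:
            # Runs on normal exhaustion, on errors, and when the consumer
            # calls close() on this generator part-way through.
            close = getattr(wrapped, "close", None)
            if close is not None:
                close()
            logging.info("Saw %d bytes for %s", body_bytes, environ["PATH_INFO"])

    return middleware
```

Note that this only works if whoever consumes the middleware's generator also calls close() on it (or exhausts it), which is the assumption about the parent server or middleware mentioned above.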
Unfortunately, making the raw API generator-based benefits the complex cases at the expense of the simple ones, at the middleware level. It should be *possible* to do the fancy things, but not at the expense of making every piece of middleware more complex. Not as long as app writers can use an app-level library to get a nicer API, but middleware and server authors have to *always* deal with the bare metal. So, let's trim the sharp edges for the poor middleware and server developers, rather than polishing the bits that app writers aren't going to be using, anyway. (Since most of them are going to be using Django, Pyramid, Flask, or whatever the latest hotness is, anyway.) > Thats certainly a desirable property. If we've changed things too much > to infer by the basic structure then we'll need some metadata for it. > Works for me - I'd like to have a decorator for that: A decorator is an API; WSGI is a protocol. Of course people can use decorators to implement the protocol, and wsgiref2 (or whatever) should include some. But the spec should define what metadata the decorator would expose, rather than dictating the use of any particular decorator. From robertc at robertcollins.net Sat Sep 27 06:20:14 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 27 Sep 2014 16:20:14 +1200 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On 27 September 2014 15:44, PJ Eby wrote: > On Fri, Sep 26, 2014 at 7:41 PM, Robert Collins > wrote: >> One thing we could do with the status code in the headers dict is to >> default to 200 - the vastly common case (in the same way that throwing >> an error generates a 500). Then status wouldn't be required at all for >> trivial uses. That would make things easier, no? > > At the cost of variation. 
A core design principle of WSGI is that > variations make things *harder*, not easier, because it means more > alternatives that apps, servers, and middleware have to support, with > more code paths and fewer of them properly tested. Every variation > that is part of the spec (as opposed to an extension), creates a LOT > of complexity in the field. (Which is one reason it'll be nice to get > rid of start_response(), and all its convoluted sequencing logics.) We should capture these design principles somewhere FAQ-like, since many of the folk participating in this rework weren't part of the original design. Right now, anything providing the server profile has to cope with exceptions and translate those to 500 errors, so we have the variation of 'status and headers may not be provided'. Most middleware can be oblivious and delegate this to the server via bubble-up. I suspect the same would work for a default of 200 - 99% of middleware would ignore it and it would just work. However, I'm not super attached - it was just an idea. >> So a classic example for Trailers is digitally signing streamed >> content. Using the same strawman API as above: >> >> def app(environ): >> yield {':status': '200} >> md5sum = md5.new() >> for bytes in block_reader(open('foo', 'rb'), 65536): >> md5sum.update(bytes) >> yield bytes >> digest = md5sum.hexdigest() >> signature = sign_bytes(digest.encode('utf8')) >> yield {'Content-MD5Sum': digest, 'X-Signature': signature} >> >> Note that this doesn't need to buffer or use a closure. > > Please bear in mind that another core WSGI design principle is that we > don't make apps easier to write by making servers and middleware > harder to write. That kills adoption and growth, because the audience > that *needs* to adopt WSGI (or any successor standard) is the audience > of people who write servers and middleware. If a feature is sinfully > ugly for the app writer, but a thing of beauty for a middleware > author, we *want* that feature. 
I get that to a degree - I think there is a balance to be struck. This is why I'd like to put a few middleware examples together to compare and contrast different start_response replacement APIs. > Conversely, if a feature means that *every* piece of middleware now > has to add an extra "if" statement to support the feature in order to > make it pretty for the app writer, then we do NOT want that feature, > and it should be taken out and shot *at once*. Agreed. > It's not a fair tradeoff, because only server authors and middleware > authors *have to* deal with WSGI directly. App authors can use > libraries to pretty it up, so we don't need to pretty it for them in > advance -- especially since we don't know what their *personal* idea > of pretty is going to be. ;-) Server authors and middleware authors can use libraries too: we can write functions to provide common handling for a bunch of stuff: thats not to say we should make things bad at the API level - we shouldn't - but it doesn't make sense to me to say that folk writing middleware cannot use libraries. > The above API is cute and clean for the app writer, but for a > middleware writer it's a barrel of misery. *Every* piece of > middleware that even wants to *read* anything from the response (let > alone modify it), now needs to check types of yielded values, > accumulate headers, and maybe buffer content. And there are many ways > to write that middleware that will be wrong, but *appear* right > because the author didn't think of all the ways that an app could > violate the middleware author's assumptions. Hang on, why would they buffer content? Buffering response content is currently verboten, and I haven't seen any proposal to change that. I don't understand how phrasing the API as I suggested would lead to buffering being permitted or required. The type checking does squick me a little, so back to drawing board. 
> On the other hand, if somebody wants to make a library implementing a > similar API to your proposal *on top* of WSGI, then sure, why not? > That's fine: it only adds overhead at a *single point*: the library > that implements the pretty API on top of WSGI. > > >> Writing that with a callback for trailers (which is the only >> alternative - it's either a callback or a generator - because until the >> body is fully handled the content of the trailers cannot be >> determined): > > Doesn't look bad to me. It'd also be fine as a method on the response > body, and that would let us stick to (status, headers, body) as a > return value. If it's a method on the response body, then returning a list or generator no longer works, unless you start poking random attributes onto things. It would also be inconsistent - why would trailers be a method on the response, but headers be a dict in the return value? >>> The other alternative is to use a dict as the response object >>> (analogous to environ as the request object), with named keys for >>> status, headers, trailers, body, etc. It would then be extensible to >>> handle things like the "Associated content" concept. >> >> That might work, though it will force more closures. One of the things >> I like about the generator style is the clarity in code that we can >> achieve. > > Please try to think instead of how you could implement those things in > a "make it nice" API for app authors. WSGI wasn't made ugly on a > whim; it's the direct result of some very important design principles. > While the need for start_response() is gone, many of the other reasons > for its ugliness remain. > > (In any case, you can still implement a generator-based API for > writing WSGI apps, without needing to make WSGI *itself* be > implemented that way.) I don't think WSGI is ugly, but I do think that things have changed substantially in the Python world since it came to be, and we owe it to ourselves to investigate whether we can do better now.
Is there some documentation about the other reasons that it needs to be ugly - the last thing I want to do is waste folks' time suggesting things that won't work. >> Here's a body-size logging middleware: >> >> def logger(app): >> def middleware(environ): >> wrapped = app(environ) >> yield next(wrapped) >> body_bytes = 0 >> for maybe_body in wrapped: >> if type(maybe_body) is bytes: >> body_bytes += len(maybe_body) >> yield maybe_body >> logging.info("Saw %d bytes for %s" % (body_bytes, environ['PATH_INFO'])) >> return middleware > > Perhaps you meant this as a sketch, but note that you're not calling > close() on the underlying iterator. Indeed, I forgot that, and right after I replied to one of the issues saying the API would need a close method :). > At minimum, you need a > try/finally to do that, or else you need to use the wsgi_lite closing > extension -- and you need to assume that your parent middleware or > server is calling the closing extension on your response as well. > (Just another issue with implementing the core API based on > generators, since a generator function doesn't have access to its own > return result -- i.e., generator instance.) The original WSGI spec avoided defining objects on the basis of being extremely minimal, to ease adoption - and it's been a wild success. How much complexity are we starting to drive though, as we keep avoiding having an object - tuple return types, iterators with extra attributes. Would a defined ABC be a burden to implementors these days? I presume that it was the C servers like mod_python that we would have harmed previously? > After your original proposal, I actually gave some thought to the > benefits of implementing a pure-generator-based WSGI. In theory, > there are some good things you can do with it. In practice, though, > you pay a fairly high price in complexity for everything but the > already-complicated cases.
> To put it another way, the common case for WSGI always was -- and > mostly still is -- to return an entire HTTP response in one go, > without any streaming or buffering or anything of that sort. And > simple things should be simple, with complex things still being > possible. It would be interesting to get stats on that. The WSGI spec goes to great pains to require that streaming work and buffering be verboten (presumably excusable for middleware like JPEG->PNG transformers that simply cannot avoid buffering) - but even then they are required to yield a b'' AIUI. I know that the vast majority of things I write try not to buffer unless absolutely necessary, and HTTP's last three major revisions - 1.1, 1.1bis and 2 - have all had polish done around making streaming more reliable in the internet we live in. Content-Length is now an interesting aberration only really needed for large static content - the wire protocols have no need of it (they did in HTTP/0.9 and HTTP/1.0). It's useful for progress bars for big downloads, and that's about it. But your points about simple and complex are interesting. Middleware authors need to cater to everything - so making the simple simple doesn't make it simple for middleware authors - they don't get to opt out. It's only by making everything as simple - uncomplected[1] - as possible that we keep things easy for server and middleware authors. > Unfortunately, making the raw API generator-based benefits the complex > cases at the expense of the simple ones, at the middleware level. It > should be *possible* to do the fancy things, but not at the expense of > making every piece of middleware more complex. Not as long as app > writers can use an app-level library to get a nicer API, but > middleware and server authors have to *always* deal with the bare > metal. I still disagree that middleware and server authors cannot get a nicer API through libraries.
The difference between middleware or server and apps is that apps can choose not to care about things they don't care about, whereas middleware and servers have to care - but appropriate helper functions can still help them. > So, let's trim the sharp edges for the poor middleware and server > developers, rather than polishing the bits that app writers aren't > going to be using, anyway. (Since most of them are going to be using > Django, Pyramid, Flask, or whatever the latest hotness is, anyway.) Do you have a hitlist of such sharp edges you'd like to see catered for in this new spec? >> That's certainly a desirable property. If we've changed things too much >> to infer by the basic structure then we'll need some metadata for it. >> Works for me - I'd like to have a decorator for that: > > A decorator is an API; WSGI is a protocol. Of course people can use > decorators to implement the protocol, and wsgiref2 (or whatever) > should include some. But the spec should define what metadata the > decorator would expose, rather than dictating the use of any > particular decorator. That seems reasonable, presumably because any code we write will not be backported to older standard libraries. I think it would be a mistake to not think about the default experience as well though: we should specify the protocol, and offer a good default API on top of it. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From antoine at python.org Sat Sep 27 14:00:14 2014 From: antoine at python.org (Antoine Pitrou) Date: Sat, 27 Sep 2014 12:00:14 +0000 (UTC) Subject: [Web-SIG] WSGI: allowing short reads References: Message-ID: Hi, Robert Collins writes: > > https://github.com/python-web-sig/wsgi-ng/issues/5 > > tl;dr - we don't specify whether read(size) has to return size bytes > or just not more than size, today. the IO library is clear that > read(n) returns up to n, and also offers read1 that guarantees only > one read call. I think you've got things mixed up.
There are two different things in "the IO library" (which is really the 3.x IO stack): * buffered binary I/O has read(n) and read1(n): - read(n) will block until n bytes are received (except for non-blocking fds) - read1(n) will issue at most one system call and can return fewer than n bytes * raw binary I/O only has read(n): - read(n) will issue at most one system call and can return fewer than n bytes So, depending on which layer you are placing yourself on, one or the other of your statements is wrong. That said, it would be reasonable for WSGI to expose the raw I/O layer, IMHO. "Prettifying" libraries can wrap it inside a BufferedReader if they like. Note that I once proposed generalized prefetching on I/O streams, but it was rejected: https://mail.python.org/pipermail/python-ideas/2010-September/008179.html (skip to the prefetch() proposal) It was in the context of improving streamed unpickling, which is a problem a bit similar - but less horrible - to JSON unserializing; since then, the problem was solved in a different way by adding a framing layer to pickle protocol 4 :-). Regards Antoine. From pje at telecommunity.com Sat Sep 27 20:55:23 2014 From: pje at telecommunity.com (PJ Eby) Date: Sat, 27 Sep 2014 14:55:23 -0400 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Sat, Sep 27, 2014 at 12:20 AM, Robert Collins wrote: > We should capture these design principles somewhere FAQ-like, since > many of the folk participating in this rework weren't part of the > original design. A lot of it is in the PEP itself, albeit in ways that seem a lot more obscure now, 10 years later, than they did at the time of writing. It's also spread out among different parts, including the FAQ at the end. > Right now, anything providing the server profile has to cope with > exceptions and translate those to 500 errors, so we have the variation > of 'status and headers may not be provided'. 
Most middleware can be > oblivious and delegate this to the server via bubble-up. I suspect the > same would work for a default of 200 - 99% of middleware would ignore > it and it would just work. However, I'm not super attached - it was > just an idea. In the limit case, > >>> So a classic example for Trailers is digitally signing streamed >>> content. Using the same strawman API as above: >>> >>> def app(environ): >>> yield {':status': '200} >>> md5sum = md5.new() >>> for bytes in block_reader(open('foo', 'rb'), 65536): >>> md5sum.update(bytes) >>> yield bytes >>> digest = md5sum.hexdigest() >>> signature = sign_bytes(digest.encode('utf8')) >>> yield {'Content-MD5Sum': digest, 'X-Signature': signature} >>> >>> Note that this doesn't need to buffer or use a closure. >> Please bear in mind that another core WSGI design principle is that we >> don't make apps easier to write by making servers and middleware >> harder to write. That kills adoption and growth, because the audience >> that *needs* to adopt WSGI (or any successor standard) is the audience >> of people who write servers and middleware. If a feature is sinfully >> ugly for the app writer, but a thing of beauty for a middleware >> author, we *want* that feature. > > I get that to a degree - I think there is a balance to be struck. Actually, no, there *isn't*. That's the whole point: there is NO balance to be struck. WSGI was never intended to be an API for writing web applications. In fact, not only was it not intended for that, it was *explicitly* an *anti-goal*, and I don't think that any of the conditions that made it an anti-goal have changed. Any feature which is added solely to entice an end-consumer of WSGI (vs. a framework or library implementer) is 100% wasted. Why? Because only the most trivial apps can be written with raw WSGI. Real apps need a thousand tiny *other* features (routing, sessions, authentication, authorization, registration, etc. etc.)... 
Which means that even if you make an awesome WSGI API, nobody's going to use it. They're going to need libraries or frameworks, no matter what. WSGI itself cannot *possibly* compete with these libraries and frameworks. If WSGI 2 adds features that users want, and library/framework developers can reasonably add those features to *their* APIs, then there is a chance that they will do so. But if they have to throw out their whole existing paradigm to do that, or users have to abandon their framework to adopt the WSGI 2 paradigm, then nothing was really gained by the effort. Basically, going after end users puts you in a "boil the ocean" position. That is, a situation where you must more or less convince everybody to change at the same time in order for the standard to reach critical mass. However, if you are *not* trying to boil the ocean by attracting end users, then anything that you do to benefit them (at the expense of framework, middleware, or server authors) is pure waste, since the incremental strategy (that WSGI was based on in the first place) doesn't depend on end-users using the raw WSGI protocol. As the PEP itself explains: """But the mere existence of a WSGI spec does nothing to address the existing state of servers and frameworks for Python web applications. Server and framework authors and maintainers must actually implement WSGI for there to be any effect. However, since no existing servers or frameworks support WSGI, there is little immediate reward for an author who implements WSGI support. Thus, WSGI must be easy to implement, so that an author's initial investment in the interface can be reasonably low. Thus, simplicity of implementation on both the server and framework sides of the interface is absolutely critical to the utility of the WSGI interface, and is therefore the principal criterion for any design decisions. 
***Note, however, that simplicity of implementation for a framework author is not the same thing as ease of use for a web application author.*** WSGI presents an absolutely "no frills" interface to the framework author, because bells and whistles like response objects and cookie handling would just get in the way of existing frameworks' handling of these issues. Again, the goal of WSGI is to facilitate easy interconnection of existing servers and applications or frameworks, not to create a new web framework.""" If you replace "WSGI" with "WSGI 2" in the above, the rationale remains unchanged. >> It's not a fair tradeoff, because only server authors and middleware >> authors *have to* deal with WSGI directly. App authors can use >> libraries to pretty it up, so we don't need to pretty it for them in >> advance -- especially since we don't know what their *personal* idea >> of pretty is going to be. ;-) > > Server authors and middleware authors can use libraries too: we can > write functions to provide common handling for a bunch of stuff: thats > not to say we should make things bad at the API level - we shouldn't - > but it doesn't make sense to me to say that folk writing middleware > cannot use libraries. If the protocol is such that alternate paths have to be followed (the "if" conditions I alluded to), then the only way a library can remove this complexity is to implement a canonical form. But if it is *possible* to have a canonical form that doesn't require the alternate paths, then that means we should make that canonical form the spec in the first place. There is no point to creating alternate possibilities just so we can make a library to take them back out. ;-) As explained in the PEP, we want the protocol *itself* to provide simplicity for implementers who are adding support to existing tools. If libraries are required to implement the protocol, then the people implementing *those* libraries are the people we want to make things simple for. 
;-) Sure, it'd be awesome to provide good middleware facilities in a library, but we should design the underlying protocol so that it's not insanely difficult to make those libraries. ;-) >> The above API is cute and clean for the app writer, but for a >> middleware writer it's a barrel of misery. *Every* piece of >> middleware that even wants to *read* anything from the response (let >> alone modify it), now needs to check types of yielded values, >> accumulate headers, and maybe buffer content. And there are many ways >> to write that middleware that will be wrong, but *appear* right >> because the author didn't think of all the ways that an app could >> violate the middleware author's assumptions. > > Hang on, why would they buffer content? Buffering response content is > currently verboten, and I haven't seen any proposal to change that. I > don't understand how phrasing the API as I suggested would lead to > buffering being permitted or required. By "content" I was actually talking about the headers or other metadata. Sorry for the confusion. > If its a method on the response body, the returning a list or > generator no longer works, unless you start poking random attributes > onto things. It would also be inconsistent - why would trailers be a > method on the response, but headers be a dict in the return value? (FWIW, I never proposed making headers a dict. That's a bad idea, IMO.) As for returning a list or generator, I don't see why you can't do e.g. return status, headers, trailing_signature(body, ...) Where trailing_signature is a function that returns an iterator with appropriate annotation, wrapping the original iterable. That works whether body is a list or a generator or some other custom iterable. ("Poking random attributes onto things" isn't a requirement, IOW.) >> Please try to think instead of how you could implement those things in >> a "make it nice" API for app authors. 
WSGI wasn't made ugly on a >> whim; it's the direct result of some very important design principles. >> While the need for start_response() is gone, many of the other reasons >> for its ugliness remain. >> >> (In any case, you can still implement a generator-based API for >> writing WSGI apps, without needing to make WSGI *itself* be >> implemented that way.) > > I don't think WSGI is ugly, but I do think that things have changed > substantially in the python world since it came to be, and we owe it > to ourselves to investigate whether we can do better now. Sure -- the existence of bytes is an obvious win, as is the dropping of start_response. But if you want WSGI 2 to be *interoperable* with WSGI 1, or more precisely, if we want to support *tunneling* WSGI 2 through a WSGI 1 stack, then the design has to be at least somewhat constrained by WSGI 1. > Is there some documentation about the other reasons that it needs to > be ugly - last thing I want to do is waste folks time suggesting > things that won't work. There is really only one reason, that manifests itself in a variety of constraints. That reason is that the success or failure of the standard rests in the hands of those who implement tools (servers, middleware, libraries, and frameworks), not the hands of those who implement apps. Those are the people whose support is critical, so every decision turns in their favor wherever possible. Even the existence of the start_response()/write() kludge is there because at the time, many existing frameworks offered streaming via some sort of imperative, push-based "writing" API, rather than an iteration-based pulling one. > The original WSGI spec avoiding defining objects on the basis of being > extremely minimal, to ease adoption - and its been a wild success. How > much complexity are we starting to drive though, as we keep avoiding > having an object - tuple return types, iterators with extra > attributes. 
Would a defined ABC be a burden to implementors these > days? I presume that it was the C servers like mod_python that we > would have harmed previously? Yes. Or nowadays, mod_wsgi. As to whether it would be a burden, I couldn't say. (Also, bear in mind that other C-based servers and gateways integrate with WSGI, e.g. nginx IIRC.) In any case, the burden for *consuming* a response object should be less than the burden of *creating* a request object. Defining custom types in C is more work than just accessing attributes of a returned object. So, I don't see a problem with creating a response object per se. I was just thinking that with middleware, you really want to be able to mix and match what features are being returned with the response, so unless you use `__getattr__` proxying, or it's required that response objects allow arbitrary attributes to be added, then the paradigm "bag of related features in a dictionary" better fits the requirement than "return an object". >> To put it another way, the common case for WSGI always was -- and >> mostly still is -- to return an entire HTTP response in one go, >> without any streaming or buffering or anything of that sort. And >> simple things should be simple, with complex things still being >> possible. > > I would be interesting to get stats on that. The WSGI spec goes to > great pains to require that streaming work and buffering be verboten > (presumably excusable for middleware like JPEG->PNG transformers that > simply cannot avoid buffering) - but even then they are required to > yield a b'' AIUI. The design rule here is STASCTAP: simple things are simple, complex things are possible. Admittedly, the "empty yield" rule is a burden on middleware, but a necessary one to make streaming *possible*. The trade was in favor of framework authors supporting streaming, at the expense of middleware authors. (If I could do it over again, I think I'd prioritize things the other way. 
That is, weigh the interests of middleware authors more highly than those of framework authors, in the event of a trade-off between the two. Middleware combines the requirements of both sides of the interface, whereas servers and frameworks each have only a one-sided view of things. Prioritizing middleware over either side should produce a better protocol on balance, than trying to directly trade one end against the other.) > But your points about simple and complex are interesting. Middleware > authors need to cater to everything - so making the simple simple > doesn't make it simple for middleware authors - they don't get to opt > out. Its only by making everything as simple - uncomplected[1] - as > possible that we keep things easy for server and middleware authors. Right. > I still disagree that middleware and server authors cannot get a nicer > API through libraries. The different between middleware or server and > apps is that apps can choose not to care about things they don't care > about, whereas middleware and servers have to care - but appropriate > helper functions can still help them. But if we know what helper functions we would want to write, then we can just make the protocol be the result of calling the helpers, instead of making the protocol require the helpers. Then, the *app* would call the helpers, not the middleware, which puts all the wrapping at the edge of the system instead of ubiquitous unwrapping and rewrapping. >> So, let's trim the sharp edges for the poor middleware and server >> developers, rather than polishing the bits that app writers aren't >> going to be using, anyway. (Since most of them are going to be using >> Django, Pyramid, Flask, or whatever the latest hotness is, anyway.) > > Do you have a hitlist of such sharp edges you'd like to see catered > for in this new spec? The ones described in the wsgi_lite docs: 1. People forgetting that the environ is volatile 2. People forgetting to close() 3. 
The horror that is the stateful nature of the current protocol (all the rules on what can be called when) In wsgi_lite I addressed #1 by providing the binding protocol to map desired request data to keyword arguments. #2, by the "closing" extension, and #3 by switching to a functional paradigm rather than an imperative one. (Thus eliminating any rules on what can be called when, because the response is a return value, not an invocation of something.) > That seems reasonable, presumably because any code we write will not > be backported to older standard libraries. I think it would be a > mistake to not think about the default experience as well though: we > should specify the protocol, and offer a good default API on top of > it. I think maybe you're confused about *whose* default experience is to be catered to. ;-) In my estimation, the framework developers who want to expose their apps as WSGI 2 will be adding metadata to their library, not importing a decorator from the stdlib to do it. All in all, it kind of sounds to me like what you *really* want is to make a user-level API for HTTP/2 applications. And maybe it would be a good idea to do that *first*, without reference to tweaking WSGI. That is, maybe go out and write a nice API with whatever bells and whistles you want to provide to apps, and just implement it for one or two specific front-end servers. *Then*, we would be able to look at a concrete API implementation and say, "okay, how can we make a simple protocol that allows this end user HTTP/2 API to exist, while being minimal for middleware and servers to support?" And finally, we could look at that protocol and say, "okay, can we encapsulate this protocol in such a way that it can be safely tunneled through WSGI 1?" Each of these stages has benefit. If you only get through the first, at least it's possible to do HTTP/2 in Python! If you get through the second, well, maybe it's not WSGI, but at least it's a protocol (SSGI? H2GI?). And so on. 
I guess what I'm saying is, based on what you seem to be trying to do, I think trying to update WSGI is *way* premature. Even WSGI wasn't proposed in a vacuum: it was based on looking at the APIs provided by existing Python-supporting web servers and required by existing Python web frameworks. So, in the absence of even *one* HTTP/2 framework API to drive the requirements, it's probably premature to propose paradigm shifts in WSGI itself. Does an HTTP/2 server or API for Python even *exist* yet? From pje at telecommunity.com Sat Sep 27 21:43:20 2014 From: pje at telecommunity.com (PJ Eby) Date: Sat, 27 Sep 2014 15:43:20 -0400 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Sat, Sep 27, 2014 at 2:55 PM, PJ Eby wrote: > On Sat, Sep 27, 2014 at 12:20 AM, Robert Collins wrote: >> Right now, anything providing the server profile has to cope with >> exceptions and translate those to 500 errors, so we have the variation >> of 'status and headers may not be provided'. Most middleware can be >> oblivious and delegate this to the server via bubble-up. I suspect the >> same would work for a default of 200 - 99% of middleware would ignore >> it and it would just work. However, I'm not super attached - it was >> just an idea. > > In the limit case, > >> >>>> So a classic example for Trailers is digitally signing streamed >>>> content. Using the same strawman API as above: >>>> >>>> def app(environ): >>>> yield {':status': '200} >>>> md5sum = md5.new() >>>> for bytes in block_reader(open('foo', 'rb'), 65536): >>>> md5sum.update(bytes) >>>> yield bytes >>>> digest = md5sum.hexdigest() >>>> signature = sign_bytes(digest.encode('utf8')) >>>> yield {'Content-MD5Sum': digest, 'X-Signature': signature} >>>> >>>> Note that this doesn't need to buffer or use a closure. (Oops. The above was some stuff I forgot to delete while editing.) 
From guido at python.org Sat Sep 27 23:34:22 2014 From: guido at python.org (Guido van Rossum) Date: Sat, 27 Sep 2014 14:34:22 -0700 Subject: [Web-SIG] WSGI: allowing short reads In-Reply-To: References: Message-ID: I am taking full responsibility for this inconsistency. The original read(n) used stdio's fread(), which reads exactly n bytes or until EOF, whichever comes first. The switch to 3.0 might have been a good time to fix this, but we didn't, and now it's too late. If I had to do it over again I would have read(n) return up to n bytes using at most 1 syscall, and readexactly(n) return n bytes or bust (raising EOFError if EOF is hit before n bytes are seen). This is what asyncio streams use and what I recommend for stream-like APIs that don't require strict compatibility. Note that read() without parameter and read(-1) are also special (reading everything until EOF). I think this is unambiguous and doesn't need to be fixed. On Sat, Sep 27, 2014 at 5:00 AM, Antoine Pitrou wrote: > > Hi, > > Robert Collins writes: > > > > https://github.com/python-web-sig/wsgi-ng/issues/5 > > > > tl;dr - we don't specify whether read(size) has to return size bytes > > or just not more than size, today. the IO library is clear that > > read(n) returns up to n, and also offers read1 that guarantees only > > one read call. > > I think you've got things mixed up. There are two different things in > "the IO library" (which is really the 3.x IO stack): > > * buffered binary I/O has read(n) and read1(n): > - read(n) will block until n bytes are received (except for non-blocking > fds) > - read1(n) will issue at most one system call and can return fewer than > n bytes > > * raw binary I/O only has read(n): > - read(n) will issue at most one system call and can return fewer than > n bytes > > So, depending on which layer you are placing yourself on, one or the > other of your statements is wrong. > > That said, it would be reasonable for WSGI to expose the raw I/O layer, > IMHO.
"Prettifying" libraries can wrap it inside a BufferedReader if > they like. > > Note that I once proposed generalized prefetching on I/O streams, but > it was rejected: > https://mail.python.org/pipermail/python-ideas/2010-September/008179.html > (skip to the prefetch() proposal) > > It was in the context of improving streamed unpickling, which is > a problem a bit similar - but less horrible - to JSON unserializing; > since then, the problem was solved in a different way by adding a > framing layer to pickle protocol 4 :-). > > Regards > > Antoine. -- --Guido van Rossum (python.org/~guido) From robertc at robertcollins.net Sat Sep 27 23:38:11 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sun, 28 Sep 2014 10:38:11 +1300 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: I think we're uncovering important assumptions / facts here. For clarity: I'm not interested in a nice API for HTTP/2. I want HTTP/2 and its full featureset to be *possible*, *efficient* and *clear* in a protocol that can replace WSGI - and do so with a fair chance of adoption. Ditto websockets. Neither is possible within WSGI today: the base protocol is insufficient, and every implementation of either HTTP/2 or Websockets for app writers only works by depending on extensions that don't meet the basic design principles - for instance exposing the actual server socket as an extension, which mod_wsgi cannot do. So, basic axioms I've been working from: * HTTP/2 cannot be tunnelled through HTTP/1: it can be downgraded, but not tunnelled. An HTTP/2->HTTP1.1->HTTP/2 chain is not capable of the same results as a straight HTTP/2 connection (or chain).
* This almost certainly applies to WSGI as well: WSGI2 -> WSGI1 -> WSGI2 will have to downgrade to WSGI1. Some things may be tunnelable [and we can try to do that], but the full set of features almost certainly cannot. From this I drew the proposal to do interop by providing an API [not protocol] that provides WSGI1 on the top and 2 on the bottom, and another that does the reverse: allowing folk to upgrade individual middleware piecemeal, and get the full benefits whenever they have a fully upgraded stack. E.g. leave upgrading debug middleware to the end. Perhaps this is misguided and implementors will reject such assistance? On 28 September 2014 07:55, PJ Eby wrote: > On Sat, Sep 27, 2014 at 12:20 AM, Robert Collins > wrote: >> We should capture these design principles somewhere FAQ-like, since >> many of the folk participating in this rework weren't part of the >> original design. > > A lot of it is in the PEP itself, albeit in ways that seem a lot more > obscure now, 10 years later, than they did at the time of writing. > It's also spread out among different parts, including the FAQ at the > end. :) - I am familiar with the PEP, so yeah, does feel a bit obscure :). Thank you for chiming in to reinforce them. > Any feature which is added solely to entice an end-consumer of WSGI > (vs. a framework or library implementer) is 100% wasted. I understand that argument, but... ... > If WSGI 2 adds features that users want, and library/framework > developers can reasonably add those features to *their* APIs, then > there is a chance that they will do so. But if they have to throw out > their whole existing paradigm to do that, or users have to abandon > their framework to adopt the WSGI 2 paradigm, then nothing was really > gained by the effort. libraries and frameworks exist for the same users. WSGI's ability to say 'and this is up to library/framework developers' is contingent on the protocol being *sufficient* for folk to do that.
I suspect a bunch of our discussions are going to end up being around whether specific changes are necessary or things libraries can do. > Basically, going after end users puts you in a "boil the ocean" > position. That is, a situation where you must more or less convince > everybody to change at the same time in order for the standard to > reach critical mass. I had hoped not, due to proposing that we provide an API [not protocol] for adapting between the protocols. That would exist solely to make implementors have an easier time bringing support in incrementally. So - I think you're misinterpreting my thrust as being 'after end users' - I'm not: I'm squarely focused on the implementation problems of server and middleware authors. > However, if you are *not* trying to boil the ocean by attracting end > users, then anything that you do to benefit them (at the expense of > framework, middleware, or server authors) is pure waste, since the > incremental strategy (that WSGI was based on in the first place) > doesn't depend on end-users using the raw WSGI protocol. As the PEP > itself explains: >... > If you replace "WSGI" with "WSGI 2" in the above, the rationale > remains unchanged. Sure. >>> The above API is cute and clean for the app writer, but for a >>> middleware writer it's a barrel of misery. *Every* piece of >>> middleware that even wants to *read* anything from the response (let >>> alone modify it), now needs to check types of yielded values, >>> accumulate headers, and maybe buffer content. And there are many ways >>> to write that middleware that will be wrong, but *appear* right >>> because the author didn't think of all the ways that an app could >>> violate the middleware author's assumptions. >> >> Hang on, why would they buffer content? Buffering response content is >> currently verboten, and I haven't seen any proposal to change that. I >> don't understand how phrasing the API as I suggested would lead to >> buffering being permitted or required. 
> > By "content" I was actually talking about the headers or other > metadata. Sorry for the confusion. No worries. Right now buffering of headers is required - the whole 'until the iterator returns a non-empty bytestring' bit - sure, I'd like to get rid of that. I still don't see a case where the generator based protocol would force buffering of headers [outside of the context of middleware that actually wants to buffer headers]. >> If its a method on the response body, the returning a list or >> generator no longer works, unless you start poking random attributes >> onto things. It would also be inconsistent - why would trailers be a >> method on the response, but headers be a dict in the return value? > > (FWIW, I never proposed making headers a dict. That's a bad idea, IMO.) Could you enlarge on that? There have been lots of [often security related] bugs in implementations of HTTP/1.x which were due to protocol handlers *not* treating the headers as dicts. Things like appending a header that cannot be repeated where in an N-tier deployed system the first layer consults the last header and the second layer consults the first. HTTP's header model could be modelled as {header: [value, ...]} or even more strictly as {header: value_or_list_value}. I'm going to guess and say 'a list is necessary, a dict isn't, and someone can write middleware to sanitise response headers' ? > As for returning a list or generator, I don't see why you can't do e.g. > > return status, headers, trailing_signature(body, ...) > > Where trailing_signature is a function that returns an iterator with > appropriate annotation, wrapping the original iterable. That works > whether body is a list or a generator or some other custom iterable. > > ("Poking random attributes onto things" isn't a requirement, IOW.) yield from in recent pythons could make that fairly efficient, ok. 
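A minimal sketch of the `trailing_signature` wrapper mentioned in the quoted text, assuming a functional-style app that returns `(status, headers, body)`. The `trailers` attribute, the `X-Body-SHA256` header name, and the choice of a SHA-256 digest are all hypothetical; the thread does not say how the annotation would be exposed.

```python
import hashlib


class trailing_signature:
    """Wrap any iterable body and compute a trailer while it streams.

    Hypothetical convention: `trailers` stays None until the body has been
    fully consumed, then holds a list of (name, value) pairs.
    """

    def __init__(self, body, header_name="X-Body-SHA256"):
        self._body = body
        self._header_name = header_name
        self._digest = hashlib.sha256()
        self.trailers = None

    def __iter__(self):
        for chunk in self._body:
            self._digest.update(chunk)
            yield chunk
        # Only known once the last chunk has gone out: a late-bound value.
        self.trailers = [(self._header_name, self._digest.hexdigest())]

    def close(self):
        # Forward close() to the wrapped iterable, as WSGI wrappers should.
        close = getattr(self._body, "close", None)
        if close is not None:
            close()


def app(environ):
    body = [b"hello ", b"world"]  # a list here, but a generator works too
    return "200 OK", [("Content-Type", "text/plain")], trailing_signature(body)
```

The body can be a list, a generator, or any other iterable, and nothing has to be attached to the original object.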
Still leaves the inconsistency between an immediate value for headers and a late bound value for trailers but perhaps thats ok. .. > Sure -- the existence of bytes is an obvious win, as is the dropping > of start_response. But if you want WSGI 2 to be *interoperable* with > WSGI 1, or more precisely, if we want to support *tunneling* WSGI 2 > through a WSGI 1 stack, then the design has to be at least somewhat > constrained by WSGI 1. Ok, so I don't think we *can* do that, and in fact I think we shouldn't. I think we *can* do the following: - make WSGI2 degrade to WSGI1 via an adapter - tunnel WSGI1 through WSGI2 I may be wrong, and if we're clever enough - great. OTOH some of the changes we're discussing - like getting rid of start_response and making bidirectional channels possible - are pretty fundamentally different to WSGI1, and I'd be worried about a protocol that requires middleware authors to write to *both* WSGI1 and WSGI2 at the same time. I think thats an unnecessary burden and one that will hinder adoption. > So, I don't see a problem with creating a response object per se. I > was just thinking that with middleware, you really want to be able to > mix and match what features are being returned with the response, so > unless you use `__getattr__` proxying, or it's required that response > objects allow arbitrary attributes to be added, then the paradigm "bag > of related features in a dictionary" better fits the requirement than > "return an object". Ok. >>> So, let's trim the sharp edges for the poor middleware and server >>> developers, rather than polishing the bits that app writers aren't >>> going to be using, anyway. (Since most of them are going to be using >>> Django, Pyramid, Flask, or whatever the latest hotness is, anyway.) >> >> Do you have a hitlist of such sharp edges you'd like to see catered >> for in this new spec? > > The ones described in the wsgi_lite docs: > > 1. People forgetting that the environ is volatile > 2. 
People forgetting to close() > 3. The horror that is the stateful nature of the current protocol (all > the rules on what can be called when) > > In wsgi_lite I addressed #1 by providing the binding protocol to map > desired request data to keyword arguments. #2, by the "closing" > extension, and #3 by switching to a functional paradigm rather than an > imperative one. (Thus eliminating any rules on what can be called > when, because the response is a return value, not an invocation of > something.) Has wsgi_lite been picked up by server and middleware authors? Do we have any feedback on how well its working? > All in all, it kind of sounds to me like what you *really* want is to > make a user-level API for HTTP/2 applications. And maybe it would be > a good idea to do that *first*, without reference to tweaking WSGI. So, my personal driver is that I have multiple use cases, most but not all of which are end user use cases, that depend on HTTP/2 // will benefit from HTTP/2. A user level API is certainly a thing that will need to exist, but all the servers around so far are just degrading HTTP/2 to WSGI - the lingua franca. One perhaps unintended consequence of WSGI is that its become that lingua franca, and many things are internally structured around middleware stacks :). So the first thing that needs to be done is a WSGI like thing and internal code shuffling. You're right though that more implementor experience would be good - I'm hoping do be doing that on the basis of drafts and discussion. ... > And finally, we could look at that protocol and say, "okay, can we > encapsulate this protocol in such a way that it can be safely tunneled > through WSGI 1?" If it can :). > Each of these stages has benefit. If you only get through the first, > at least it's possible to do HTTP/2 in Python! If you get through the > second, well, maybe it's not WSGI, but at least it's a protocol (SSGI? > H2GI?). And so on. 
> I guess what I'm saying is, based on what you seem to be trying to do, > I think trying to update WSGI is *way* premature. Even WSGI wasn't > proposed in a vacuum: it was based on looking at the APIs provided by > existing Python-supporting web servers and required by existing Python > web frameworks. So, in the absence of even *one* HTTP/2 framework API > to drive the requirements, it's probably premature to propose paradigm > shifts in WSGI itself. So, there are multiple examples of websockets today, which share much in common with HTTP/2. All of them require server support, and tunnel through WSGI in ways that are liable to break (e.g. a middleware that remotes objects will almost certainly fail to handle the raw socket). > Does an HTTP/2 server or API for Python even *exist* yet? Yes. http://nghttp2.org/documentation/package_README.html#python-bindings The model is of a handler class, and four events - headers, data, request fully received, stream closed. It supports push, but in a way that prevents implementing a notification server such as https://tools.ietf.org/html/draft-thomson-webpush-http2-00 specifies. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From robertc at robertcollins.net Sat Sep 27 23:43:33 2014 From: robertc at robertcollins.net (Robert Collins) Date: Sun, 28 Sep 2014 10:43:33 +1300 Subject: [Web-SIG] WSGI: allowing short reads In-Reply-To: References: Message-ID: On 28 September 2014 00:00, Antoine Pitrou wrote: > > Hi, > > Robert Collins writes: >> >> https://github.com/python-web-sig/wsgi-ng/issues/5 >> >> tl;dr - we don't specify whether read(size) has to return size bytes >> or just not more than size, today. the IO library is clear that >> read(n) returns up to n, and also offers read1 that guarantees only >> one read call. > > I think you've got things mixed up. 
There are two different things in > "the IO library" (which is really the 3.x IO stack): > > * buffered binary I/O has read(n) and read1(n): > - read(n) will block until n bytes are received (except for non-blocking > fds) > - read1(n) will issue at most one system call and can return fewer than > n bytes > > * raw binary I/O only has read(n): > - read(n) will issue at most one system call and can return fewer than > n bytes > > So, depending on which layer you are placing yourself on, one or the > other of your statements is wrong. Ugh. Thanks! > That said, it would be reasonable for WSGI to expose the raw I/O layer, > IMHO. "Prettifying" libraries can wrap it inside a BufferedReader if > they like. We often can't expose the real underlying socket, but having the semantics of raw binary I/O for the file-like things we do expose makes more sense to me. > Note that I once proposed generalized prefetching on I/O streams, but > it was rejected: > https://mail.python.org/pipermail/python-ideas/2010-September/008179.html > (skip to the prefetch() proposal) > > It was in the context of improving streamed unpickling, which is > a problem a bit similar - but less horrible - to JSON unserializing; > since then, the problem was solved in a different way by adding a > framing layer to pickle protocol 4 :-). BufferedReader solves this well though, doesn't it? -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From solipsis at pitrou.net Sat Sep 27 23:45:32 2014 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 27 Sep 2014 23:45:32 +0200 Subject: [Web-SIG] WSGI: allowing short reads References: Message-ID: <20140927234532.4719f437@fsol> On Sun, 28 Sep 2014 10:43:33 +1300 Robert Collins wrote: > > > > It was in the context of improving streamed unpickling, which is > > a problem a bit similar - but less horrible - to JSON unserializing; > > since then, the problem was solved in a different way by adding a > > framing layer to pickle protocol 4 :-). 
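The difference between raw and buffered reads described earlier in this thread can be checked with the standard `io` module. In this sketch `ChunkedRaw` is a made-up raw stream that hands back at most three bytes per call, standing in for a socket that delivers data in small pieces; it is an illustration, not anything a WSGI server provides.

```python
import io


class ChunkedRaw(io.RawIOBase):
    """Hypothetical raw stream: each readinto() call yields at most `chunk` bytes."""

    def __init__(self, data, chunk=3):
        super().__init__()
        self._data = data
        self._pos = 0
        self._chunk = chunk

    def readable(self):
        return True

    def readinto(self, b):
        # Copy at most `chunk` bytes into the caller's buffer; 0 means EOF.
        n = min(len(b), self._chunk, len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


payload = b"abcdefghij"

# Raw layer: read(n) is a single underlying call and may return fewer bytes.
raw_result = ChunkedRaw(payload).read(8)

# Buffered layer: read(n) keeps calling the raw stream until it has n bytes
# (or hits EOF), while read1(n) makes at most one raw call.
buffered_result = io.BufferedReader(ChunkedRaw(payload)).read(8)
read1_result = io.BufferedReader(ChunkedRaw(payload)).read1(8)

print(raw_result, buffered_result, read1_result)
```

A spec that promises raw-layer semantics lets callers wrap the stream in `io.BufferedReader` when they want the "fill the whole request" behaviour.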
> > BufferedReader solves this well though, doesn't it? Assuming it's the only one accessing the raw stream, yes. Regards Antoine. From pje at telecommunity.com Sun Sep 28 21:32:19 2014 From: pje at telecommunity.com (PJ Eby) Date: Sun, 28 Sep 2014 15:32:19 -0400 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Sat, Sep 27, 2014 at 5:38 PM, Robert Collins wrote: > I think we're uncovering important assumptions / facts here. Indeed! > For clarity: I'm not interested in a nice API for HTTP/2. I want > HTTP/2 and its full featureset to be *possible*, *efficient* and > *clear* in a protocol that can replace WSGI - and do so with a fair > chance of adoption. Cool. Then my suggestion would be: don't use WSGI as a basis for designing that protocol. Start with something that's a natural fit for the HTTP/2 model, which -- from what I can tell so far -- is nothing like WSGI's simple request/response model. > Ditto websockets. Neither is possible within WSGI > today: the base protocol is insufficient, and every implementation of > either HTTP/2 or Websockets for app writers only works by depending on > extensions that don't meet the basic design principles - for instance > exposing the actual server socket as an extension, which mod_wsgi > cannot do. Right. I do think it might be worthwhile creating a spec for how to create safe "middleware-bypassing" and "rich object" server extensions within WSGI, to allow limited use of HTTP/2 features. > * This almost certainly applies to WSGI as well: WSGI2 -> WSGI1 -> > WSGI2 will have to downgrade to WSGI1. Some things may be tunnelable > [and we can try to do that], but the full set of features almost > certainly cannot. That depends on what you mean by "WSGI2". I think an HTTP/2 gateway API is a different animal than "WSGI2" per se. I think there may be room for a request/response WSGI2, distinct from a Python HTTP/2 API, and (mostly) interoperable with WSGI 1. 
That doesn't mean that the HTTP/2 API might not win over the market and supplant WSGI1/2, I'm just not convinced that it should be positioned as WSGI's successor. (At least, not until I've seen it... ;-) ) > From this I drew the proposal to do interop by providing an API [not > protocol] that provides WSGI1 on the top and 2 on the bottom, and > another that does the reverse: allowing folk to upgrade individual > middleware piecemeal, and get the full benefits whenever they have a > fully upgraded stack. E.g. leave upgrading debug middleware to the > end. Perhaps this is misguided and implementors will reject such > assistance? My suggestion would be to make a good HTTP/2 API without any WSGI legacy, and then develop a set of middleware-safe server extensions to provide HTTP 2 features on WSGI 1. Here's an idea about how you can safely do that, for trailers, push, and even websockets: 1. Define a server extension that accepts metadata or callbacks, and returns a string (or array of strings if the extension applies to the body) 2. To activate the effect, the app puts the string in a header (e.g. "Content-Type: application/x-wsgi-rich-body; id=sfdfs876654") and returns it in the body as well (e.g. ['sfdfs876654']) 3. If the header or body string reaches the origin server, apply the metadata or invoke the callback(s) 4. If it doesn't, use the response the middleware provided instead 5. Discard all registered metadata or callbacks upon completion of the request This model can be used for: * Websockets - register a callback that receives the websocket, which will be run in place of the middleware response * Trailers - register a callback to generate the body and trailers * Associated content - register metadata to push the content, listed as header strings Heck -- you can create a *generalized* escape path to allow a HTTP/2 app API instead of doing one-off protocols like this. 
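The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `EscapeRegistry`, `register`, `lookup` and `discard_all` are hypothetical names, the tag is a random hex string, and the body is compared as bytes (as Python 3 WSGI would require) rather than the str shown in the example above.

```python
import uuid

RICH_BODY_TYPE = "application/x-wsgi-rich-body"


class EscapeRegistry:
    """Per-request store mapping tag strings to callbacks (hypothetical)."""

    def __init__(self):
        self._callbacks = {}

    def register(self, callback):
        # Step 1: accept a callback, hand back an opaque tag string.
        tag = uuid.uuid4().hex
        self._callbacks[tag] = callback
        return tag

    def lookup(self, headers, body_chunks):
        # Steps 3 and 4: return the callback only if the tag survived the
        # middleware stack in both the header and the body.
        for name, value in headers:
            if name.lower() != "content-type":
                continue
            media_type, _, param = value.partition(";")
            if media_type.strip() != RICH_BODY_TYPE:
                continue
            key, _, tag = param.strip().partition("=")
            if key == "id" and list(body_chunks) == [tag.encode("ascii")]:
                return self._callbacks.get(tag)
        return None  # middleware replaced the response: send it as-is

    def discard_all(self):
        # Step 5: nothing registered outlives the request.
        self._callbacks.clear()


registry = EscapeRegistry()


def send_trailers():
    return "trailers sent"


# Step 2: the app puts the tag in a header and returns it as the body.
tag = registry.register(send_trailers)
headers = [("Content-Type", "%s; id=%s" % (RICH_BODY_TYPE, tag))]
body = [tag.encode("ascii")]

untouched = registry.lookup(headers, body)
replaced = registry.lookup([("Content-Type", "text/html")], [b"<p>error</p>"])
registry.discard_all()
after_request = registry.lookup(headers, body)
```

Because the tag has to survive in both the header and the body, middleware that rewrites either one silently falls back to the ordinary response.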
Imagine this decorator:

    def http2_under_wsgi(http2_app):
        def wrapped(environ, start_response):
            try:
                # server extension that hands the app to the HTTP/2 API
                upgrade = environ['http2.upgrade']
            except KeyError:
                raise RuntimeError("HTTP/2 API not available")
            return upgrade(http2_app, start_response)
        return wrapped    # the decorator must hand back the WSGI 1 callable

    @http2_under_wsgi
    def my_http2_app(...):
        yield some.thing(...) ???   # whatever the super cool HTTP2 API does

Now, you just use @http2_under_wsgi as a wrapper to convert an HTTP/2
app to a WSGI 1 app.  The server environment just invokes
start_response with a special status, headers, and return body, which
contain tag strings registered to the given `http2_app`.  In order to
actually handle the request, it looks up the tag it gets (since
middleware could be running multiple subrequests) to find the
http2_app it's going to run.  It then runs that app under the HTTP/2
API.

This model lets you run most of your app under plain WSGI, with
escapes as necessary, and even allows WSGI middleware for routing,
authentication, and other pre-processing; you just can't use
response-altering middleware.  In addition, you can write WSGI 1
middleware that still intercepts the HTTP/2 API by replacing the
`http2.upgrade` key and wrapping the apps being passed up to the
server extension.

I hope this helps to explain why I don't think you should try to use
WSGI 1 as a basis for HTTP/2.  You can and should bypass it
altogether, especially since that can be done in a way that lets ANY
existing WSGI 1 app framework "escape" to full HTTP/2, where
available.  And then, HTTP/2 needn't be burdened by any of the many
compromises and legacy cruft of WSGI and its CGI heritage.  You'll
still need an implementation of WSGI *in terms of* the HTTP/2 API,
plus the "escape" hook, but I don't see any reason why HTTP/2 *needs*
to be even remotely WSGI-like.

>> (FWIW, I never proposed making headers a dict.  That's a bad idea, IMO.)
>
> Could you enlarge on that?
There have been lots of [often security > related] bugs in implementations of HTTP/1.x which were due to > protocol handlers *not* treating the headers as dicts. Things like > appending a header that cannot be repeated where in an N-tier deployed > system the first layer consults the last header and the second layer > consults the first. HTTP's header model could be modelled as > {header: [value, ...]} or even more strictly as {header: > value_or_list_value}. I'm going to guess and say 'a list is necessary, > a dict isn't, and someone can write middleware to sanitise response > headers' ? Actually, the reason is that one of the WSGI design principles is that it tries to stay as close as possible to the wire protocol it was based on. HTTP/1 headers are a series of lines, so WSGI headers are a series of lines. If some browser crashes when you put the headers in the "wrong" order (from its perspective), then WSGI should not create any obstacles to sending them in the "right" order (i.e., the one that doesn't make e.g. IE crash). (I'm not saying that such an issue actually exists/existed, just that a list was chosen based on the principle that WSGI should give the app as much control over the output stream as possible. The stream blocking and timing requirements exist for the same reason.) > Has wsgi_lite been picked up by server and middleware authors? Do we > have any feedback on how well its working? Nope, I never got around to promoting it, apart from a blog post or two introducing the idea a few years ago. ;-) > So, there are multiple examples of websockets today, which share much > in common with HTTP/2. All of them require server support, and tunnel > through WSGI in ways that are liable to break (e.g. a middleware that > remotes objects will almost certainly fail to handle the raw socket). So we should definitely fix that, by defining a safe "rich server API upgrade" escape for WSGI. Hm.... 
maybe your new API should be the "Rich Server Gateway Interface", or RSGI -- pronounced "risky". ;-) Anyway, "upgrade escapes" are a generic concept, and we can define that independently of *what* API you upgrade to, so that might be a good idea to work on soon, as it could be used for websockets and the like today, as a standardized WSGI extension. >> Does an HTTP/2 server or API for Python even *exist* yet? > > Yes. http://nghttp2.org/documentation/package_README.html#python-bindings > > The model is of a handler class, and four events - headers, data, > request fully received, stream closed. It supports push, but in a way > that prevents implementing a notification server such as > https://tools.ietf.org/html/draft-thomson-webpush-http2-00 specifies. This looks like a fairly reasonable approach to an API. Given that we'll still have WSGI for simple cases, I don't see an issue with RSGI having an event-driven model with various APIs going in both directions. But I'll probably bow out of most discussions about defining RSGI unless I see something that relates to "lessons learned" in WSGI. I worry a little that a RSGI design is still premature, given only ONE Python API, but if we have rich escapes in WSGI, then there will be room for servers to develop experimental HTTP/2 APIs that can then form a basis for RSGI later. Yeah, that really looks like the way forward: define a safe way to escape WSGI from inside of it, so that server developers aren't forced to dumb down HTTP/2 to WSGI, in order to provide rich HTTP/2 APIs. What do you think? From robertc at robertcollins.net Mon Sep 29 04:09:39 2014 From: robertc at robertcollins.net (Robert Collins) Date: Mon, 29 Sep 2014 15:09:39 +1300 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On 29 September 2014 08:32, PJ Eby wrote: > On Sat, Sep 27, 2014 at 5:38 PM, Robert Collins > wrote: >> I think we're uncovering important assumptions / facts here. > > Indeed! 
> > >> For clarity: I'm not interested in a nice API for HTTP/2. I want >> HTTP/2 and its full featureset to be *possible*, *efficient* and >> *clear* in a protocol that can replace WSGI - and do so with a fair >> chance of adoption. > > Cool. Then my suggestion would be: don't use WSGI as a basis for > designing that protocol. Start with something that's a natural fit > for the HTTP/2 model, which -- from what I can tell so far -- is > nothing like WSGI's simple request/response model. Thats a fair point. I have not been constrained by WSGI today in thinking about this - but since this effort is about updating the standard folk write to, for web server -> gateway -> app plumbing in Python, WSGI is, for better or worse, the touchstone folk have. WSGI's simple request/response model has been unable to fully handle the modern web since HTTP/1.1 came out (AFAIK none of the gateways have managed to make chunked uploads work right, and trailers are not supported). Thats not a bad thing about WSGI, and putting the union of requirements into a spec can make it unwieldy (see RFC2616 for a classic example :)) - but while we have a lot of frameworks that are a) composed of WSGI adapters and b) WSGI on the top, or WSGI on the bottom, they're not following WSGI all that precisely, because WSGI is too restrictive. And the web has moved on with Websockets in 2011 and HTTP/2 any day now. I think its clear from the broad interest we've got that folk are interested in a new spec. Whats in a name? We could call it something else. RSGI as you humourously suggest, or we can call it WSGI. I think WSGI is the right name, because I don't think we want to aim for a situation where folk writing new servers write both WSGI and $NEWTHING support. They should be able to pick one, write to it well, and have their users choose to downgrade the environment if they have legacy things that are not yet upgraded. 
So - I'm going to keep drafting this as WSGI2, unless there is consensus here that the name should be different. > Right. I do think it might be worthwhile creating a spec for how to > create safe "middleware-bypassing" and "rich object" server extensions > within WSGI, to allow limited use of HTTP/2 features. That might be an interesting thing; I have no real interest in writing it at this point: my intent is to provide a new thing, which may be very similar, or may be strikingly different - thats what this SIG will come up with - which can contain WSGI1 middleware safely via some adapter. I don't have interest in writing a 'do HTTP/2 features from within WSGI1' effort, because I think its a lot of work for little-if-any-gain: we have to have servers that can speak the new wire protocols before we can use the new features, and that means the top of our stack will be $NEWTHING anyway. There are some exceptions, such as the mod_spdy hack to tunnel awareness of to-push resources, but its not clear that that will do the right thing in all circumstances with oblivious middleware, no matter how its spelt in code. [Because, whatever failure mode we choose by default, some middleware will want the other one - at which point its not oblivious, and may as well just be upgraded]. >> * This almost certainly applies to WSGI as well: WSGI2 -> WSGI1 -> >> WSGI2 will have to downgrade to WSGI1. Some things may be tunnelable >> [and we can try to do that], but the full set of features almost >> certainly cannot. > > That depends on what you mean by "WSGI2". I think an HTTP/2 gateway > API is a different animal than "WSGI2" per se. I think there may be > room for a request/response WSGI2, distinct from a Python HTTP/2 API, > and (mostly) interoperable with WSGI 1. That doesn't mean that the > HTTP/2 API might not win over the market and supplant WSGI1/2, I'm > just not convinced that it should be positioned as WSGI's successor. > (At least, not until I've seen it... 
;-) )

That's fair enough, but in the absence of a better name - and see
above: having server and middleware authors only need to care about
one protocol is a key design point - I think calling it WSGI2 is
better than calling it something new. If it's going to make the
discussion hard, I'm ok calling it e.g. NNGI (no name gateway
interface) until we're done.

>
>> From this I drew the proposal to do interop by providing an API [not
>> protocol] that provides WSGI1 on the top and 2 on the bottom, and
>> another that does the reverse: allowing folk to upgrade individual
>> middleware piecemeal, and get the full benefits whenever they have a
>> fully upgraded stack. E.g. leave upgrading debug middleware to the
>> end. Perhaps this is misguided and implementors will reject such
>> assistance?
>
> My suggestion would be to make a good HTTP/2 API without any WSGI
> legacy, and then develop a set of middleware-safe server extensions to
> provide HTTP 2 features on WSGI 1.  Here's an idea about how you can
> safely do that, for trailers, push, and even websockets:

Your adapter sketch there is a useful escape hatch approach. It may
have some use. However the downside is that it's going to break on a
lot of middleware.

The approach I'm thinking of is more:

    def wsgi2_under_wsgi(app):
        def converter(environ, start_response):
            if is_really_wsgi2(environ):
                # fast path to detect things that were wrapped unnecessarily
                return app(environ)
            # <... stuff oh my gosh stuff to convert the protocols,
            #  downgrading all features>
        return converter

    def wsgi_under_wsgi2(app):
        # export a WSGI2 server as a WSGI1 server
        def start_response(status, headers):
            try:
                # and so on
        def converted_environ(environ):
            ....
            # include a marker in here to let wsgi2_under_wsgi fast-path it
        def wsgi2_to_wsgi(environ):
            return app(converted_environ(environ), start_response)
        return wsgi2_to_wsgi

Really, I think we're agreeing on 95% here, but I'm biasing for having
a majority of WSGI2 eventually, whereas you seem to be biasing for
having a majority of WSGI indefinitely.

The reason I want to bias for the long term is that it will be with us
for a long while. We need to make incremental deployment easy - and
that may well mean tunnelling some things. The role of the spec here
though is to define the protocol by which folk can write tunnellers
*after* we get the thing working. Perhaps that's exactly what you
mean: decouple an HTTP/2+Websockets+HTTP/1.x protocol from tunnelling
new features through WSGI for legacy deployments. If that's what you
mean, then I agree - and that's what I'm working on :)

>>> (FWIW, I never proposed making headers a dict.  That's a bad idea, IMO.)
>>
>> Could you enlarge on that? There have been lots of [often security
>> related] bugs in implementations of HTTP/1.x which were due to
>> protocol handlers *not* treating the headers as dicts. Things like
>> appending a header that cannot be repeated where in an N-tier deployed
>> system the first layer consults the last header and the second layer
>> consults the first. HTTP's header model could be modelled as
>> {header: [value, ...]} or even more strictly as {header:
>> value_or_list_value}. I'm going to guess and say 'a list is necessary,
>> a dict isn't, and someone can write middleware to sanitise response
>> headers' ?
>
> Actually, the reason is that one of the WSGI design principles is that
> it tries to stay as close as possible to the wire protocol it was
> based on.  HTTP/1 headers are a series of lines, so WSGI headers are a
> series of lines.
If some browser crashes when you put the headers in > the "wrong" order (from its perspective), then WSGI should not create > any obstacles to sending them in the "right" order (i.e., the one that > doesn't make e.g. IE crash). > > (I'm not saying that such an issue actually exists/existed, just that > a list was chosen based on the principle that WSGI should give the app > as much control over the output stream as possible. The stream > blocking and timing requirements exist for the same reason.) Ok, so implementor experience in the wild has taught us that this is a bad idea ;). http://tools.ietf.org/html/rfc7230#section-3.2.2 - the protocol defines wire order of headers as undefined, except that the relative order of a) list headers and b) set-cookie needs to be preserved. {headername: [value, ...]} is a superset that would model this but be a lot harder to get wrong for middleware. The IE crashing scenario is not one I'm worried about because intermediaries like Squid and Apache have been normalising and altering headers for a /very/ long time. The websockets RFC explicitly permits arbitrary orders (after we had a long discussion about HTTP semantics during the spec process :)). >> So, there are multiple examples of websockets today, which share much >> in common with HTTP/2. All of them require server support, and tunnel >> through WSGI in ways that are liable to break (e.g. a middleware that >> remotes objects will almost certainly fail to handle the raw socket). > > So we should definitely fix that, by defining a safe "rich server API > upgrade" escape for WSGI. Hm.... maybe your new API should be the > "Rich Server Gateway Interface", or RSGI -- pronounced "risky". ;-) So that would/might address the breakiness but it wouldn't standardise the upgraded protocol(s) - the network effect is where the value is: folk can work around bugs on a case by case basis. >>> Does an HTTP/2 server or API for Python even *exist* yet? >> >> Yes. 
http://nghttp2.org/documentation/package_README.html#python-bindings >> >> The model is of a handler class, and four events - headers, data, >> request fully received, stream closed. It supports push, but in a way >> that prevents implementing a notification server such as >> https://tools.ietf.org/html/draft-thomson-webpush-http2-00 specifies. > > This looks like a fairly reasonable approach to an API. Given that > we'll still have WSGI for simple cases, I don't see an issue with RSGI > having an event-driven model with various APIs going in both > directions. But I'll probably bow out of most discussions about > defining RSGI unless I see something that relates to "lessons learned" > in WSGI. I worry a little that a RSGI design is still premature, > given only ONE Python API, but if we have rich escapes in WSGI, then > there will be room for servers to develop experimental HTTP/2 APIs > that can then form a basis for RSGI later. > > Yeah, that really looks like the way forward: define a safe way to > escape WSGI from inside of it, so that server developers aren't forced > to dumb down HTTP/2 to WSGI, in order to provide rich HTTP/2 APIs. > What do you think? I worry that that leaves us with a lingua franca which we're expecting everyone to escape from. That doesn't seem like a great place to aim it. It would be equivalent to HTTP/2 requiring HTTP/1 on all connections and then working well after that. What HTTP/2 has done instead is to define both a no-overhead direct-to-HTTP/2 handshake, *and* an upgrade handshake. Doesn't matter which you use - but the direct one (TLS/ALPN) takes fewer round trips vs TLS + HTTP/1 + upgrade and is more secure vs HTTP/1. [Encryption isn't a hard requirement of HTTP/2... but a number of big browser vendors have said they won't implement the non-encryption codepath. So it is an effective requirement outside of the plumbing of webapps.] There have been enough ideas put forward in this thread that I need to sit down and do some experiments.
I want to try out context managers as a replacement for the close idiom, I want to try pure generator based APIs a little. I'd very much appreciate specific examples of middleware that you believe are representative of the sorts of issues folk will encounter, so that I can compare and contrast the implications of different design decisions on them. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From frank at chagford.com Mon Sep 29 15:26:22 2014 From: frank at chagford.com (Frank Millman) Date: Mon, 29 Sep 2014 15:26:22 +0200 Subject: [Web-SIG] Combine wsgi and asyncio - possible? Message-ID: <66125E21D652445B8489D72130207E01@frank> Hi all I am developing a business/accounting application. It is not a web server in the conventional sense, but it uses http, and clients connect to it via a web browser. The server responds to an initial connection by sending a block of javascript which uses on_load() to display a welcome page. After that, all communication is handled by ajax-style messages passed between server and client. At no point is a new page requested or reloaded. My first attempt was written using the wsgi protocol and the cherrypy wsgi server. It worked fine, but things progressed. My second version is written using python 3.4 and asyncio. In theory it could also use wsgi, but I use a particular technique which I think is incompatible with wsgi, so I handle the requests directly. It works in my testing environment, but in the real world users will not want to deploy my app as a stand-alone http server, so I need to find a better solution. I would like to explain the technique I am using and the reason for it. Perhaps someone can suggest how this can be done in a wsgi-compliant manner. Communication between client and server follows a typical gui event loop. The client waits for a user action, then sends a message to the server with the information. 
The server processes the information and sends a response, which the client receives and redisplays to the user. Sometimes one user action generates more than one 'event'. I package these up into a list and send them to the server in one message. As the server processes the events, it can result in multiple responses to the client. These are also packaged up and sent in one message. However, it can happen that while the server is working through the events, it needs to send a message to the client to pop up a dialog box, ask a question, and get a response before proceeding. With asyncio I can create a Future to set up and send the message, and use 'yield from asyncio.wait_for(...)' to wait for the response. I am using asyncio.start_server(). For each request, the handler is passed a client_reader and a client_writer. Normally the writer is used to write the response to the original request, but if I need to ask a question, I use the writer to send the 'dialog box' message. When I get the response, I take the new client_writer and pass it back to the original request handler for it to complete the request. As I understand it, wsgi requires me to actually 'return' the response, so I don't have the opportunity to call 'yield from', and I do not get access to the writer object. Any suggestions welcome. Frank Millman From pje at telecommunity.com Mon Sep 29 21:27:16 2014 From: pje at telecommunity.com (PJ Eby) Date: Mon, 29 Sep 2014 15:27:16 -0400 Subject: [Web-SIG] WSGI2: write callable? In-Reply-To: References: Message-ID: On Sun, Sep 28, 2014 at 10:09 PM, Robert Collins wrote: > On 29 September 2014 08:32, PJ Eby wrote: >> On Sat, Sep 27, 2014 at 5:38 PM, Robert Collins >> wrote: >>> I think we're uncovering important assumptions / facts here. >> >> Indeed! >> >> >>> For clarity: I'm not interested in a nice API for HTTP/2. 
I want >>> HTTP/2 and its full featureset to be *possible*, *efficient* and >>> *clear* in a protocol that can replace WSGI - and do so with a fair >>> chance of adoption. >> >> Cool. Then my suggestion would be: don't use WSGI as a basis for >> designing that protocol. Start with something that's a natural fit >> for the HTTP/2 model, which -- from what I can tell so far -- is >> nothing like WSGI's simple request/response model. > > Thats a fair point. I have not been constrained by WSGI today in > thinking about this - but since this effort is about updating the > standard folk write to, for web server -> gateway -> app plumbing in > Python, WSGI is, for better or worse, the touchstone folk have. > > WSGI's simple request/response model has been unable to fully handle > the modern web since HTTP/1.1 came out (AFAIK none of the gateways > have managed to make chunked uploads work right, and trailers are not > supported). Thats not a bad thing about WSGI, and putting the union of > requirements into a spec can make it unwieldy (see RFC2616 for a > classic example :)) - but while we have a lot of frameworks that are > a) composed of WSGI adapters and b) WSGI on the top, or WSGI on the > bottom, they're not following WSGI all that precisely, because WSGI is > too restrictive. And the web has moved on with Websockets in 2011 and > HTTP/2 any day now. > > I think its clear from the broad interest we've got that folk are > interested in a new spec. > > Whats in a name? > > We could call it something else. RSGI as you humourously suggest, or > we can call it WSGI. > > I think WSGI is the right name, because I don't think we want to aim > for a situation where folk writing new servers write both WSGI and > $NEWTHING support On the contrary, that is *precisely* what we want -- though only in the short run. Why? Because there is not currently any body of implementation experience with HTTP/2 APIs that can be drawn from to create a stable specification. 
To create a solid spec, one that doesn't lock prematurely into an API that *seems* usable but actually isn't, it's necessary to have some real API usage experience. Which, if I have understood you correctly in this thread, is essentially nonexistent. So an API specification at this point would be more like an API *speculation*. Ideas about what APIs might or might not be useful or attractive. So, rather than filling in $NEWTHING with a specific value, we want $NEWTHING to be a *wildcard*, consisting of whatever HTTP/2 APIs you and others can dream up -- or have already implemented. One of two things will then happen: either one particular $NEWTHING catches on to the point of being a clear winner (and therefore basis for a standard), *or* it will then be possible to deduce a common-denominator low-level protocol common to the various $NEWTHINGs, which will then become the official New Thing. Either way, de facto beats de jure, where standards are concerned. > So - I'm going to keep drafting this as WSGI2, unless there is > consensus here that the name should be different. So - I'm going to state plainly my strong opposition to calling any protocol WSGI that is not a straightforward evolution within the same request-response paradigm we have to date, because it's just asking for confusion. The state of things is already confused enough, without having two completely different paradigms assigned the same name. It's bad communications, bad marketing, and bad engineering. I would much rather see WSGI be ultimately replaced entirely by its "rich" (and hopefully cleaner) successor, following an appropriate transition period. And during that transition period, WSGI will serve as the "simple use cases" API, with escapes allowed to richer APIs, as a prelude to developing the successor spec. > Your adapter sketch there is a useful escape hatch approach. It may > have some use. However the downside is that its going to break on a > lot of middleware. 
Actually, it should be completely safe for middleware that doesn't touch responses (or consume the wsgi.input), and for middleware that does touch responses (e.g. to redirect to a login page) either hasn't invoked the wrapped app at all, or is choosing to replace the response entirely. So I don't actually see where it's going to break much, if at all. > Really, I think we're agreeing on 95% here, but I'm biasing for having > a majority of WSGI2 eventually, whereas you seem to be biasing for > having a majority of WSGI indefinitely. If you had, let's say three reasonably independent-of-each-other HTTP/2 Python server API implementations right now, then I'd say it'd be worthwhile starting on a RSGI spec. However, since you've pointed out only one, and noted it has paradigmatic limitations with respect to ongoing developments in HTTP/2, the attempt to develop a de jure specification seems not just premature, but extremely so. If you want a solid Web-SIG consensus on something, I suggest that a WSGI escape mechanism suitable for using both that HTTP/2 API and some websocket APIs would be a much better bet. I think that with the participation of enough server developers, we could nail down a way to let WSGI apps escape to *any* "native" server API, be it Twisted, HTTP/2, tulip, or whatever, and get it blessed as a standard WSGI extension pretty quickly. > The reason I want to bias for > the long term, is that it will be with us for a long while. We need to > make incremental deployment easy - and that may well mean tunnelling > some things. There's no need to tunnel if you can bypass -- which ironically enough, is what people already have started doing with websockets, and which some of your own proposals have been about doing as well. That is to say, you've actually convinced me that middleware bypass is a *good* thing, for those things that cannot be sanely shoehorned into the request/response paradigm. 
Therefore, ISTM that by far the easiest way to "make incremental deployment easy" is to provide a way to escape WSGI and revert to a server's native API, whatever that API may be. It cuts out the following steps: * Defining an API right away, based on insufficient usage experience * Translating from a server's native API, to the defined API * Translating *again*, from the defined API to current WSGI While enabling the possibility for having *competition* among API paradigms. It also means that framework developers need not concern themselves with the details of any new API, apart from possibly documenting any special things their users need to know about escaping to a native server API. > The role of the spec here though is to define the > protocol by which folk can write tunnellers *after* we get the thing > working. Perhaps thats exactly what you mean: decouple an > HTTP/2+Websockets+HTTP/1.x protocol from tunnelling new features > through WSGI for legacy deployments. If thats what you mean, then I > agree - and thats what I'm working on :) Yes and no. I'm saying that we should work immediately on providing an *escape vector* for WSGI apps to say, "I want to jump out of WSGI and instead use a richer API for handling this request"... ...so that you can begin *developing* the new thing in an environment where it's easy to experiment with different APIs. In essence, the app provides a response that says, "if you (the server) get this response, then invoke the code I passed you earlier, in order to begin the *real* response." If the server doesn't see such a response, it's treated as a normal WSGI response, and any passed callbacks are discarded. (Because it means it was from a discarded or modified sub-request.) > Ok, so implementor experience in the wild has taught us that this is a > bad idea ;). Yes -- *sequencing headers* is a bad idea. Allowing apps *control over what goes over the wire* was not. 
;-) That being said, I don't have any issue with dicts being used for headers in non-WSGI APIs. > So that would/might address the breakiness but it wouldn't standardise > the upgraded protocol(s) - the network effect is where the value is: > folk can work around bugs on a case by case basis. IMO, standardizing the upgraded protocol(s) is *very* premature. "Rough consensus and running code" requires that there be more than one instance of running code, to be in consensus. ;-) > Yeah, that really looks like the way forward: define a safe way to >> escape WSGI from inside of it, so that server developers aren't forced >> to dumb down HTTP/2 to WSGI, in order to provide rich HTTP/2 APIs. >> What do you think? > > I worry that that leaves us with a lingua franca which we're expecting > everyone to escape from. Why would anybody who doesn't *need* HTTP/2 features *want* to escape it? > That doesn't seem like a great place to aim It's not an aim, it's a transitional path that allows us to start without boiling the ocean. Well, more of a lake than an ocean in this case, but I think you're really underestimating the amount of work needed to get a consensus HTTP/2 API put together, and if you base it on present-day WSGI I think you're underestimating how much pushback you'll get from existing implementors. ;-) From robertc at robertcollins.net Mon Sep 29 22:14:39 2014 From: robertc at robertcollins.net (Robert Collins) Date: Tue, 30 Sep 2014 09:14:39 +1300 Subject: [Web-SIG] Combine wsgi and asyncio - possible? In-Reply-To: <66125E21D652445B8489D72130207E01@frank> References: <66125E21D652445B8489D72130207E01@frank> Message-ID: On 30 September 2014 02:26, Frank Millman wrote: > Hi all > > I am developing a business/accounting application. It is not a web server in > the conventional sense, but it uses http, and clients connect to it via a > web browser. 
The server responds to an initial connection by sending a > block of javascript which uses on_load() to display a welcome page. After > that, all communication is handled by ajax-style messages passed between > server and client. At no point is a new page requested or reloaded. Pages are browser constructs :) - I presume you're still speaking HTTP/1.1, with each request and response JSON - so a typical HTTP API implementation? ... > Sometimes one user action generates more than one 'event'. I package these > up into a list and send them to the server in one message. As the server > processes the events, it can result in multiple responses to the client. > These are also packaged up and sent in one message. One orthogonal thought here - HTTP/2 and websockets are [differently but relatedly] aimed at solving this in perhaps a cleaner way. HTTP/2 allows you to very sanely (unlike pipelining which had lots of problems) send different messages without waiting for the response, on the same connection. So you don't need to package up the events, just submit them, and you get separate events on the server side, without latency overheads. [There is some encoding overhead of course, but on a TCP stream with its window opened, you shouldn't notice that!]. There are websocket APIs around today that e.g. uwsgi offer. > However, it can happen that while the server is working through the events, > it needs to send a message to the client to pop up a dialog box, ask a > question, and get a response before proceeding. With asyncio I can create a > Future to set up and send the message, and use 'yield from > asyncio.wait_for(...)' to wait for the response. Straight WSGI can in principle do this, but I suspect that browser APIs won't play nice. Here's how it would work: - make sure your ajax request is chunked, so that we can stream the request up. Don't close the request stream until you've received the full answer to all of your events. 
Stream the response back from your WSGI app by using yield rather than
write() / returning a list. On the client, make sure you can handle
multiple JSON documents within the one response without fully buffering
it. You'll almost certainly need to be using SSL to avoid hitting a
buffering intermediary (things like virus scanners etc). Assuming all
the intermediaries are well behaved, nothing will buffer either the
request or the response, and you'll have bidirectional communication
happening.

Doing it without browser API support is also possible, but it requires
some mind bending - it's basically what you're doing in asyncio: you
reply from the first context, let the client's answer come back in on a
new context, and then hand off from the new context to the existing old
one to complete the data. I'm quite sure the asyncio code will be much
cleaner for this situation.

> I am using asyncio.start_server(). For each request, the handler is passed a
> client_reader and a client_writer. Normally the writer is used to write the
> response to the original request, but if I need to ask a question, I use the
> writer to send the 'dialog box' message. When I get the response, I take the
> new client_writer and pass it back to the original request handler for it to
> complete the request.
>
> As I understand it, wsgi requires me to actually 'return' the response, so I
> don't have the opportunity to call 'yield from', and I do not get access to
> the writer object.
>
> Any suggestions welcome.

HTH,
Rob

--
Robert Collins
Distinguished Technologist
HP Converged Cloud

From cmawebsite at gmail.com Mon Sep 29 23:14:20 2014
From: cmawebsite at gmail.com (Collin Anderson)
Date: Mon, 29 Sep 2014 17:14:20 -0400
Subject: [Web-SIG] REMOTE_ADDR and proxys
In-Reply-To:
References:
Message-ID:

Thanks guys. So it sounds like it should be the responsibility of a
middleware to renormalize the environment?
On Wed, Sep 24, 2014 at 4:51 PM, Robert Collins wrote: > On 25 September 2014 07:16, Alan Kennedy wrote: > > [Collin] > >> It seems to me, it is the role of the server/gateway, not the > >> application/framework to determine the "correct" client ip address and > >> correctly account for the situation of being behind a known proxy. > > > > I disagreee. I think it is the role of the server/gateway to represent > the > > actual incoming HTTP request as accurately as possible. > > So I agree with you, but in a multi-tier deployment architecture: > > Client -> LB -> Front-end-cache -> HTTPd ->WSGI -> application, which > 'request' do app developers need represented? They want the client > request, which is 3 network hops away: its entirely reasonable (and > supported by RFC2616 and RFC7230 etc) for the internal structure of > such a deployment to extend things in such a way that normal > guarantees are suspended (e.g. caching, source addresses etc). > > > If the application knows about remote proxies and local reverse proxies, > > then it can take action accordingly. > > > > But the server should not attempt any magic: it is up to the application > to > > interpret the request in whatever way it sees fit. > ... > > If want to the magic rewriting functionality to be isolated from the > > application, then it could easily be implemented as middleware. > > So middleware is an application to the layer above and a server to the > layer below: how then is that not the server taking care of the > rewriting? Perhaps we're stuck on a definitional thing where by server > you are thinking only the code implied by e.g. serve_forever ? > > -Rob > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertc at robertcollins.net Tue Sep 30 00:11:43 2014 From: robertc at robertcollins.net (Robert Collins) Date: Tue, 30 Sep 2014 11:11:43 +1300 Subject: [Web-SIG] WSGI2: write callable? 
In-Reply-To: References: Message-ID: It occurs to me that we're deep into one of Joey Hess's email-thread anti-patterns, so I'm going to leave this here for now. As I've said, I think the next step forward is to do some experimentation, which I'm sure the existing implementors that have expressed interest in this effort will join me in, and we'll get some indication together about how well [or otherwise] the basic things work. Concurrently, the IETF HTTP wg is now discussing websocket over HTTP/2, which will provide more data points for the API capabilities we'll need. -Rob -- Robert Collins Distinguished Technologist HP Converged Cloud From gdamjan at gmail.com Tue Sep 30 00:39:11 2014 From: gdamjan at gmail.com (Damjan Georgievski) Date: Tue, 30 Sep 2014 00:39:11 +0200 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: On 29 September 2014 23:14, Collin Anderson wrote: > Thanks guys. So it sounds like it should be the responsibility of a > middleware to re normalize the environment? I don't think it's always like that. The knowledge of the setup related to REMOTE_ADDR and trusted proxies is with the system admin, and he is the one setting up the WSGI server. So unless the developer implements and allows configuring the said middleware, sys-admins hands are tied to do anything about it. BTW, currently when I use nginx/uwsgi I have to configure my nginx what its trusted proxies are, and what headers (or else they use) and uwsgi luckily doesn't need additional setup when using the uwsgi protocol, it just gets the nginx REMOTE_ADDR info. -- damjan From alan at xhaus.com Tue Sep 30 00:47:34 2014 From: alan at xhaus.com (Alan Kennedy) Date: Mon, 29 Sep 2014 23:47:34 +0100 Subject: [Web-SIG] REMOTE_ADDR and proxys In-Reply-To: References: Message-ID: [Alan] >> I disagreee. I think it is the role of the server/gateway to represent the >> actual incoming HTTP request as accurately as possible. 
[Robert]
> So I agree with you

OK, so we agree :-)

[Robert]
> but in a multi-tier deployment architecture:

Then why disagree? ;-)

[Robert]
> Client -> LB -> Front-end-cache -> HTTPd ->WSGI -> application, which
> 'request' do app developers need represented? They want the client
> request, which is 3 network hops away: its entirely reasonable (and
> supported by RFC2616 and RFC7230 etc) for the internal structure of
> such a deployment to extend things in such a way that normal
> guarantees are suspended (e.g. caching, source addresses etc).

So what do you include and what do you exclude?

1. It's quite possible that the client is behind some kind of egress
   proxy or firewall, which may or may not add an X-Forwarded-For
   header. Should this be included?

2. What if your frontend LB is not configured to set an X-Forwarded-For
   header? What if it is? What if there is differing configuration
   across multiple LBs that are in your ingress path, and you get
   conflicting results depending on what path the request came in?

3. What if there is a cache miss on your frontend cache? Will the
   caching proxy add a header?

4. What if the proxy added a non-standard X-Forwarded-Ip header?
   - If it does, can you do reverse DNS lookup to find the host that it
     reverses to?
   - If yes, in what DNS authority?

5. Is the order of X-Forwarded-For headers guaranteed? Is it
   trustworthy? Will every proxy in the chain declare itself?
   - Answers: no, no, and no.

Each of the above questions has multiple answers, each of which is
arguably valid, depending on your point of view.

The problem is that HTTP proxies are just too easy to write, and every
author of a proxy will make slightly different decisions on what should
be forwarded and what should not. Every configurable proxy can and will
be configured differently, according to the requirements of the folks
operating it.

http://proxies.xhaus.com

[Robert]
> which 'request' do app developers need represented?
The request that arrives into the origin server, exactly as it arrived, unmodified. That way they can apply their own heuristics to processing the request, knowing that it has not been interfered with. > They want the client request, which is 3 network hops away In your example, it's 3 hops away. I can easily paint you a thousand different scenarios, each of which is a different number of hops away. [Robert] > So it sounds like it should be the responsibility of a middleware to renormalize the environment? In order for that to be the case, you have strictly define what "normalization" means. I believe that it is not possible to fully specify "normalization", and that any attempt to do so is futile. If you want to attempt it for the specific scenarios that your particular application has to deal with, then by all means code your version of "normalization" into your application. Or write some middleware to do it. But trying to make "normalization" a part of a WSGI-style specification is impossible. Alan. On Mon, Sep 29, 2014 at 10:14 PM, Collin Anderson wrote: > Thanks guys. So it sounds like it should be the responsibility of a > middleware to re normalize the environment? > > On Wed, Sep 24, 2014 at 4:51 PM, Robert Collins > wrote: > >> On 25 September 2014 07:16, Alan Kennedy wrote: >> > [Collin] >> >> It seems to me, it is the role of the server/gateway, not the >> >> application/framework to determine the "correct" client ip address and >> >> correctly account for the situation of being behind a known proxy. >> > >> > I disagreee. I think it is the role of the server/gateway to represent >> the >> > actual incoming HTTP request as accurately as possible. >> >> So I agree with you, but in a multi-tier deployment architecture: >> >> Client -> LB -> Front-end-cache -> HTTPd ->WSGI -> application, which >> 'request' do app developers need represented? 
They want the client >> request, which is 3 network hops away: its entirely reasonable (and >> supported by RFC2616 and RFC7230 etc) for the internal structure of >> such a deployment to extend things in such a way that normal >> guarantees are suspended (e.g. caching, source addresses etc). >> >> > If the application knows about remote proxies and local reverse proxies, >> > then it can take action accordingly. >> > >> > But the server should not attempt any magic: it is up to the >> application to >> > interpret the request in whatever way it sees fit. >> ... >> > If want to the magic rewriting functionality to be isolated from the >> > application, then it could easily be implemented as middleware. >> >> So middleware is an application to the layer above and a server to the >> layer below: how then is that not the server taking care of the >> rewriting? Perhaps we're stuck on a definitional thing where by server >> you are thinking only the code implied by e.g. serve_forever ? >> >> -Rob >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Tue Sep 30 01:19:19 2014 From: pje at telecommunity.com (PJ Eby) Date: Mon, 29 Sep 2014 19:19:19 -0400 Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs Message-ID: Per the previous discussion about HTTP/2, websockets, et al, here's my attempt at providing something we can start using and implementing today, as a bridge to future specifications. If you'd prefer to read it nicely formatted, you can find an HTML version in progress at: https://gist.github.com/pjeby/62e3892cd75257518eb0 I'm very interested in feedback from server and framework developers with relevant experience to help close the "open issues and questions" section. Questions about the content or feedback on its presentation would also be very helpful. (For now, the text is in markdown, but of course I will switch it to ReST once it begins stabilizing.) 
# The WSGI Middleware Escape for Native Server APIs

# Overview

This document specifies a proposed standard WSGI extension that allows WSGI applications to "escape" the standard WSGI API and access native web server APIs, such as websockets, HTTP/2 features, or Twisted/tulip-style asynchronous APIs.

The proposed extension, the Middleware Escape for Native Server APIs or "MENSA", allows WSGI to continue to be used for the 98% of typical web application use cases that fall within the basic HTTP/1.0 "request/response" paradigm, while allowing the 2% of use cases with more sophisticated requirements to still benefit from "inbound" WSGI middleware for sessions, authentication, authorization, routing, and so forth, as well as keeping the other advantages of sharing the same process with other WSGI code.

Specifically, the MENSA protocol allows a WSGI application to *dynamically* switch at runtime from using a standard WSGI response, to using a web server's "native" API to handle the current request (and possibly subsequent ones), subject to certain conditions.

This approach provides present-day WSGI applications and frameworks with a smooth upward migration path in the event that they require access to websockets, HTTP/2-specific features, etc. With it:

* Web servers can expose their native API to any WSGI application or framework
* Application developers can use existing middleware, libraries, or frameworks to handle front-end tasks like routing and authentication
* Frameworks can offer a simple `response.use_native_api(...)` (or similar) API to allow app developers to easily "jump out" of the framework and request the use of a specific native server API for the current request, and
* Even developers using frameworks that *don't* offer this escape API can still use it, by invoking a short utility function given in this specification, and adding a little framework-specific glue code

# Motivation

Recent discussion on the Python Web-SIG about incorporating HTTP/2 features into present-day WSGI has highlighted the extreme difficulties of doing so without breaking certain types of middleware. In addition, it highlighted the strong existing need for Websockets in present-day web apps, and the ways in which existing Websocket extensions for WSGI have the same problems.

Both HTTP/2 and Websockets are a fairly extreme break from the request/response paradigm of HTTP/1.0 that WSGI was designed around, making them difficult to represent within WSGI, and therefore a poor fit for a direct extension of the existing WSGI protocol. Such a direct extension would not only be premature for HTTP/2 (due to a lack of existing HTTP/2 APIs for Python), but would also be unnecessarily confining in terms of what features could be supported, and unnecessarily complex in how those features would need to be implemented.

Therefore, this proposal seeks to defer or *table* ("mensa" is the Latin word for "table") the issue of creating an HTTP/2 WSGI extension API, by making it possible for existing WSGI applications to access *any* such API that existing web servers or server frameworks may wish to provide. (i.e. giving all of them "a seat at the table".)
Thus, it would not be necessary to standardize on the One True Websocket API or One True HTTP/2 API at this time, because server authors can simply expose their native APIs for the use of those web applications that have need of such APIs. This neatly resolves two current issues in the community at present: 1. Often, the only way to mix websockets (or HTTP/2) and WSGI is through separate processes, often with the need to reinvent the wheel for routing and other functions commonly handled by WSGI front-end middleware 2. The "chicken and egg" problem of developing an HTTP/2 API spec when there are few such APIs existing in the field, but nobody wants to *implement* such APIs because nobody can use them from WSGI, and nobody wants to abandon WSGI to write their entire applications or frameworks based on a new and largely-untested API that's not yet blessed as a specification. In contrast, adoption of the WSGI MENSA spec allows both server developers and application developers to experiment with advanced server APIs, without throwing away their WSGI investments (or native server API investments!), and only making new investments in that portion of the application space that require access to more advanced APIs. That is, if the bulk of one's code is still in WSGI, it is still migratable to other server platforms, with only the advanced portions needing to be ported. Thus, the risk of tying one's application too tightly to one particular native API is considerably reduced. Thus, as community experience with advanced server APIs is increased, the practicality of actually defining a *standard* server API for these types of applications is also increased. Eventually, such a standard API could then perhaps even replace WSGI, while still being accessible from within legacy WSGI frameworks (via the MENSA escape). # Scope Goals of this specification include: 1. 
Defining a way for WSGI applications, at runtime (i.e., during the execution of a request), to detect the existence of, and access, "native server APIs" which can be used in place of WSGI for either effecting a response to the current request, or initiating a more advanced communications protocol (such as websocket connections, associated content pushing, etc.)

2. Defining ways for WSGI middleware to:

    1. Continue to be used for request routing and other pre-response activities for all requests, as well as post-response activities for requests that do not require native API access
    2. Intercept and assume control of any native APIs to be used by wrapped applications or subrequests (assuming the middleware knows how to do this for a specific native API, and desires to do so)
    3. Disable any or even *all* native API access by its wrapped apps -- even without prior knowledge of *which* APIs might be used -- in the event that the middleware can only perform its intended function by denying such access

3. Defining a way for WSGI servers to negotiate a smooth transition of response handling between standard WSGI and their native API, while safely detecting whether intervening middleware has taken over or altered the response in a way that conflicts with elevating the current request to native API processing

Non-goals include:

* Actually defining any specification for the native APIs themselves ;-)

# Specification

The basic idea of MENSA is to add a dictionary to the WSGI environment, under the key `wsgi.native_api_hooks`. Within this dictionary, a single key is reserved for each non-WSGI API offered by the server (or implemented via middleware). So, for example, if Twisted were to offer a MENSA escape for WSGI apps, it might register a `twisted` key within the `wsgi.native_api_hooks` dictionary.

## Accessing a Native API

WSGI applications query the `wsgi.native_api_hooks` dictionary in order to access the native API of their choice, and then delegate to it.
So, for example, a pure WSGI app that switches to the `foobar` native API mid-request might look like this:

    def my_wsgi_app(environ, start_response):
        native_apis = environ.get('wsgi.native_api_hooks', {})
        foobar_api = native_apis.get('foobar')
        if foobar_api is None:
            # appropriate error action here
            # i.e. raise something, or return an error response
            raise RuntimeError("foobar API unavailable")

        def my_foobar_app(foobar_specific_arg, another_foobar_arg, *etc):
            # code here that uses the foobar API to do something cool,
            # like maybe websockets or signed streaming trailers or
            # other buzzword-laden stuff ;-)
            pass

        # Delegate the WSGI response to the native API
        return foobar_api(environ, start_response, my_foobar_app)

On the application side, this is all that's necessary for a pure-WSGI application to switch to using a native server API and whatever its advanced features permit. (For applications using frameworks that don't directly expose the WSGI `start_response()` or allow returning a WSGI response body directly, a little extra glue code is required; those details are covered in a later section of the spec.)

In the above example, `my_foobar_app` is a function, but depending on the specific API involved, it could be a class or an instance of some kind, or perhaps just a data structure of some sort. The nature of the "app" or other parameters passed to the API hook is completely dependent on the design of the API being wrapped: only the first two arguments to the hook are dictated by this specification.

So, for example, a Twisted native API might expect a `Protocol` instance, rather than a function. A gevent-based native API might expect a generator, generator function, or perhaps a greenlet. A websocket API might take *two* parameters, for a writer and reader. Defining and documenting the exact nature of the additional parameters passed to the API hook is entirely up to the hook's provider.
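To make the calling convention above concrete, here is a small self-contained sketch that can be run without any real server. Everything named here is an assumption for illustration: `fake_foobar_hook` stands in for a server-provided hook, the single `connection` argument of the native handler is invented, and `call` is just a tiny harness. Unlike the example above, the app falls back to an ordinary WSGI error response when the hook is missing, which is one of the two options the comment in that example mentions.

```python
def fake_foobar_hook(environ, start_response, native_app):
    """Hypothetical stand-in for a server-provided hook (illustration only)."""
    # A real hook would register native_app for later use; here we only record it.
    environ['demo.registered_app'] = native_app
    key = 'foobar-1'
    start_response('399 WSGI-Escape: ' + key, [
        ('Content-Type', 'application/x-wsgi-escape; id=' + key),
        ('Content-Length', str(len(key))),
    ])
    # Assumption: the body marker is sent as ASCII bytes, as PEP 3333 bodies are.
    return [key.encode('ascii')]


def my_wsgi_app(environ, start_response):
    hooks = environ.get('wsgi.native_api_hooks', {})
    foobar_api = hooks.get('foobar')
    if foobar_api is None:
        # No escape available: answer with an ordinary WSGI error response.
        body = b'foobar API not available\n'
        start_response('501 Not Implemented', [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(len(body))),
        ])
        return [body]

    def my_foobar_app(connection):
        # Hypothetical native handler; its signature is up to the hook's provider.
        return 'handled %r natively' % (connection,)

    # Delegate the WSGI response to the native API.
    return foobar_api(environ, start_response, my_foobar_app)


def call(app, environ):
    """Tiny harness: run a WSGI app and capture status, headers, and body."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'], captured['headers'] = status, headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body
```

Calling `call(my_wsgi_app, {})` exercises the fallback path, while passing an environ whose `wsgi.native_api_hooks` maps `'foobar'` to `fake_foobar_hook` exercises the escape path.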
## Providing an API

The implementation of a native API hook consists of a callable object, looking something like this pseudocode:

    def some_server_api_hook(environ, start_response, native_app):
        response_key = new_unique_header_compatible_string()
        native_request.response_registry[response_key] = native_app
        start_response('399 WSGI-Escape: '+response_key, [
            ('Content-Type', 'application/x-wsgi-escape; id='+response_key),
            ('Content-Length', str(len(response_key)))
        ])
        return [response_key]

As you can see, this is a little bit like a WSGI application -- and in fact it *is* a valid WSGI application, except for the addition of the `native_app` parameter.

The API hook's job is to generate a unique ASCII "native string" key for this response, and register the provided native app (or other arguments) under that key for *future use*. The server MUST NOT actually invoke or begin using the native application until *after* the standard WSGI response process has been completed, and it has verified that its markers are still present in the WSGI response.

Those markers -- found in the status, headers, *and* response body -- are used to verify three things:

1. That the registered application is indeed a response to the original incoming request, and not merely to a subrequest created by middleware
2. That intervening middleware hasn't replaced the native API response with a response of its own (for example, an error response created because of an error occurring after the native app was registered, but before it was used)
3. *Which* native application should be invoked, if more than one was registered

So, a server providing a native API must wait until it receives a WSGI response whose status, content-type, content-length, and body all unequivocally identify which of the native applications registered for the current request should actually be used.
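As a rough sketch of what the server side of this might look like, the code below pairs a hook factory with a function that inspects a finished WSGI response for the markers. It is only one possible reading of the text, under stated assumptions: `make_hook`, `designated_key`, and the per-request `registry` dict are hypothetical names, the body marker is assumed to be the key encoded as ASCII bytes, and the body is assumed to have been joined into a single bytes object. What the server does with each outcome is governed by the rules in the paragraphs that follow.

```python
import itertools

_counter = itertools.count(1)  # process-wide counter keeps keys unique

STATUS_PREFIX = '399 WSGI-Escape: '
TYPE_PREFIX = 'application/x-wsgi-escape; id='


def make_hook(api_name, registry):
    """Build a hook for `api_name` that stores native apps in `registry` (a dict)."""
    def hook(environ, start_response, native_app):
        key = '%s-%d' % (api_name, next(_counter))
        registry[key] = native_app  # stored only; NOT invoked yet
        start_response(STATUS_PREFIX + key, [
            ('Content-Type', TYPE_PREFIX + key),
            ('Content-Length', str(len(key))),
        ])
        return [key.encode('ascii')]
    return hook


def designated_key(status, headers, body, registry):
    """Return the registered key that a finished WSGI response designates.

    Returns None when neither status nor headers name a registered app
    (an ordinary WSGI response), and raises ValueError when the markers
    are present but do not all agree.
    """
    from_status = from_type = length = None
    if status.startswith(STATUS_PREFIX):
        from_status = status[len(STATUS_PREFIX):]
    for name, value in headers:
        if name.lower() == 'content-type' and value.startswith(TYPE_PREFIX):
            from_type = value[len(TYPE_PREFIX):]
        elif name.lower() == 'content-length':
            length = value
    # Only keys that were actually registered for this request count.
    if from_status not in registry:
        from_status = None
    if from_type not in registry:
        from_type = None
    if from_status is None and from_type is None:
        return None
    key = from_status
    consistent = (
        from_status == from_type
        and length == str(len(key))
        and body == key.encode('ascii')
    )
    if not consistent:
        raise ValueError('inconsistent WSGI-Escape markers')
    return key
```

A server would call `designated_key` only after the WSGI application (and any middleware wrapping it) has returned, and would look up `registry[key]` to find the native app to activate.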
In the event that the status, type, and body all match, the server MUST then activate the registered native application, allowing the current request (and possibly subsequent requests, depending on the API involved) to be handled via the associated native API. (And discard any other registered applications for the current request.)

In the event that neither the status nor headers designate a registered native application, the server MUST treat the response as a standard WSGI response, and discard all registered applications for the current request.

In the event that the status and headers disagree on *which* native application is to be used (or *whether* one is to be used at all), or in the event that they *do* agree, but the body disagrees with them, the server MUST generate an error response, and discard both the WSGI response and any registered native applications. (In the face of ambiguity, refuse the temptation to guess; errors should not pass silently.)

### Response Key Details

The key used to distinguish responses MUST be an ASCII "native string" (as defined by PEP 3333). It SHOULD also be relatively short, and MUST contain only those characters that are valid in a MIME "token". (That is, it may contain any non-space, non-control ASCII character, except the special characters `(`, `)`, `<`, `>`, `@`, `,`, `;`, `:`, `\`, `"`, `/`, `[`, `]`, `?`, and `=`.)

Response keys generated for a given API MUST be unique for the duration of a given request, and MUST be generated in such a way so as not to collide with keys issued for any *other* API during the same request. (e.g., by including the API's name in them.)

Response keys SHOULD also be unique within the lifetime of the process that generates them, e.g. by simply including a global counter value. (So, the simplest valid way of generating a response key is to just append a global counter to a string identifying the native API.
However, there is nothing stopping a server from adding a request ID, channel designator, or other information to the key, as an aid to debugging. Just make sure there's no whitespace or special characters involved, as mentioned above.)

## Intercepting or Disabling APIs

Because all server API hooks are contained in a single WSGI environment key, it is easy for WSGI middleware to disable access to them when creating subrequests, by simply deleting that key before invoking an application.

Likewise, in the event that WSGI middleware wishes to disable one *specific* API, or intercept it, it can do so by removing or replacing the appropriate hook within the hooks dictionary.

(Note: The `wsgi.native_api_hooks` dictionary is to be considered volatile in the same way as the WSGI environment is. That is, apps or middleware are allowed to modify or delete its contents freely, so a copy MUST be saved by middleware if it wishes to access the original values after it has been passed to another application or middleware.)

## Accessing Native APIs Inside Application Frameworks

Since relatively few applications are written in "pure WSGI", it's necessary to show how one would go about accessing a native API from inside an application framework that doesn't provide direct access to the WSGI `start_response`, or allow directly returning a response body.
Here is a simple, but fully-generic utility function that works around this problem, provided there is at least access to the WSGI environment:

    def use_native_api(environ, api_key, *args, **kw):
        native_api = environ.get('wsgi.native_api_hooks', {}).get(api_key)
        if native_api is None:
            raise RuntimeError("API unavailable")

        status = headers = None

        def start_response(s, h, exc_info=None):
            nonlocal status, headers
            status, headers = s, h

        # Call the hook first, so that start_response() has already run
        # by the time status and headers are read for the return value
        body = native_api(environ, start_response, *args, **kw)
        return status, headers, body

The returned status, headers, and body can then be sent using framework-specific APIs, so that they propagate back out through the WSGI stack.

(Individual web frameworks, of course, can and *should* offer their own, similar utilities to perform this function, e.g. by adding a `use_native_api()` method on their response objects. In that way, developers can be spared the details of setting the status, headers, etc.)

# Notes on Current Design Rationale

* A dictionary is used for all native APIs, so they can be easily disabled for subrequests
* Multiple registrations are allowed, so that middleware invoking multiple subrequests is unaffected, so long as exactly one subrequest's response is returned
* A `Content-Type` header is part of the spec, because most response-altering middleware should avoid altering content types it does not understand, thereby increasing the likelihood that the response will be passed through unchanged

# Open Questions and Issues

* What if middleware adds headers but leaves the status and content-type unchanged? Should that be an error? What happens if middleware requests setting cookies?
* Do the chosen status/headers/body signatures actually make sense? Do they need to be more specified, or less specified?
* Are there any major obstacles to sending a special status from major web frameworks?
* Should a different status be used?
* We need better examples!
(They should more closely resemble some actual use cases, rather than being vague abstracts) * Are there any other ways to corrupt, confuse, or break this? * What else am I missing, overlooking, or getting wrong? # Acknowledgements (TBD, but should definitely include Robert Collins for research, inspiration, and use cases) # References TBD # Copyright This document has been placed in the public domain. From roberto at unbit.it Tue Sep 30 09:41:31 2014 From: roberto at unbit.it (Roberto De Ioris) Date: Tue, 30 Sep 2014 09:41:31 +0200 Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs Message-ID: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it> > Per the previous discussion about HTTP/2, websockets, et al, here's my attempt at providing something we can start using and implementing today, as a bridge to future specifications. If you'd prefer to read it nicely formatted, you can find an HTML version in progress at: > > https://gist.github.com/pjeby/62e3892cd75257518eb0 > > I'm very interested in feedback from server and framework developers with relevant experience to help close the "open issues and questions" section. Questions about the content or feedback on its presentation would also be very helpful. > > While i totally like your proposal, i fear it will not solve one of the biggest problems without another layer: currently (and i speak as the uWSGI author, so i am the first guilty here) when you want to use non-WSGI features you generally call into server api (like the one exposed in the 'uwsgi' virtual module). This means each server has its api, and this result as middlewares and apps to be adapted to each one (if possible) My proposal is to push "mensa" but to standardize a series of api (websockets and push at least) on top of it so that frameworks and middlewares can use them without worrying about the lower stack. 
-- Roberto De Ioris http://unbit.it From cory at lukasa.co.uk Tue Sep 30 10:09:26 2014 From: cory at lukasa.co.uk (Cory Benfield) Date: Tue, 30 Sep 2014 09:09:26 +0100 Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs In-Reply-To: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it> References: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it> Message-ID: On 30 September 2014 08:41, Roberto De Ioris wrote: > While i totally like your proposal, i fear it will not solve one of the > biggest problems without another layer: > > currently (and i speak as the uWSGI author, so i am the first guilty here) > when you want to use non-WSGI features you generally call into server api > (like the one exposed in the 'uwsgi' virtual module). This means each > server has its api, and this result as middlewares and apps to be adapted > to each one (if possible) > > My proposal is to push "mensa" but to standardize a series of api > (websockets and push at least) on top of it so that frameworks and > middlewares can use them without worrying about the lower stack. This was exactly the concern I was about to articulate. Having a standard way to 'escape' WSGI is great, but what it does is force us down a road where any application that wants to use HTTP/2 or WebSockets picks one server at the start of its life and is effectively tied to that server. Any application small enough to be easily ported is also small enough that it isn't a reasonable test of the API. Any application large enough to really provide insight into the APIs is also large enough that it will rapidly become tightly coupled to its server implementation. Additionally, it's a cost for server authors (unless they think they really do have the ability to provide the 'best' API around which all of us will rally). 
Server authors are going to have enough work just making their servers speak HTTP/2 out the front, asking them to also invest work in designing an API that *might* get used by a small fraction of applications is really a big ask. Finally, the odds of us getting buy-in from frameworks is surely not very high. What interest will, for example, Armin Ronacher have in having support for uWSGI's specific HTTP/2 API in werkzeug/Flask? What about gunicorn's? Or mod_wsgi's? I appreciate the argument for wanting to let server+middleware authors develop the APIs themselves and then standardise around it, I really do. But without a concrete plan of who is going to make the first investment, I think it just leaves us sitting around doing nothing. A better approach would be to say, as Roberto suggested, "hey, here's this generic WSGI escape mechanism, and here are some generic HTTP/2 and Websocket APIs you can escape to". We could even version those APIs, allowing for communal development of them between server authors. That provides the initial escape hook and an initial direction, reducing the risk for individual server authors. From sh at defuze.org Tue Sep 30 11:03:25 2014 From: sh at defuze.org (Sylvain Hellegouarch) Date: Tue, 30 Sep 2014 11:03:25 +0200 Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs In-Reply-To: References: Message-ID: Hi, 2014-09-30 1:19 GMT+02:00 PJ Eby : > Per the previous discussion about HTTP/2, websockets, et al, here's my > attempt at providing something we can start using and implementing > today, as a bridge to future specifications. If you'd prefer to read > it nicely formatted, you can find an HTML version in progress at: > > https://gist.github.com/pjeby/62e3892cd75257518eb0 > > I'm very interested in feedback from server and framework developers > with relevant experience to help close the "open issues and questions" > section. 
Questions about the content or feedback on its presentation
> would also be very helpful.
>
> (For now, the text is in markdown, but of course I will switch it to
> ReST once it begins stabilizing.)
>
>
> # The WSGI Middleware Escape for Native Server APIs
>
> # Overview
>
> This document specifies a proposed standard WSGI extension that allows
> WSGI applications to "escape" the standard WSGI API and access native
> web server APIs, such as websockets, HTTP/2 features, or
> Twisted/tulip-style asynchronous APIs.
>
> The proposed extension, the Middleware Escape for Native Server APIs
> or "MENSA", allows WSGI to continue to be used for the 98% of typical
> web application use cases that fall within the basic HTTP/1.0
> "request/response" paradigm, while allowing the 2% of use cases with
> more sophisticated requirements to still benefit from "inbound" WSGI
> middleware for sessions, authentication, authorization, routing, and
> so forth, as well as keeping the other advantages of sharing the same
> process with other WSGI code.
>
> Specifically, the MENSA protocol allows a WSGI application to
> *dynamically* switch at runtime from using a standard WSGI response,
> to using a web server's "native" API to handle the current request
> (and possibly subsequent ones), subject to certain conditions.
>
> This approach provides present-day WSGI applications and frameworks
> with a smooth upward migration path in the event that they require
> access to websockets, HTTP/2-specific features, etc. With it:
>
> * Web servers can expose their native API to any WSGI application or
> framework
>

It's kind of already the case with all the existing servers. They all perform the stream reading and HTTP parsing in their own native way and then adapt those to WSGI. Basically, all existing Python HTTP servers do this already. For some servers, you can even bypass the WSGI mapping altogether if you know you're only staying in the framework native-land.
> > * Application developers can use existing middleware, libraries, or > frameworks to handle front-end tasks like routing and authentication > > Shouldn't we drop the middleware idea altogether? Maybe I'm being bold but when I look at some popular frameworks, many seem to escape WSGI itself already. They take a WSGI context and transform it to their native context. Then back again from their native response to the WSGI context (with a performance penalty in the process): * Django https://github.com/django/django/blob/master/django/core/handlers/wsgi.py#L83 * CherryPy https://bitbucket.org/cherrypy/cherrypy/src/4939ea9fbe6b0def376fbb349c039c38b0a8994b/cherrypy/_cpwsgi.py?at=default#cl-223 * web2py https://github.com/web2py/web2py/blob/master/gluon/globals.py#L154 Others stick closer to WSGI all the way but still hide it more or less: * Flask/Werkzeug provide properties above the environ dictionary itself https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/wrappers.py * Bottle has its own wrapper as well: https://github.com/defnull/bottle/blob/master/bottle.py#L995 * Pyramid is based on WebOb Servers seem happy enough to expose HTTP through WSGI for convenience and compatibility but, frameworks use native objects and workflow and forget about WSGI altogether. I seldom see them expose the environ or start_response at a high level. Those details are kept hidden to respect (sometimes brokenly) the WSGI protocol. -- - Sylvain http://www.defuze.org http://twitter.com/lawouach -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From pje at telecommunity.com Tue Sep 30 16:58:39 2014
From: pje at telecommunity.com (PJ Eby)
Date: Tue, 30 Sep 2014 10:58:39 -0400
Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs
In-Reply-To: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it>
References: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it>
Message-ID:

On Tue, Sep 30, 2014 at 3:40 AM, Roberto De Ioris wrote:
> While i totally like your proposal, i fear it will not solve one of the
> biggest problems without another layer:

Of course. An escape to a native API isn't much use without the native APIs. ;-)

> currently (and i speak as the uWSGI author, so i am the first guilty here)
> when you want to use non-WSGI features you generally call into server api
> (like the one exposed in the 'uwsgi' virtual module). This means each
> server has its api, and this result as middlewares and apps to be adapted
> to each one (if possible)

Actually, no. The entire point of this escape is so that only *part* of an app needs to be adapted, and not any of the middleware or frameworks sitting on top. With this proposal, only the portion of an application that uses websockets would dynamically switch to invoking your API.

That is, this proposal handles the case where somebody writes a huge Django, Flask, Pyramid or even Zope application and suddenly realizes they need websockets for a new feature, but that new feature *also* needs access to a session and various application objects loaded by their framework's routing system. As long as their server has *some* websocket API, they can then write an escape to access that API within the structure of their existing framework, which itself does not need to be modified.

> My proposal is to push "mensa" but to standardize a series of api
> (websockets and push at least) on top of it so that frameworks and
> middlewares can use them without worrying about the lower stack.
The idea here is that the mensa protocol allows such APIs to be developed and used *without* existing frameworks needing to adopt them or interact with them directly. An application developer uses mensa to *bypass* the framework and middleware, for some portion of their application where it is advantageous. Certainly, having standard APIs to escape *to* is a good idea, but not necessary in order for the mensa protocol to be immediately useful. If, for example, you provided a wsgi.native_api_hook for uWSGI, then people could *safely* use your websocket API from inside their existing apps based on WSGI frameworks, without needing a separate process or routing mount point that cannot access the other parts of their application (e.g. sessions, authentication, routing, loading objects, etc.) From pje at telecommunity.com Tue Sep 30 17:27:12 2014 From: pje at telecommunity.com (PJ Eby) Date: Tue, 30 Sep 2014 11:27:12 -0400 Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs In-Reply-To: References: <900eed38b03992f3a2ce1c9090857a7b.squirrel@manage.unbit.it> Message-ID: On Tue, Sep 30, 2014 at 4:09 AM, Cory Benfield wrote: > This was exactly the concern I was about to articulate. Having a > standard way to 'escape' WSGI is great, but what it does is force us > down a road where any application that wants to use HTTP/2 or > WebSockets picks one server at the start of its life and is > effectively tied to that server. > > Any application small enough to be easily ported is also small enough > that it isn't a reasonable test of the API. Any application large > enough to really provide insight into the APIs is also large enough > that it will rapidly become tightly coupled to its server > implementation. I'm not sure I see that, but perhaps I'm missing something. The assumption I'm making is that for most applications, most of their code is going to be tightly coupled to whatever web framework they are using. 
Even the part of their app that uses websockets or HTTP/2 features, is *mostly* going to be dealing with application concerns, not websocket or HTTP/2 concerns. There should only be a very small amount of application surface that is either calling or called by the native API. IOW, I don't see how the native API-coupled part of an application is ever going to be a significant body of code, compared to the part that would be the same if ported to a different API. Most of the API-coupled bits are going to just be ways to say "send this" or "call me when you receive this", or the like. But again, maybe I am missing something? (In any case, the existence of an escape protocol also allows independent development of standardized native APIs, e.g. one for websockets and another for HTTP/2, without requiring them to all be developed at once.) > Additionally, it's a cost for server authors (unless they think they > really do have the ability to provide the 'best' API around which all > of us will rally). Server authors are going to have enough work just > making their servers speak HTTP/2 out the front, asking them to also > invest work in designing an API that *might* get used by a small > fraction of applications is really a big ask. Currently, some web servers already *have* websocket and HTTP/2 APIs. But you have to "go native" to use them right now, and you can't take advantage of your framework parts if you have a request that's entirely handled by a native API. The idea here is to bridge *that* gap. > Finally, the odds of us getting buy-in from frameworks is surely not > very high. What interest will, for example, Armin Ronacher have in > having support for uWSGI's specific HTTP/2 API in werkzeug/Flask? What > about gunicorn's? Or mod_wsgi's? He doesn't have to. The point of the suggested `response.use_native_api()` is that it's *generic*: the *user* invokes `use_native_api('uWSGI', my_uwsgi_handler)` or `use_native_api('gunicorn', my_gunicorn_handler)`. 
In other words, the framework doesn't care *what* API the app is escaping to. And, as long as the framework has a way to specify its outgoing status, headers, and body, it isn't even necessary for the framework to provide the `use_native_api()` escape -- it's just a little more work for the user.

The main idea here is that the user can still use the framework up until the point where it has loaded all their session, user, etc. data, routed, preconditioned, and everything else... and then the user can pop back out to a low-level API to do things the framework can't yet do.

> I appreciate the argument for wanting to let server+middleware authors
> develop the APIs themselves and then standardise around it, I really
> do. But without a concrete plan of who is going to make the first
> investment, I think it just leaves us sitting around doing nothing.

My suggestion would be for servers that already have a non-WSGI API to implement a mensa escape for it, so that people can begin using it at once. I would also suggest that framework authors provide a way to invoke a mensa escape from their framework, in order to be able to use those existing APIs today. (And since many frameworks either use WebOb internally or expose the WebOb response to their users, a WebOb implementation of a `use_native_api()` response would immediately enable a lot of experimentation.)

I do not suggest that anybody make up *new* native APIs, until we have a chance to see what can be done with existing ones when you have the ability to integrate those APIs with existing WSGI-based apps and frameworks. But that doesn't mean it can't happen.

> A better approach would be to say, as Roberto suggested, "hey, here's
> this generic WSGI escape mechanism, and here are some generic HTTP/2
> and Websocket APIs you can escape to". We could even version those
> APIs, allowing for communal development of them between server
> authors.
That provides the initial escape hook and an initial > direction, reducing the risk for individual server authors. That can certainly be done, but I don't personally have the expertise to work on those other APIs directly at the moment. But since I *do* have some expertise in WSGI and its various quirks and corners, I *can* provide assistance with the escape mechanism. ;-) I guess maybe the bit about "tabling" this part was a little too tongue-in-cheek with the "mensa" pun. I do not actually mean that nobody should do any common API development. I only mean that, it should not be necessary for anyone right now to *wait* on a standard before being able to use the APIs that already exist today. As soon as anybody with a server exposes a native API hook, it becomes *possible* to do the sort of versioning and collaborating you're talking about, *without* having to include the escape mechanism or other WSGI plumbing details into the design of something that really isn't WSGI-as-we-know-it. I just want the mensa protocol to be the booster rocket to your satellite: something that gets you off the ground, while you guys worry about what to do once you reach orbit. ;-) From frank at chagford.com Tue Sep 30 17:28:37 2014 From: frank at chagford.com (Frank Millman) Date: Tue, 30 Sep 2014 17:28:37 +0200 Subject: [Web-SIG] Combine wsgi and asyncio - possible? References: <66125E21D652445B8489D72130207E01@frank> Message-ID: <9B2C12950B27419E8034ADE302AD26C2@frank> From: "Robert Collins" To: "Frank Millman" Cc: "Web SIG" Sent: Monday, September 29, 2014 10:14 PM Subject: Re: [Web-SIG] Combine wsgi and asyncio - possible? Thanks very much, Rob - lots of useful info and valuable food for thought there. > On 30 September 2014 02:26, Frank Millman wrote: >> Hi all >> >> I am developing a business/accounting application. It is not a web server >> in >> the conventional sense, but it uses http, and clients connect to it via a >> web browser. 
The server responds to an initial connection by sending a
>> block of javascript which uses on_load() to display a welcome page. After
>> that, all communication is handled by ajax-style messages passed between
>> server and client. At no point is a new page requested or reloaded.
>
> Pages are browser constructs :) - I presume you're still speaking
> HTTP/1.1, with each request and response JSON - so a typical HTTP API
> implementation?
> ...

Yes

>
> One orthogonal thought here - HTTP/2 and websockets are [differently
> but relatedly] aimed at solving this in perhaps a cleaner way.
>

I knew about web sockets, but not HTTP/2, so thanks for the pointer. I had a look on Wikipedia and it seems to be the future, so I feel that I should not try too hard for a perfect solution now, but just get something working and keep an eye on developments.

Having done a lot more browsing/reading, I am leaning towards figuring out how to use flup/FastCGI. If I can get that working fairly quickly, it will allow sysadmins to deploy my app using standard tools, and I can get back to my main priority - the actual accounting software.

Let me know if you think that is a bad idea!

Frank

From pje at telecommunity.com Tue Sep 30 17:38:20 2014
From: pje at telecommunity.com (PJ Eby)
Date: Tue, 30 Sep 2014 11:38:20 -0400
Subject: [Web-SIG] Pre-PEP: The WSGI Middleware Escape for Native Server APIs
In-Reply-To:
References:
Message-ID:

On Tue, Sep 30, 2014 at 5:03 AM, Sylvain Hellegouarch wrote:
> Hi,
>
> 2014-09-30 1:19 GMT+02:00 PJ Eby :
>> * Web servers can expose their native API to any WSGI application or
>> framework
>
> It's kind of already the case with all the existing servers. They all
> perform the stream reading and HTTP parsing in their own native way and then
> adapt those to WSGI.
> Basically, all existing Python HTTP servers do this already. For some
> servers, you can even bypass the WSGI mapping altogether if you know you're
> only staying in the framework native-land.
Right: for the most part, it's an all-or-nothing choice, at least for a given mount point within the server. Either it's a WSGI app or a native app. The point of the escape is that you can still use your WSGI framework, with all its routing, session handling, authentication, application object loading, etc., right up to the point where you need to do something advanced. The mensa protocol allows you to jump out of WSGI and back to the native API at that point. > Shouldn't we drop the middleware idea altogether? For the PEP's purposes, the definition of "middleware" includes web frameworks themselves. That is, it allows you to use a framework's features in tandem with native server APIs, without actually touching WSGI directly in application code. > Servers seem happy enough to expose HTTP through WSGI for convenience and > compatibility but, frameworks use native objects and workflow and forget > about WSGI altogether. I seldom see them expose the environ or > start_response at a high level. Those details are kept hidden to respect > (sometimes brokenly) the WSGI protocol. And the idea of this proposal is that as long as a framework exposes a *single* `use_native_api()` feature in some way (or exposes enough WSGI for a user to use the trick shown in the PEP), then the user can continue to ignore WSGI, but can *also* access low-level, non-WSGI APIs, to do things that are fundamentally incompatible with WSGI. (Also, as the PEP mentions, the escape protocol only requires that the framework allow access to the environ, and provide a way to set the status, headers, and content of the framework's eventual response. Any non-trivial framework provides *some* form of these latter three functions; it's only a question of whether the environ (or a copy of it) is available to get hold of the `wsgi.native_api_hooks`.)
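To illustrate the kind of framework glue being described, here is a rough sketch under stated assumptions. The `Response` class and `chat_view` function are invented for this example and do not correspond to any real framework's API; the `'demo'` API name and the `socket.send()` call in the handler are likewise hypothetical. The module-level helper is adapted from the one in the PEP draft, arranged so that the hook is called before the captured status and headers are read.

```python
def use_native_api(environ, api_key, *args, **kw):
    """Generic helper: invoke a native API hook and capture its WSGI markers."""
    native_api = environ.get('wsgi.native_api_hooks', {}).get(api_key)
    if native_api is None:
        raise RuntimeError("API unavailable")
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'], captured['headers'] = status, headers

    # The hook calls start_response(), so it must run before we read `captured`.
    body = native_api(environ, start_response, *args, **kw)
    return captured['status'], captured['headers'], body


class Response:
    """Hypothetical framework response object (not any real framework's API)."""

    def __init__(self, status='200 OK', headers=None, body=b''):
        self.status = status
        self.headers = list(headers or [])
        self.body = body

    def use_native_api(self, environ, api_key, *args, **kw):
        # Copy the escape markers onto the framework's own response, so the
        # framework sends them back out through the WSGI stack unchanged.
        status, headers, chunks = use_native_api(environ, api_key, *args, **kw)
        self.status, self.headers = status, list(headers)
        self.body = b''.join(chunks)
        return self


def chat_view(request_environ, user):
    """Hypothetical view: the framework has already routed and loaded `user`."""
    def handler(socket):
        # Signature is up to the (hypothetical) 'demo' native API.
        socket.send('hello ' + user)

    return Response().use_native_api(request_environ, 'demo', handler)
```

The design point is that the view never touches `start_response` itself: it only needs the environ on the way in and a way to set status, headers, and body on the way out.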