From andrew at aeracode.org  Wed Mar  9 19:34:40 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Wed, 9 Mar 2016 16:34:40 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
Message-ID: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>

Hi all,

As some of you may know, I've been working over the past few months to
bring native WebSocket support to Django, via a project codenamed "Django
Channels" - this is mostly the reason I've been involved in recent WSGI
discussions.

I'm personally of the opinion that WSGI works well for HTTP, with a few
improvements we can roll into a 1.1, but that we also need something else
that can support WebSockets and other future web protocols (e.g. WebRTC
components).

To that end, I did some work to make the underlying mechanism Django
Channels uses into more of a standard, which I have codenamed ASGI; while
initially I intended for it to be a Django documented API, as I've gone
further with the project I've come to believe it could be useful to the
Python community at large.

My intention would be for this spec to sit alongside WSGI, and be a second
option for both servers and frameworks to support (if they wished) that
supports both HTTP and WebSocket connections, as well as a reasonable way
to extend it to future protocols.

All current applications and servers could continue to work via adapter
classes that transform ASGI to WSGI on either end of its HTTP path, which I
think is an important migration consideration.
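
A rough sketch of what the application-side half of such an adapter might do, assuming a request arrives as a dict. The function name `message_to_environ` and the message keys (`method`, `path`, `query_string`, `scheme`, `body`, `headers`) are hypothetical and are not taken from the spec; only the WSGI environ keys come from PEP 3333.

```python
import io


def message_to_environ(message):
    """Build a partial WSGI environ from a hypothetical ASGI-style request dict."""
    environ = {
        "REQUEST_METHOD": message["method"],
        "PATH_INFO": message["path"],
        "QUERY_STRING": message.get("query_string", ""),
        "wsgi.url_scheme": message.get("scheme", "http"),
        "wsgi.input": io.BytesIO(message.get("body", b"")),
    }
    for name, value in message.get("headers", []):
        # WSGI exposes request headers as HTTP_UPPER_SNAKE keys with str values.
        key = "HTTP_" + name.upper().replace("-", "_")
        environ[key] = value.decode("latin-1")
    # A real adapter would also map Content-Type and Content-Length to
    # CONTENT_TYPE / CONTENT_LENGTH and fill in the other required keys.
    return environ


environ = message_to_environ({
    "method": "GET",
    "path": "/hello",
    "headers": [("accept-language", b"en")],
})
```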

I'd love some feedback from this group on my proposed specification, and
any major problems you foresee; there are a few issues I know about, mostly
potential performance issues, but in most of those cases I believe the
gains outweigh the loss. The major change is that servers and applications
now run independently, either in separate threads or processes, and
communicate bidirectionally over a "channel layer", rather than the server
calling the application directly.
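
To make that decoupling concrete, here is a minimal in-memory sketch. The class `ToyChannelLayer`, the channel names, and the message keys are all illustrative assumptions; only the name `receive_many` is borrowed from the spec, and its real signature may differ.

```python
import collections


class ToyChannelLayer:
    """Hypothetical in-memory channel layer: named FIFO queues of dict messages."""

    def __init__(self):
        self._queues = collections.defaultdict(collections.deque)

    def send(self, channel, message):
        # Messages are plain dicts so they could be serialized across processes.
        self._queues[channel].append(dict(message))

    def receive_many(self, channels):
        # Nonblocking: return the first available (channel, message),
        # or (None, None) when every listed channel is empty.
        for channel in channels:
            queue = self._queues[channel]
            if queue:
                return channel, queue.popleft()
        return None, None


layer = ToyChannelLayer()

# "Server" side: puts a request on a channel instead of calling the app.
layer.send("http.request", {"reply_channel": "http.response.1", "path": "/"})

# "Application" side: picks the message up independently and replies.
channel, message = layer.receive_many(["http.request"])
layer.send(message["reply_channel"], {"status": 200, "content": b"hello"})

# Back on the server side: read the reply for this connection.
_, reply = layer.receive_many(["http.response.1"])
```

Neither side holds a reference to the other; both only know the layer and the channel names.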

I'm not yet angling to take this to a PEP, but that would be my eventual
goal; right now, I want to get feedback from people on their major
likes/dislikes, and how it works for various parts of the Python web
ecosystem.

The spec already has an application framework (Django), web/websocket
server (Daphne [1]) and three channel layers [2] implemented, so I've
ironed out some major problems it initially had from working on those, but
I'm not as experienced in the rigours of serving HTTP as most of you are. I
do encourage you, though, to take a look at the rest of the Channels docs
if you want to get an idea of how it works and deploys in practice.

Spec is up here: http://channels.readthedocs.org/en/latest/asgi.html

Helpful quick Q&A: http://channels.readthedocs.org/en/latest/inshort.html

I do believe that making a clean break from WSGI to a new structure (and
NOT calling it "WSGI 2") is the best thing we can do if we truly want to
support more web protocols properly, and I believe that doing that in a way
that still supports WSGI and provides a nice migration path is important -
I believe ASGI provides both of these things, as well as a relatively
simple core API (one of WSGI's strengths in my opinion) - but I welcome
your opinions as well.

Andrew

[1] https://github.com/andrewgodwin/daphne
[2] https://github.com/andrewgodwin/asgi_redis,
https://github.com/andrewgodwin/asgiref/blob/master/asgiref/inmemory.py,
https://github.com/andrewgodwin/channels/blob/master/channels/database_layer.py

From cory at lukasa.co.uk  Thu Mar 10 04:59:15 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Thu, 10 Mar 2016 09:59:15 +0000
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
Message-ID: <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>


> On 10 Mar 2016, at 00:34, Andrew Godwin <andrew at aeracode.org> wrote:
> 
> To that end, I did some work to make the underlying mechanism Django Channels uses into more of a standard, which I have codenamed ASGI; while initially I intended for it to be a Django documented API, as I've gone further with the project I've come to believe it could be useful to the Python community at large.
> 

Andrew,

Thanks for this work! I've provided some proposed changes as pull requests against the channels repository. I'll ignore those for the rest of the email: we can discuss them on GitHub.

I also have a few more general notes. I didn't make PRs for these, mostly because they're too "vague" as feedback goes to be concretely handled by me.

First, your HTTP section has request headers serialized to a dict and response headers serialized to a list of tuples. I'm not sure how I feel about that asymmetry: it might be cleaner just to use lists-of-tuples in both places and allow application frameworks to handle translation to dictionary if they require it.

Second, if it were me I'd remove the `status_text` field on the `Response` object. Custom status text is a terrible misfeature (especially as HTTP/2 doesn't support it), and in 99% of cases you're just wasting data by repeatedly sending the default phrase that the server already knows.

Third, you're currently sending header fields with unicode names and byte string values. That's understandable, but I wonder if it's worthwhile trying to limit the behaviour of compliant servers in encoding/decoding those header fields. For example, you could assert that the unicode header names will always use the Latin-1 codec when encoding/decoding. This is mostly me being paranoid about poorly written apps/servers issuing bad bytes onto the network. I should note that RFC 7230 strictly limits header names to US-ASCII, but Latin-1 would be the defensive choice against already-badly-written apps.
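
A small sketch of what such a rule could look like in practice, assuming Python 3. The helper names `encode_header_name` and `decode_header_name` are hypothetical; decoding is lenient (Latin-1 maps every byte) while encoding is strict about the US-ASCII limit.

```python
def encode_header_name(name):
    """Encode a unicode header name for the wire, rejecting non-ASCII input."""
    raw = name.encode("latin-1")  # raises UnicodeEncodeError above U+00FF
    if any(byte > 0x7F for byte in raw):
        # RFC 7230 restricts field names to US-ASCII characters.
        raise ValueError("header name is not US-ASCII: %r" % name)
    return raw


def decode_header_name(raw):
    """Decode wire bytes to unicode; Latin-1 maps every byte, so this never fails."""
    return raw.decode("latin-1")
```

With this split, a badly written peer sending `b"X-\xff"` still decodes to something the application can inspect, but the server never emits such a name itself.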

Your section on server push is great, whoever wrote that is clearly a genius. ;)

You define web socket data frames with an incrementing counter from zero, but also note that the maximum integer size is Python's sys.maxint (you actually aren't that clear about it, which might be a good idea). While this is *probably* not a problem, you may want to note that really long running or active web socket connections are at risk of exhausting the "order" counter, and define a behaviour if that happens.

Otherwise, this is an interesting specification. I'm certainly open to helping push it through the PEP process if you'd like assistance with that.

Cory

From ionel.mc at gmail.com  Thu Mar 10 09:46:39 2016
From: ionel.mc at gmail.com (Ionel Maries Cristian)
Date: Thu, 10 Mar 2016 16:46:39 +0200
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
Message-ID: <CANkHFr9emQ9ZKrb=QV50Vs6osdeF9_HF4Kg=9A09773Z_mM+3g@mail.gmail.com>

Hey,

On Thu, Mar 10, 2016 at 2:34 AM, Andrew Godwin <andrew at aeracode.org> wrote:

> Helpful quick Q&A: http://channels.readthedocs.org/en/latest/inshort.html
>

I have looked over that and it's not very clear what goes where. [1] I'd be
inclined to understand that the process type "that handles HTTP and
WebSockets" would be some sort of specialized proxy service that does the
websocket routing, proxying plain requests to the worker (for the regular
views) and specific frontend protocol handling (upgrading to
http2.0/websockets or whatever).
It would be more clear if the docs would include some diagrams illustrating
data flow and how all the components connect together with what protocols.

Shouldn't the process type "that handles HTTP and WebSockets" have a more
specific term? It's a bit long to type.

> Spec is up here: http://channels.readthedocs.org/en/latest/asgi.html
>

Is ASGI a wire protocol? I'd assume it is, if multiple processes
communicate to each other with this protocol, but the docs don't have any
details about the exact wire format.

Also, some comparison to existing solutions (like Meteor/SockJS/Crossbar
<http://crossbar.io/>/WAMP <http://wamp-proto.org/>) would help clear up
lots of questions.

[1] Sorry if it sounds harsh, certainly not the intention. I'm just a bit
confused/overwhelmed.

Thanks,
-- Ionel Cristian Mărieș, http://blog.ionelmc.ro

From andrew at aeracode.org  Thu Mar 10 13:36:54 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Thu, 10 Mar 2016 10:36:54 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
Message-ID: <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>

On Thu, Mar 10, 2016 at 1:59 AM, Cory Benfield <cory at lukasa.co.uk> wrote:

>
> > On 10 Mar 2016, at 00:34, Andrew Godwin <andrew at aeracode.org> wrote:
> >
> > To that end, I did some work to make the underlying mechanism Django
> Channels uses into more of a standard, which I have codenamed ASGI; while
> initially I intended for it to be a Django documented API, as I've gone
> further with the project I've come to believe it could be useful to the
> Python community at large.
> >
>
> Andrew,
>
> Thanks for this work! I've provided some proposed changes as pull requests
> against the channels repository. I'll ignore those for the rest of the
> email: we can discuss them on GitHub.
>
> I also have a few more general notes. I didn't make PRs for these, mostly
> because they're too "vague" as feedback goes to be concretely handled by me.
>
> First, your HTTP section has request headers serialized to a dict and
> response headers serialized to a list of tuples. I'm not sure how I feel
> about that asymmetry: it might be cleaner just to use lists-of-tuples in
> both places and allow application frameworks to handle translation to
> dictionary if they require it.
>

I think you're right, and I've just been stubbornly trying to use a dict as
it's slightly "nicer". I honestly considered making both sides dict and
cookies the separate thing as they're the only special case, but I suspect
that multiple headers are one of those things that might turn out to be
useful for some broken client/new feature someday.


>
> Second, if it were me I'd remove the `status_text` field on the `Response`
> object. Custom status text is a terrible misfeature (especially as HTTP/2
> doesn't support it), and in 99% of cases you're just wasting data by
> repeatedly sending the default phrase that the server already knows.
>

Well, it IS optional; you only need to send it if you're changing it from
the default or providing an unusual new value (e.g. 418). We could change
the spec to say servers don't have to abide by it, too. I have done a
project in the past with custom reason phrases, that's all :)


>
> Third, you're currently sending header fields with unicode names and byte
> string values. That's understandable, but I wonder if it's worthwhile
> trying to limit the behaviour of compliant servers in encoding/decoding
> those header fields. For example, you could assert that the unicode header
> names will always use the Latin-1 codec when encoding/decoding. This is
> mostly me being paranoid about poorly written apps/servers issuing bad
> bytes onto the network. I should note that RFC 7230 strictly limits header
> names to US-ASCII, but Latin-1 would be the defensive choice against
> already-badly-written apps.
>

Yes, it's perhaps an unwritten understanding that they're meant to be
encoded/decoded only to latin1, and I believe this is what Daphne does;
they're unicode mostly as that makes keying into the header dictionary much
nicer in py3/unicode_literals land, and because there's a clear way to
encode them.


>
> Your section on server push is great, whoever wrote that is clearly a
> genius. ;)
>
> You define web socket data frames with an incrementing counter from zero,
> but also note that the maximum integer size is Python's sys.maxint (you
> actually aren't that clear about it, which might be a good idea). While
> this is *probably* not a problem, you may want to note that really long
> running or active web socket connections are at risk of exhausting the
> "order" counter, and define a behaviour if that happens.
>

Ah, good catch. I'll specify a very high maximum order number for any
protocol and say it rolls over to 0 for the next one, and then I can modify
channels' global_ordering to expect that - I think that's the most sensible
approach here.
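
A minimal sketch of that rollover rule. The ceiling `MAX_ORDER` is an assumed value chosen only for illustration (no number has been fixed yet), and both function names are hypothetical.

```python
MAX_ORDER = 2 ** 63 - 1  # assumed ceiling; no value has been agreed


def next_order(current):
    """Return the order number that follows `current`, wrapping to 0."""
    return 0 if current >= MAX_ORDER else current + 1


def is_next_in_sequence(previous, received):
    """Check that `received` directly follows `previous`, allowing rollover."""
    return received == next_order(previous)
```

A consumer enforcing ordering would compare against `next_order(previous)` rather than `previous + 1`, so a frame numbered 0 arriving after `MAX_ORDER` is accepted as in sequence.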


>
> Otherwise, this is an interesting specification. I'm certainly open to
> helping push it through the PEP process if you'd like assistance with that.
>
>
If we see some rough agreement on it, yes, I would love some help with that.

Andrew

From chris.dent at gmail.com  Thu Mar 10 13:57:14 2016
From: chris.dent at gmail.com (chris.dent at gmail.com)
Date: Thu, 10 Mar 2016 18:57:14 +0000 (GMT)
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
 <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
Message-ID: <alpine.OSX.2.20.1603101852410.60141@shine.home>

On Thu, 10 Mar 2016, Andrew Godwin wrote:

> I think you're right, and I've just been stubbornly trying to use a dict as
> it's slightly "nicer". I honestly considered making both sides dict and
> cookies the separate thing as they're the only special case, but I suspect
> that multiple headers are one of those things that might turn out to be
> useful for some broken client/new feature someday.

It sounds like you consider multiple headers of the same name in
request and response as some kind of bug or fault. It's not; it is
perfectly legit and something I want to be able to do in my webby
frameworks. Vary is the main one.

I know that I can join on ',' in a single header when it is
represented in a dict but "meh".

I totally agree that dicts are much nicer to work with, so I'm not
sure what the ideal solution is, but I just wanted to raise that
point about multiple headers. As you were. Carry on. etc.
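
To illustrate the trade-off with made-up header values: converting a list of tuples straight to a dict silently keeps only the last value of a repeated name, whereas joining on "," first keeps them all. Lower-case names and byte-string values are assumptions here.

```python
headers = [
    ("vary", b"Accept-Encoding"),
    ("vary", b"Cookie"),
    ("content-type", b"text/plain"),
]

# Naive conversion keeps only the last value for a repeated name.
lossy = dict(headers)

# Joining on "," first preserves every value in a single entry.
joined = {}
for name, value in headers:
    if name in joined:
        joined[name] = joined[name] + b", " + value
    else:
        joined[name] = value
```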

-- 
Chris Dent                                   http://burningchrome.com/
                                 [...]

From andrew at aeracode.org  Thu Mar 10 14:32:46 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Thu, 10 Mar 2016 11:32:46 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <alpine.OSX.2.20.1603101852410.60141@shine.home>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
 <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
 <alpine.OSX.2.20.1603101852410.60141@shine.home>
Message-ID: <CAFwN1uo8zZeenQ1LLpSdJ89JzJQPomLQK9F2POiTuGq9kx2wvw@mail.gmail.com>

On Thu, Mar 10, 2016 at 10:57 AM, <chris.dent at gmail.com> wrote:

> On Thu, 10 Mar 2016, Andrew Godwin wrote:
>
>> I think you're right, and I've just been stubbornly trying to use a dict as
>> it's slightly "nicer". I honestly considered making both sides dict and
>> cookies the separate thing as they're the only special case, but I suspect
>> that multiple headers are one of those things that might turn out to be
>> useful for some broken client/new feature someday.
>>
>
> It sounds like you consider multiple headers of the same name in
> request and response as some kind of bug or fault. It's not; it is
> perfectly legit and something I want to be able to do in my webby
> frameworks. Vary is the main one.
>
> I know that I can join on ',' in a single header when it is
> represented in a dict but "meh".
>

Well, the protocol server would be the thing that's doing the joining if it
sees multiple headers - you'd always see comma-joined headers from clients
as an ASGI application, which I like as I like consistency.


>
> I totally agree that dicts are much nicer to work with, so I'm not
> sure what the ideal solution is, but I just wanted to raise that
> point about multiple headers. As you were. Carry on. etc.


Yeah, I find the whole comma thing a bit weird, and sort of wonder if it's
actually a workable thing for all HTTP clients. I hope it is.

Andrew

From robertc at robertcollins.net  Thu Mar 10 16:27:19 2016
From: robertc at robertcollins.net (Robert Collins)
Date: Fri, 11 Mar 2016 10:27:19 +1300
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1uo8zZeenQ1LLpSdJ89JzJQPomLQK9F2POiTuGq9kx2wvw@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
 <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
 <alpine.OSX.2.20.1603101852410.60141@shine.home>
 <CAFwN1uo8zZeenQ1LLpSdJ89JzJQPomLQK9F2POiTuGq9kx2wvw@mail.gmail.com>
Message-ID: <CAJ3HoZ2f2TYU+yef8XtYaTkwc9bYn=6EkwrkdpBx9rfQ0bFnag@mail.gmail.com>

On 11 March 2016 at 08:32, Andrew Godwin <andrew at aeracode.org> wrote:
>
>
>
>
> Well, the protocol server would be the thing that's doing the joining if it
> sees multiple headers - you'd always see comma-joined headers from clients
> as an ASGI application, which I like as I like consistency.

For consistency, why not a dict unicode -> List[bytes]

?
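
A sketch of how a server might build that shape from the raw header pairs. The function name `group_headers` is hypothetical, and lower-casing the names is an assumption rather than something the spec is known to require.

```python
from collections import defaultdict


def group_headers(pairs):
    """Group (name, value) pairs into a dict of name -> list of values."""
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name.lower()].append(value)
    return dict(grouped)
```

Every value is then a list, so applications never have to check whether a header arrived once or several times.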

-Rob

-- 
Robert Collins <rbtcollins at hpe.com>
Distinguished Technologist
HP Converged Cloud

From robertc at robertcollins.net  Thu Mar 10 16:30:01 2016
From: robertc at robertcollins.net (Robert Collins)
Date: Fri, 11 Mar 2016 10:30:01 +1300
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
Message-ID: <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>

On 10 March 2016 at 13:34, Andrew Godwin <andrew at aeracode.org> wrote:
> Hi all,
>
> As some of you may know, I've been working over the past few months to bring
> native WebSocket support to Django, via a project codenamed "Django
> Channels" - this is mostly the reason I've been involved in recent WSGI
> discussions.
>
> I'm personally of the opinion that WSGI works well for HTTP, with a few
> improvements we can roll into a 1.1, but that we also need something else
> that can support WebSockets and other future web protocols (e.g. WebRTC
> components).
>
> To that end, I did some work to make the underlying mechanism Django
> Channels uses into more of a standard, which I have codenamed ASGI; while
> initially I intended for it to be a Django documented API, as I've gone
> further with the project I've come to believe it could be useful to the
> Python community at large.


I realise this may sound bikesheddy, but it would be really good to
not call it ASGI. From your docs "
Despite the name of the proposal, ASGI does not specify or design to
any specific in-process async solution, such as asyncio, twisted, or
gevent. Instead, the receive_many function can be switched between
nonblocking or synchronous. This approach allows applications to
choose what's best for their current runtime environment; further
improvements may provide extensions where cooperative versions of
receive_many are provided."

I'm worried that folk will assume a parallel between ASGI and asyncio,
but there appears to be none... which is only a problem due to the
room for confusion.

-Rob

-- 
Robert Collins <rbtcollins at hpe.com>
Distinguished Technologist
HP Converged Cloud

From andrew at aeracode.org  Thu Mar 10 16:34:29 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Thu, 10 Mar 2016 13:34:29 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
Message-ID: <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>

>
>
>
> I realise this may sound bikesheddy, but it would be really good to
> not call it ASGI. From your docs "
> Despite the name of the proposal, ASGI does not specify or design to
> any specific in-process async solution, such as asyncio, twisted, or
> gevent. Instead, the receive_many function can be switched between
> nonblocking or synchronous. This approach allows applications to
> choose what's best for their current runtime environment; further
> improvements may provide extensions where cooperative versions of
> receive_many are provided."
>
> I'm worried that folk will assume a parallel between ASGI and asyncio,
> but there appears to be none... which is only a problem due to the
> room for confusion.


Better names are welcome, but I quite like ASGI's similarity to WSGI, and
the fact it's pronounceable as a single word. The "Asynchronous" part
covers the way the whole system operates; async is already an overloaded
term, and while there might be initial confusion, I think "async" also has
strong associations with the sort of problems ASGI solves (like
websockets), which I think is useful.

>  For consistency, why not a dict unicode -> List[bytes]

I personally think this is worse than a list of tuples (which you can at
least feed straight into dict()) - the only header that comes through as
multiple, ever, is Set-Cookie, after all.

Andrew

From robertc at robertcollins.net  Thu Mar 10 17:07:48 2016
From: robertc at robertcollins.net (Robert Collins)
Date: Fri, 11 Mar 2016 11:07:48 +1300
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
Message-ID: <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>

On 11 March 2016 at 10:34, Andrew Godwin <andrew at aeracode.org> wrote:
>>
>>
>> I realise this may sound bikesheddy, but it would be really good to
>> not call it ASGI. From your docs "
>> Despite the name of the proposal, ASGI does not specify or design to
>> any specific in-process async solution, such as asyncio, twisted, or
>> gevent. Instead, the receive_many function can be switched between
>> nonblocking or synchronous. This approach allows applications to
>> choose what's best for their current runtime environment; further
>> improvements may provide extensions where cooperative versions of
>> receive_many are provided."
>>
>> I'm worried that folk will assume a parallel between ASGI and asyncio,
>> but there appears to be none... which is only a problem due to the
>> room for confusion.
>
>
> Better names are welcome, but I quite like ASGI's similarity to WSGI, and
> the fact it's pronounceable as a single word. The "Asynchronous" part covers
> the way the whole system operates; async is already an overloaded term, and
> while there might be initial confusion, I think "async" also has strong
> associations with the sort of problems ASGI solves (like websockets), which
> I think is useful.

Perhaps that's a particularly browser-centric view? There's nothing
that strongly associates TCP with Python's slant on 'async' for me -
interfaces on top of message passing can be sync or async - as in fact
the switch you've got demonstrates :).

Other names?

quick thoughts...
WSGP (web services gateway protocol)
MuPGI (multiple protocol gateway interface)


>>  For consistency, why not a dict unicode -> List[bytes]
>
> I personally think this is worse than a list of tuples (which you can at
> least feed straight into dict()) - the only header that comes through as
> multiple, ever, is Set-Cookie, after all.

I think you're wrong about that 'only header' statement.

rfc 7230 3.2.2 permits multiple header fields with the same field name
for all field values defined as comma separated lists, and for
set-cookie.

So  you can't feed it straight into dict, unless you place a
requirement on the server to always fold together multiple header
fields with the same field name.... and clients to not use that
either. Oh, and special case Set-cookie.
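
A sketch of the folding requirement described above, assuming Python 3.7+ (for ordered dicts), lower-cased names, and byte-string values. The function name `fold_headers` is hypothetical.

```python
def fold_headers(pairs):
    """Fold repeated header fields into one comma-joined value per name.

    Set-Cookie is left unfolded, since its values cannot be safely joined
    with commas. Returns (name, value) tuples: folded fields in first-seen
    order, followed by any Set-Cookie fields.
    """
    folded = {}
    cookies = []
    for name, value in pairs:
        key = name.lower()
        if key == "set-cookie":
            cookies.append((key, value))
        elif key in folded:
            folded[key] = folded[key] + b", " + value
        else:
            folded[key] = value
    return list(folded.items()) + cookies
```

Even after folding, the result only fits in a plain dict when at most one Set-Cookie field is present, which is the special case in question.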

-Rob


-- 
Robert Collins <rbtcollins at hpe.com>
Distinguished Technologist
HP Converged Cloud

From andrew at aeracode.org  Thu Mar 10 18:56:00 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Thu, 10 Mar 2016 15:56:00 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
Message-ID: <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>

On Thu, Mar 10, 2016 at 2:07 PM, Robert Collins <robertc at robertcollins.net>
wrote:

> On 11 March 2016 at 10:34, Andrew Godwin <andrew at aeracode.org> wrote:
> >>
> >>
> >> I realise this may sound bikesheddy, but it would be really good to
> >> not call it ASGI. From your docs "
> >> Despite the name of the proposal, ASGI does not specify or design to
> >> any specific in-process async solution, such as asyncio, twisted, or
> >> gevent. Instead, the receive_many function can be switched between
> >> nonblocking or synchronous. This approach allows applications to
> >> choose what's best for their current runtime environment; further
> >> improvements may provide extensions where cooperative versions of
> >> receive_many are provided."
> >>
> >> I'm worried that folk will assume a parallel between ASGI and asyncio,
> >> but there appears to be none... which is only a problem due to the
> >> room for confusion.
> >
> >
> > Better names are welcome, but I quite like ASGI's similarity to WSGI, and
> > the fact it's pronounceable as a single word. The "Asynchronous" part
> > covers the way the whole system operates; async is already an overloaded
> > term, and while there might be initial confusion, I think "async" also has
> > strong associations with the sort of problems ASGI solves (like
> > websockets), which I think is useful.
>
> Perhaps that's a particularly browser-centric view? There's nothing
> that strongly associates TCP with Python's slant on 'async' for me -
> interfaces on top of message passing can be sync or async - as in fact
> the switch you've got demonstrates :).
>
> Other names?
>
> quick thoughts...
> WSGP (web services gateway protocol)
> MuPGI (multiple protocol gateway interface)


Maybe, but this is specifically oriented as a web-based protocol - I'm not
proposing to replace all network processing here - and in that context,
"async" largely means "I can do things outside a normal request-response
process".

I guess it would take a lot for me to change the name at this point, as
it's already in so many places, but I do see your point.


>
>
>
> >>  For consistency, why not a dict unicode -> List[bytes]
> >
> > I personally think this is worse than a list of tuples (which you can at
> > least feed straight into dict()) - the only header that comes through as
> > multiple, ever, is Set-Cookie, after all.
>
> I think you're wrong about that 'only header' statement.
>
> rfc 7230 3.2.2 permits multiple header fields with the same field name
> for all field values defined as comma separated lists, and for
> set-cookie.
>
> So  you can't feed it straight into dict, unless you place a
> requirement on the server to always fold together multiple header
> fields with the same field name.... and clients to not use that
> either. Oh, and special case Set-cookie.
>

I would indeed want to require servers to always fold headers together into
a comma-separated list, as that's what the RFC says, and it then means
applications only have to deal with one kind of multi-header!

Set-cookie is the annoying thing here, though. That's why it's dict inbound
and list of tuples outbound right now, and I just don't know if I want to
make the inbound one a list of tuples too, given I do definitely want to
force servers to concat headers together (unless I find any examples of
that screwing things up)

Andrew

From cory at lukasa.co.uk  Fri Mar 11 05:28:35 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 11 Mar 2016 10:28:35 +0000
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
 <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
Message-ID: <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>


> On 10 Mar 2016, at 23:56, Andrew Godwin <andrew at aeracode.org> wrote:
> 
> I would indeed want to require servers to always fold headers together into a comma-separated list, as that's what the RFC says, and it then means applications only have to deal with one kind of multi-header!

Wellllll... kinda?

The RFC says that multiple headers are *semantically equivalent* to the joined form, but does not in any sense require that it be done. (The normative language in RFC 7230 is MAY.)

I had this discussion recently with Brian Smith: while there is only one correct way to fold/unfold headers, anywhere on the spectrum between completely folded and completely unfolded is a perfectly valid representation of the HTTP header block. This means that there's no *rules* about how a server is supposed to do it, at least from the IETF. ASGI is of course totally allowed to add its own rules, and requiring that they be folded is not terrible.

FWIW, in my experience, I've found that "list of tuples" is really the most likely to be correct way to represent a header block, because it provides some assurances to the user that the header block has not been aggressively transformed from how it was sent on the wire. While the *rules* are that the folded representation is supposed to be semantically equivalent to the unfolded representation, there is nonetheless some information implicit in those headers being separate.

My intuition when writing this kind of thing is to pass applications (like Django) the most meaningful representation I can, and then allow the application to make its own decisions about what meaning they?re willing to lose. That?s why I?d advocate for ?list of two-tuples of bytestrings? as the representation. However, I don?t think there?s anything *wrong* with forcing the headers to be joined by the server where possible: it?s just not how I?d do it. ;)

> Set-cookie is the annoying thing here, though. That's why it's dict inbound and list of tuples outbound right now, and I just don't know if I want to make the inbound one a list of tuples too, given I do definitely want to force servers to concat headers together (unless I find any examples of that screwing things up)

You could make the inbound one a list of tuples but still require that the servers concat headers. The rule then would be that it needs to be possible for an application to say `dict(headers)` without any loss of meaning.
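
As a rough illustration of that rule, here is a minimal sketch of a server-side fold step. The function name `fold_headers` is hypothetical and not part of any spec; lowercasing the names while matching is an assumption made only so that `Accept` and `accept` collapse together, and headers that cannot safely be comma-joined are ignored here.

```python
from collections import OrderedDict


def fold_headers(raw_headers):
    """Join repeated header names into one comma-separated value.

    raw_headers is a list of (name, value) two-tuples of bytestrings in
    wire order. The result is a list of two-tuples in which each name
    appears exactly once, ordered by first appearance.
    """
    folded = OrderedDict()
    for name, value in raw_headers:
        key = name.lower()  # assumption: match names case-insensitively
        if key in folded:
            folded[key] = folded[key] + b", " + value
        else:
            folded[key] = value
    return list(folded.items())


raw = [
    (b"Accept", b"text/html"),
    (b"X-Forwarded-For", b"10.0.0.1"),
    (b"accept", b"application/json"),
]
headers = fold_headers(raw)

# Because every name is now unique, converting to a dict drops nothing.
as_dict = dict(headers)
```

The application still receives a list of tuples, so a later version of the spec could relax the folding requirement without changing the data type.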

Cory

From cory at lukasa.co.uk  Fri Mar 11 05:28:36 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 11 Mar 2016 10:28:36 +0000
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
 <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
Message-ID: <FF378BB0-1845-4136-A265-FE305FCABC99@lukasa.co.uk>


> On 10 Mar 2016, at 18:36, Andrew Godwin <andrew at aeracode.org> wrote:
> 
> 
> Second, if it were me I'd remove the `status_text` field on the `Response` object. Custom status text is a terrible misfeature (especially as HTTP/2 doesn't support it), and in 99% of cases you're just wasting data by repeatedly sending the default phrase that the server already knows.
> 
> Well, it IS optional; you only need to send it if you're changing it from the default or providing an unusual new value (e.g. 418). We could change the spec to say servers don't have to abide by it, too. I have done a project in the past with custom reason phrases, that's all :)

You monster! ;)

For what it's worth, I object to the use of reason phrases because, as with all things in HTTP, they were far too broadly specified. The rules for parsing the reason phrase are super broad: the reason phrase allows \t, space, and then all bytes from 0x21 to 0xFF *excluding* 0x7F (ASCII DEL). This means that it's sometimes possible to encode a reason phrase containing non-ASCII/non-Latin-1 codepoints in UTF-8 (I've seen this happen), and then everything gets really terrible really fast.
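
To make the byte ranges concrete, here is a small sketch of a checker for those rules. The function name `is_valid_reason_phrase` is made up for illustration; it only encodes the ranges described above (tab, space, 0x21 to 0x7E, and 0x80 to 0xFF).

```python
def is_valid_reason_phrase(phrase):
    """Return True if every byte is HTAB, SP, 0x21-0x7E, or 0x80-0xFF."""
    return all(
        b in (0x09, 0x20) or 0x21 <= b <= 0x7E or 0x80 <= b <= 0xFF
        for b in phrase  # iterating over bytes yields ints on Python 3
    )


plain = b"Not Found"

# A snowman (U+2603) is not representable in Latin-1, but its UTF-8
# encoding consists only of bytes >= 0x80, so the grammar accepts it.
snowman = "\u2603 Not Found".encode("utf-8")

# A receiver that assumes Latin-1 decodes the same bytes to different text.
misread = snowman.decode("latin-1")
```

Both `plain` and `snowman` pass the check, which is the problem: the grammar cannot tell the receiver which encoding the sender had in mind.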

IMO, almost nothing would be lost by just quietly removing it from the specification. The only loss is in setting "unusual" values, and FWIW I think that's *also* unwise: if it can't be found here[0] then the unusual status code is nothing but vanity, because it's no more precise than the X00 version that already exists (no user agent can take action on it).

Again, just my 2¢.

Cory


[0]: https://www.iana.org/assignments/http-status-codes/http-status-codes.xhtml

From cmawebsite at gmail.com  Fri Mar 11 10:45:05 2016
From: cmawebsite at gmail.com (Collin Anderson)
Date: Fri, 11 Mar 2016 10:45:05 -0500
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <FF378BB0-1845-4136-A265-FE305FCABC99@lukasa.co.uk>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <D681E87B-4CD6-437D-8E56-CBFF90AB8231@lukasa.co.uk>
 <CAFwN1up_cK_QVtxkw29WOjUOsUx0RZjvo2iTEAo_OD_oYFMjHQ@mail.gmail.com>
 <FF378BB0-1845-4136-A265-FE305FCABC99@lukasa.co.uk>
Message-ID: <CAFO84S6wyOLdGM+1sBLdjko8B86BkuL0zbMyF1eoZHQjkPPK_Q@mail.gmail.com>

Just a thought from a non-wsgi developer: I think it might be smart to
follow http2 when in doubt on a question:

- http2 preserves header order and allows duplicates in both directions. A
list of tuples seems to be the best data structure IMHO.

- http2 ignores reason phrases, which makes me think discarding it wouldn't
be a problem for the new standard.


From andrew at aeracode.org  Fri Mar 11 12:56:22 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Fri, 11 Mar 2016 09:56:22 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
 <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
 <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>
Message-ID: <CAFwN1urDnMX1JJU_agadxq+mn9LA9fzSbkX_WjVD1veyu8UvAQ@mail.gmail.com>

On Fri, Mar 11, 2016 at 2:28 AM, Cory Benfield <cory at lukasa.co.uk> wrote:

>
> On 10 Mar 2016, at 23:56, Andrew Godwin <andrew at aeracode.org> wrote:
>
> I would indeed want to require servers to always fold headers together
> into a comma-separated list, as that's what the RFC says, and it then means
> applications only have to deal with one kind of multi-header!
>
>
> Wellllll... kinda?
>
> The RFC says that multiple headers are *semantically equivalent* to the
> joined form, but does not in any sense require that it be done. (The
> normative language in RFC 7230 is MAY.)
>
> I had this discussion recently with Brian Smith: while there is only one
> correct way to fold/unfold headers, anywhere on the spectrum between
> completely folded and completely unfolded is a perfectly valid
> representation of the HTTP header block. This means that there's no *rules*
> about how a server is supposed to do it, at least from the IETF. ASGI is of
> course totally allowed to add its own rules, and requiring that they be
> folded is not terrible.
>
> FWIW, in my experience, I've found that "list of tuples" is really the
> most likely to be correct way to represent a header block, because it
> provides some assurances to the user that the header block has not been
> aggressively transformed from how it was sent on the wire. While the
> *rules* are that the folded representation is supposed to be semantically
> equivalent to the unfolded representation, there is nonetheless some
> information implicit in those headers being separate.
>
> My intuition when writing this kind of thing is to pass applications (like
> Django) the most meaningful representation I can, and then allow the
> application to make its own decisions about what meaning they're willing to
> lose. That's why I'd advocate for "list of two-tuples of bytestrings" as
> the representation. However, I don't think there's anything *wrong* with
> forcing the headers to be joined by the server where possible: it's just
> not how I'd do it. ;)
>
> Set-cookie is the annoying thing here, though. That's why it's dict
> inbound and list of tuples outbound right now, and I just don't know if I
> want to make the inbound one a list of tuples too, given I do definitely
> want to force servers to concat headers together (unless I find any
> examples of that screwing things up)
>
>
> You could make the inbound one a list of tuples but still require that the
> servers concat headers. The rule then would be that it needs to be possible
> for an application to say `dict(headers)` without any loss of meaning.
>

Yes, I think this is a good argument - my worry has always been that the
"no multiples" is more of a soft rule that some clients might break or some
apps might rely on the ordering/multiplicity of things, so preserving it is
_probably_ helpful (and as you say, it lets the header names go back to
bytestrings).

I'll modify the spec and then update Daphne and Channels to match; I can
leave Channels parsing both types for a bit, at least.

Collin's point about http2's handling of headers is on point, too - if the
new spec is deliberately thinned down to that point but no further, it's
probably wise to follow them since they know much more about it than I do.

Andrew

From andrew at aeracode.org  Fri Mar 11 12:59:59 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Fri, 11 Mar 2016 09:59:59 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1urDnMX1JJU_agadxq+mn9LA9fzSbkX_WjVD1veyu8UvAQ@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
 <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
 <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>
 <CAFwN1urDnMX1JJU_agadxq+mn9LA9fzSbkX_WjVD1veyu8UvAQ@mail.gmail.com>
Message-ID: <CAFwN1uoobJnHCEMOn7cgaKy=6dbjQXWwyc5E9ALVmj7brg50Zg@mail.gmail.com>

One thing I did want to ask - is it worth still squashing everything down
to the same case? Daphne already clears out headers with _ in them to avoid
that CVE about it, and header case is never semantic, or so I thought?

Andrew


From cmawebsite at gmail.com  Fri Mar 11 13:03:35 2016
From: cmawebsite at gmail.com (Collin Anderson)
Date: Fri, 11 Mar 2016 13:03:35 -0500
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFwN1uoobJnHCEMOn7cgaKy=6dbjQXWwyc5E9ALVmj7brg50Zg@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
 <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
 <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>
 <CAFwN1urDnMX1JJU_agadxq+mn9LA9fzSbkX_WjVD1veyu8UvAQ@mail.gmail.com>
 <CAFwN1uoobJnHCEMOn7cgaKy=6dbjQXWwyc5E9ALVmj7brg50Zg@mail.gmail.com>
Message-ID: <CAFO84S4WhX_iJTFiAw+vVMbaZyn95jNEcUfRMkSX3G1aCSKWUw@mail.gmail.com>

http2 makes all header names lowercase


From andrew at aeracode.org  Fri Mar 11 13:05:07 2016
From: andrew at aeracode.org (Andrew Godwin)
Date: Fri, 11 Mar 2016 10:05:07 -0800
Subject: [Web-SIG] Inviting feedback on my proposed "ASGI" spec
In-Reply-To: <CAFO84S4WhX_iJTFiAw+vVMbaZyn95jNEcUfRMkSX3G1aCSKWUw@mail.gmail.com>
References: <CAFwN1uoAC0ROPhz4EX_+uMeQCyUvgG7-qpAFUO7C-kimS9ZtNg@mail.gmail.com>
 <CAJ3HoZ323HUDcjVGRfLtKpO9ehyrnr3bvG1F2zbjdBk_Vubtpw@mail.gmail.com>
 <CAFwN1up5M6RAGEtstmDj=CiQT4mhgtpmcJ32xHFp-R7PccsZOg@mail.gmail.com>
 <CAJ3HoZ21oSLfW9FPjiRVo7YppgkhuaERMivpKvjt+wMaSNV8Jw@mail.gmail.com>
 <CAFwN1urEhGtZeghOwTDi_ZntGr6SQA75d02nbPZ1UXY4FN=1uQ@mail.gmail.com>
 <C23EE138-342E-46EF-8B70-85784CBAAB7F@lukasa.co.uk>
 <CAFwN1urDnMX1JJU_agadxq+mn9LA9fzSbkX_WjVD1veyu8UvAQ@mail.gmail.com>
 <CAFwN1uoobJnHCEMOn7cgaKy=6dbjQXWwyc5E9ALVmj7brg50Zg@mail.gmail.com>
 <CAFO84S4WhX_iJTFiAw+vVMbaZyn95jNEcUfRMkSX3G1aCSKWUw@mail.gmail.com>
Message-ID: <CAFwN1uprrFRrs7MT034Ln0awyxJU5xZkfvdZ_cpgzPUVGgvsFA@mail.gmail.com>

Yes, I thought that was the case. I think adding lowercase normalisation of
header names to the spec would be sensible (daphne already does this, but
I'd like applications to be able to rely on it).
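
A minimal sketch of what that normalisation step could look like in a server follows. This is not Daphne's actual code; `normalise_headers` is a hypothetical name. Dropping names that contain an underscore reflects the behaviour mentioned earlier in the thread, on the assumption that the concern is a header such as `X_Auth_User` being mistaken for `X-Auth-User` once names are mapped to CGI-style keys.

```python
def normalise_headers(raw_headers):
    """Lowercase header names and drop any name containing an underscore.

    raw_headers is a list of (name, value) two-tuples of bytestrings.
    Order and duplicates are preserved for the headers that remain.
    """
    cleaned = []
    for name, value in raw_headers:
        if b"_" in name:
            # Skip names that could collide with a hyphenated header
            # after CGI-style mangling.
            continue
        cleaned.append((name.lower(), value))
    return cleaned


incoming = [
    (b"Content-Type", b"text/plain"),
    (b"X_Auth_User", b"admin"),
    (b"X-Auth-User", b"alice"),
]
normalised = normalise_headers(incoming)
```

With names guaranteed lowercase, an application can look up `b"content-type"` directly instead of doing its own case-insensitive search.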

Andrew


From jason.madden at nextthought.com  Thu Mar 24 11:18:04 2016
From: jason.madden at nextthought.com (Jason Madden)
Date: Thu, 24 Mar 2016 10:18:04 -0500
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
Message-ID: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>

Hi all,


Is there any practical reason that the type of the `environ` object must be exactly `dict`, as specified in PEP 3333?

I'm asking because it was recently pointed out that gevent's WSGI server can sometimes print `environ` (in certain error cases), and that can lead to sensitive information being kept in the server's logs (e.g., HTTP_AUTHORIZATION, HTTP_COOKIE, maybe other things). The simplest and most flexible way to prevent this from happening, not just inadvertently within gevent itself but also for client applications, I thought, was to have `environ` be a subclass of `dict` with a customized `__repr__` (much like WebOb does for MultiDict, and repoze.who does for Identity, both for similar reasons).
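
For readers who want to see the shape of the idea, here is a minimal sketch. It is not the gevent implementation linked in [0]; the class name `MaskedEnviron`, the `<MASKED>` placeholder, and the particular set of sensitive keys are all assumptions chosen for illustration.

```python
class MaskedEnviron(dict):
    """A dict subclass whose repr hides the values of sensitive keys."""

    # Hypothetical default list; a real server would likely make this
    # configurable.
    SENSITIVE_KEYS = frozenset(["HTTP_AUTHORIZATION", "HTTP_COOKIE"])

    def __repr__(self):
        masked = {
            key: ("<MASKED>" if key in self.SENSITIVE_KEYS else value)
            for key, value in self.items()
        }
        return "MaskedEnviron(%r)" % (masked,)


environ = MaskedEnviron({
    "REQUEST_METHOD": "GET",
    "HTTP_COOKIE": "sessionid=secret",
})

# Logging or printing goes through __repr__, so the cookie is hidden...
logged = repr(environ)

# ...while normal item access still returns the real value.
cookie = environ["HTTP_COOKIE"]
```

Only the representation changes; lookups, iteration, and mutation behave exactly as they do for a plain `dict`.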

Unfortunately, when I implemented that in [0], I discovered that `wsgiref.validate.validator` asserts that type(environ) is dict. I looked up the PEP, and sure enough, PEP 3333 states that environ "must be a builtin Python dictionary (not a subclass, UserDict or other dictionary emulation)." [1]

Background/History
==================

That seemed overly restrictive to me, so I tried to backtrack the history of that language in hopes of discovering the rationale. 

- It was present in the predecessor of PEP 3333, PEP 0333, in the first version committed to the repository in August 2004. [2] 
- Prior to that, it was in both drafts of what would become PEP 0333 posted to this mailing list, again from August 2004: [3], [4].
- The ancestor of those drafts, the "Python Web Container Interface v1.0" was posted in December of 2003 with somewhat less restrictive language: "the environ object *must* be a Python dictionary....The rationale for requiring a dictionary is to maximize portability between containers" [5].

Now, the discussion on that earliest draft in [5] specifically brought up using other types that implement all the methods of a dictionary, like UserDict.DictMixin [6]. The last post on the subject in that thread seemed to be leaning towards accepting non-dict objects, at least if they were good enough [7].

By the time the draft became recognizable as the precursor to PEP 0333 in [3], the very strict language we have now was in place. That draft, however, specifically stated that it was intended to be compatible with Python 1.5.2. In Python 1.5.2, it wasn't possible to subclass the builtin dict, so imitations, like UserDict.DictMixin, were necessarily imprecise. The compatibility target was later changed to the much-maligned Python 2.2.2 release [8]; Python 2.2 added the ability to subclass dict, but the wording of the spec wasn't updated to match.

Today
=====

Given that today, we can subclass dict with full fidelity, is there still any practical reason not to be able to do so? I'm probably OK with gevent violating the letter of the spec in this regard, so long as there are no practical consequences. I was able to think of two possible objections, but both can be solved:

- Pickling the custom `environ` type and then loading it in another process might not work if the class is not available. I can imagine this coming up with Celery, for example. This is easily fixed by adding an appropriate `__reduce_ex__` implementation.

- Code somewhere relies on `if type(some_object) is dict:` (where `environ` became `some_object`, presumably through several levels of calls), instead of `isinstance(some_object, dict)` or `isinstance(some_object, collections.MutableMapping)`. The solution here is simply to not do that :) Pylint, among other linters, produces warnings if you do.
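
Both points can be sketched in a few lines. This is only an illustration under stated assumptions: `PicklableEnviron` is a hypothetical class, and having `__reduce_ex__` downgrade to a builtin `dict` is one possible choice (it loses the custom `__repr__` on the receiving side in exchange for not needing the class there).

```python
import pickle
from collections.abc import MutableMapping


class PicklableEnviron(dict):
    """A dict subclass that pickles as a plain builtin dict."""

    def __repr__(self):
        return "<PicklableEnviron with %d keys>" % len(self)

    def __reduce_ex__(self, protocol):
        # Tell pickle to rebuild the object by calling dict(<plain copy>),
        # so the unpickling process never needs to import this class.
        return (dict, (dict(self),))


environ = PicklableEnviron({"REQUEST_METHOD": "GET", "PATH_INFO": "/"})
restored = pickle.loads(pickle.dumps(environ))

# An exact type check rejects the subclass, while isinstance accepts it.
exact_check = type(environ) is dict
instance_check = isinstance(environ, dict)
mapping_check = isinstance(environ, MutableMapping)
```

Here `restored` is an ordinary `dict` with the same contents, and only `exact_check` is false for the original object.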

Can anyone think of any other practical reasons I've overlooked? Is this just a horrible idea for other reasons?

I appreciate any discussion!

Thanks,
Jason

[0] https://github.com/gevent/gevent/compare/secure-environ
[1] https://www.python.org/dev/peps/pep-3333/#specification-details
[2] https://github.com/python/peps/commit/d5864f018f58a35fa787492e6763e382f98b923c#diff-ff370d50af3db062b015d1ef85935779
[3] https://mail.python.org/pipermail/web-sig/2004-August/000518.html
[4] https://mail.python.org/pipermail/web-sig/2004-August/000562.html
[5] https://mail.python.org/pipermail/web-sig/2003-December/000394.html
[7] https://mail.python.org/pipermail/web-sig/2003-December/000401.html
[8] https://mail.python.org/pipermail/web-sig/2004-August/000565.html


From alan at xhaus.com  Thu Mar 24 12:09:20 2016
From: alan at xhaus.com (Alan Kennedy)
Date: Thu, 24 Mar 2016 16:09:20 +0000
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
In-Reply-To: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
References: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
Message-ID: <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>

I don't see this relevant message in your references.

https://mail.python.org/pipermail/web-sig/2004-September/000749.html

Perhaps that, and following messages, might shed more light?


From jason.madden at nextthought.com  Thu Mar 24 12:29:05 2016
From: jason.madden at nextthought.com (Jason Madden)
Date: Thu, 24 Mar 2016 11:29:05 -0500
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
In-Reply-To: <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>
References: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
 <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>
Message-ID: <7670B44D-E963-4D8C-A5E9-E057F4C775BE@nextthought.com>


> On Mar 24, 2016, at 11:09, Alan Kennedy <alan at xhaus.com> wrote:
> 
> I don't see this relevant message in your references.
> 
> https://mail.python.org/pipermail/web-sig/2004-September/000749.html
> 
> Perhaps that, and following messages, might shed more light?

Yes, thank you, I did miss that thread. It does help shed some light on the issue.

The two main arguments made seem to be that:

1) Creating subclasses of builtin objects is difficult and subject to breakage if you try to get too fancy.

That's a fair point, and in the context of when it was written (Python 3.0 was still under discussion) it makes a lot of sense.

2) Middleware or the app can do dict(environ) and lose your subclass.

Also true. But I think it's only particularly relevant if the WSGI implementation itself relies on the subclass to provide essential functionality that the PEP specifies (e.g., decoding bytes-to-str on key access).

It was also mentioned that practicality beats purity and no practical use for a subclass was known.

Well, here's a practical use :) And I don't think the two points above apply to it. (1) doesn't apply because `__repr__` is not going to change and isn't fancy. (2) doesn't apply because gevent keeps a reference to the environ it creates and passes to the app, so if middleware passes a new dict(environ) on to the app, gevent's own error handling is still secure. Consider passing a SecureEnviron to the app a best-effort attempt at secure-by-default: if the user configures their application such that this feature is disabled for part of the stack, that's on the application. No feature of gevent will break, and it's better than not having the option at all IMHO.
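
Point (2) can be sketched as follows. This is only an illustration under assumed names: `SecureEnviron` stands in for the subclass, and `copying_middleware` is a made-up middleware that passes `dict(environ)` downstream. The server-side reference keeps its masking repr even though the app receives a plain dict.

```python
class SecureEnviron(dict):
    """Hypothetical stand-in for the subclass discussed in this thread."""

    def __repr__(self):
        # Deliberately reveal nothing but the size.
        return "<SecureEnviron with %d keys>" % len(self)


def copying_middleware(app):
    # Illustrative middleware that hands a fresh plain dict to the app.
    def wrapper(environ, start_response):
        return app(dict(environ), start_response)
    return wrapper


seen_types = []


def app(environ, start_response):
    # Record what type actually reaches the application.
    seen_types.append(type(environ))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


# The "server" creates the environ and keeps its own reference to it.
server_environ = SecureEnviron(REQUEST_METHOD="GET", HTTP_COOKIE="session=secret")
body = copying_middleware(app)(server_environ, lambda status, headers: None)

# If the server logs its own reference on error, the masking repr still applies.
server_log_line = repr(server_environ)
```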

Jason

From cory at lukasa.co.uk  Fri Mar 25 06:01:51 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 25 Mar 2016 10:01:51 +0000
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
In-Reply-To: <7670B44D-E963-4D8C-A5E9-E057F4C775BE@nextthought.com>
References: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
 <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>
 <7670B44D-E963-4D8C-A5E9-E057F4C775BE@nextthought.com>
Message-ID: <70D88569-62CB-4EC8-A467-6890336B0D96@lukasa.co.uk>


> On 24 Mar 2016, at 16:29, Jason Madden <jason.madden at nextthought.com> wrote:
> Well, here's a practical use :) And I don't think the two points above apply to it. (1) doesn't apply because `__repr__` is not going to change and isn't fancy. (2) doesn't apply because gevent keeps a reference to the environ it creates and passes to the app, so if middleware passes a new dict(environ) on to the app, gevent's own error handling is still secure. Consider passing a SecureEnviron to the app a best-effort attempt at secure-by-default: if the user configures their application such that this feature is disabled for part of the stack, that's on the application. No feature of gevent will break, and it's better than not having the option at all IMHO.

Given that gevent is keeping hold of its own reference to the environ, why does gevent not simply wrap the environ dict in a class that implements this functionality directly? In that manner, gevent can expose its own error handling behaviour as desired, and continue to follow PEP-3333.

In fact, I believe this is exactly what PJ was getting at. The ability to subclass the dictionary (in this case, to subclass it with one that hides some keys on printing) is only useful to the entity that does the subclassing, because there is no guarantee that the subclass will not be lost somewhere else in the WSGI stack. However, if subclassing is only useful to you there is another alternative to the problem, which is to compose the environ dict into an object that applies the custom behaviour.

Because of that, I'm disinclined to want to widen the spec here. PJ's original analysis is right: allowing subclasses does not provide more utility than disallowing them, but it does allow more bugs to creep in due to inconsistent expectations. Better to have an object with a known set of behaviours and have applications/servers wrap it in custom function.

Cory


From jason.madden at nextthought.com  Fri Mar 25 11:04:24 2016
From: jason.madden at nextthought.com (Jason Madden)
Date: Fri, 25 Mar 2016 10:04:24 -0500
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
In-Reply-To: <70D88569-62CB-4EC8-A467-6890336B0D96@lukasa.co.uk>
References: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
 <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>
 <7670B44D-E963-4D8C-A5E9-E057F4C775BE@nextthought.com>
 <70D88569-62CB-4EC8-A467-6890336B0D96@lukasa.co.uk>
Message-ID: <E934A100-9B92-4E13-BA64-69B798E2EC9F@nextthought.com>


> On Mar 25, 2016, at 05:01, Cory Benfield <cory at lukasa.co.uk> wrote:
> 
> Given that gevent is keeping hold of its own reference to the environ, why does gevent not simply wrap the environ dict in a class that implements this functionality directly? In that manner, gevent can expose its own error handling behaviour as desired, and continue to follow PEP-3333.

I did consider that, but didn't want to do that unless there were actual practical problems passing the same object that gevent references. Making a copy just to pass to the application adds additional time and memory requirements that are always nice to avoid in a server. 

> In fact, I believe this is exactly what PJ was getting at. The ability to subclass the dictionary (in this case, to subclass it with one that hides some keys on printing) is only useful to the entity that does the subclassing, because there is no guarantee that the subclass will not be lost somewhere else in the WSGI stack.

I looked at most of the middleware listed on the WSGI homepage [1], as well as a decent sampling of the packages identified as middleware on PyPI [2]. I didn't find any that passed a new environ on to the next application; they all seem to simply pass on the environ object as given to them. Now that's just a sampling so obviously it doesn't mean that such copying doesn't happen. But doing so eliminates the ability for lower middlewares to communicate with upper middlewares through the environ if they are more than one layer separated---a real-world example is setting `paste.expected_exceptions`---so practically speaking, I imagine it's quite rare.
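
To make the communication argument concrete, here is a small sketch. All names are invented for the example, including the `example.note` key, which merely plays the role that a key like `paste.expected_exceptions` plays in real stacks. A note written by an inner layer is visible to the outer caller only when the same environ object is passed along.

```python
def inner_app(environ, start_response):
    # A lower layer leaves a note for the layers above it.
    environ["example.note"] = "set by inner app"
    start_response("200 OK", [])
    return [b"ok"]


def copying_middleware(app):
    def wrapper(environ, start_response):
        # Passing a copy: anything written downstream is lost to the caller.
        return app(dict(environ), start_response)
    return wrapper


def passthrough_middleware(app):
    def wrapper(environ, start_response):
        # Passing the same object: downstream writes remain visible.
        return app(environ, start_response)
    return wrapper


def run(stack):
    """Call the stack and report what the outermost caller can see afterwards."""
    environ = {"REQUEST_METHOD": "GET"}
    stack(environ, lambda status, headers: None)
    return environ.get("example.note")


note_with_copy = run(copying_middleware(inner_app))
note_with_passthrough = run(passthrough_middleware(inner_app))
```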

> Because of that, I'm disinclined to want to widen the spec here. PJ's original analysis is right: allowing subclasses does not provide more utility than disallowing them, but it does allow more bugs to creep in due to inconsistent expectations. Better to have an object with a known set of behaviours and have applications/servers wrap it in custom function.

I'm not sure I agree with that, but I can see the argument.

I started out by asking if there were any *practical* reasons not to pass a tiny dict subclass as environ, and when I was surveying existing middleware for this thread, I found a big reason: it turns out that WebOb's Request object *also* verifies that type(environ) is dict [3]. Given the popularity of WebOb and its derivatives like Pyramid, this is not a change gevent can make. We'll take a different approach.
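
The effect of such a strict check can be reproduced with the standard library alone. The sketch below uses `wsgiref.validate.validator`, which checks the exact type of `environ`, rather than WebOb itself; `SecureEnviron` is again a hypothetical stand-in.

```python
from wsgiref.validate import validator


class SecureEnviron(dict):
    """Hypothetical stand-in for the subclass discussed in this thread."""


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


try:
    # The validator checks the environ's exact type before calling the app.
    validator(app)(SecureEnviron(), lambda status, headers: None)
except AssertionError as exc:
    rejection = str(exc)
else:
    rejection = None
```

Any consumer that checks the type this way fails before application code runs, so a server cannot adopt the subclass unilaterally.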

Thanks again for all the great insights and discussion!

Jason


[1] http://wsgi.readthedocs.org/en/latest/libraries.html
[2] https://pypi.python.org/pypi?:action=browse&show=all&c=319&c=326&c=506&c=509
[3] https://github.com/Pylons/webob/blob/master/webob/request.py#L112

From cory at lukasa.co.uk  Fri Mar 25 13:23:35 2016
From: cory at lukasa.co.uk (Cory Benfield)
Date: Fri, 25 Mar 2016 17:23:35 +0000
Subject: [Web-SIG] Any practical reason type(environ) must be dict (not
 subclass)?
In-Reply-To: <E934A100-9B92-4E13-BA64-69B798E2EC9F@nextthought.com>
References: <CE9ECA2A-E64A-464F-A016-16FB524D2DCC@nextthought.com>
 <CAMte6h=j8yA7eDBpF-HMp3sUU__8NMDKHUpHeigGKXtAfw-aTg@mail.gmail.com>
 <7670B44D-E963-4D8C-A5E9-E057F4C775BE@nextthought.com>
 <70D88569-62CB-4EC8-A467-6890336B0D96@lukasa.co.uk>
 <E934A100-9B92-4E13-BA64-69B798E2EC9F@nextthought.com>
Message-ID: <EC3EDEFD-E1F3-4BC6-AA83-6A6508045DC1@lukasa.co.uk>


> On 25 Mar 2016, at 15:04, Jason Madden <jason.madden at nextthought.com> wrote:
> 
> 
>> On Mar 25, 2016, at 05:01, Cory Benfield <cory at lukasa.co.uk> wrote:
>> 
>> Given that gevent is keeping hold of its own reference to the environ, why does gevent not simply wrap the environ dict in a class that implements this functionality directly? In that manner, gevent can expose its own error handling behaviour as desired, and continue to follow PEP-3333.
> 
> I did consider that, but didn't want to do that unless there were actual practical problems passing the same object that gevent references. Making a copy just to pass to the application adds additional time and memory requirements that are always nice to avoid in a server.

For what it's worth, I'm not advocating a copy. I'm advocating a class like this:


import collections.abc

class SecureDictWrapper(collections.abc.MutableMapping):
    def __init__(self, environ):
        # Keep a reference to the caller's dict; nothing is copied.
        self._environ = environ

That class would then implement the MutableMapping API and delegate its calls through to the dictionary itself. There would still only be one dictionary: the only new allocation is for the wrapper class. The overhead is small. =)
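
A fuller version of that wrapper might look like the sketch below. The delegation methods are the five abstract methods `MutableMapping` requires; the `SENSITIVE_KEYS` set, the `<MASKED>` placeholder and the masking `__repr__` are assumptions added for illustration.

```python
from collections.abc import MutableMapping


class SecureDictWrapper(MutableMapping):
    """Wraps an environ dict without copying it and masks secrets when printed."""

    # Assumed set of keys to hide; not taken from any real server.
    SENSITIVE_KEYS = frozenset({"HTTP_AUTHORIZATION", "HTTP_COOKIE"})

    def __init__(self, environ):
        self._environ = environ  # reference only, no copy

    # The five abstract methods, each delegating to the wrapped dict.
    def __getitem__(self, key):
        return self._environ[key]

    def __setitem__(self, key, value):
        self._environ[key] = value

    def __delitem__(self, key):
        del self._environ[key]

    def __iter__(self):
        return iter(self._environ)

    def __len__(self):
        return len(self._environ)

    def __repr__(self):
        masked = {
            key: "<MASKED>" if key in self.SENSITIVE_KEYS else value
            for key, value in self._environ.items()
        }
        return "SecureDictWrapper(%r)" % (masked,)


# The application still receives the plain dict; the server logs the wrapper.
environ = {"REQUEST_METHOD": "GET", "HTTP_COOKIE": "session=secret"}
wrapper = SecureDictWrapper(environ)
wrapper["example.key"] = "written through the wrapper"
```

With this arrangement the server can pass `environ` itself to the application, keeping `type(environ) is dict` true, and use `wrapper` wherever it needs to print or log.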

Cory
