[Web-SIG] WSGI 2

Tres Seaver tseaver at palladion.com
Tue Aug 4 19:41:52 CEST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jim Fulton wrote:
> On Tue, Aug 4, 2009 at 12:05 PM, P.J. Eby<pje at telecommunity.com> wrote:
>> At 10:44 PM 8/4/2009 +1000, Graham Dumpleton wrote:
>>> In summary, what are the practical use cases that would make passing
>>> bytes over UTF-8 or even latin-1 worthwhile?
>> My concern at this point is a nagging feeling that we are abandoning
>> WSGI<->HTTP equivalence for convenience in the face of changes in Python's
>> defaults.  Had Python 3 been the standard version in existence when WSGI 1
>> was created, I would've argued for making *everything* bytes, in order to:
>>
>> 1. Force all encodings to be explicit, and
>> 2. Ensure WSGI<->HTTP equivalence (i.e., WSGI==HTTP encoded in Python
>> objects)
>>
>> And this is why the original spec said that Unicode strings should be
>> treated as bytes -- because byte strings were always the original target of
>> the spec.
>>
>> Please remember that WSGI is not primarily intended to provide application
>> developers with a convenient API; its first and most important job is to
>> ship the data around without mangling it in the process.
>>
>> HTTP moves bytes, therefore WSGI should move bytes.  For practical reasons,
>> it would be good to *also* support strings on the application side,
>> especially for application migration.  However, I see no reason to make
>> *servers* provide decoded strings instead of bytes.
> 
> +1
> 
> I haven't had enough time to follow this and earlier encoding
> discussions and so haven't commented up to now, but I've always been
> uncomfortable with WSGI using anything but bytes or assuming any
> encoding.  I agree that application frameworks should deal with
> conversion between bytes and unicode.

+1 from me as well.  The fact that Python3 now calls 'string' what used
to be 'unicode' doesn't change the fact that "transport-level"
operations have to be done in bytes.  It should be the framework /
application's job to handle conversion of byte inputs from the request
into strings, and of string response fields into bytes:  ideally, the
framework will do this in a way which keeps the application writer
blissfully ignorant of the distinction.
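
As a rough illustration of that division of labor, here is a minimal
sketch of a framework-level adapter. It assumes a hypothetical all-bytes
gateway interface of the kind proposed in this thread (not the WSGI 1
spec), and the name text_app_adapter and the single "encoding" parameter
are assumptions made for the example only.

```python
def text_app_adapter(text_app, encoding="utf-8"):
    """Wrap a str-oriented application so the server only ever sees bytes.

    Hypothetical helper: the bytes-only server interface and the choice of
    one encoding for everything are assumptions, not part of any spec.
    """
    def bytes_app(environ, start_response):
        # Decode byte values from the request into str for the application;
        # non-bytes values (streams, flags, ...) are passed through untouched.
        text_environ = {
            key: value.decode(encoding) if isinstance(value, bytes) else value
            for key, value in environ.items()
        }

        def text_start_response(status, headers):
            # Encode the str status line and header fields back into bytes
            # before handing them to the server.
            return start_response(
                status.encode(encoding),
                [(name.encode(encoding), value.encode(encoding))
                 for name, value in headers],
            )

        # Encode each str chunk of the response body into bytes.
        return [chunk.encode(encoding)
                for chunk in text_app(text_environ, text_start_response)]

    return bytes_app
```

The application passed to the adapter deals only in strings; the server
on the other side deals only in bytes, and the encoding is stated in
exactly one place.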

Note that I think Python3 gets the os.environ bit wrong for exactly the
same reasons:  I think anybody wanting to use the
environment-as-provided-by-the-OS should deal in bytes (or whatever the
OS provides), with a convenience wrapper for those who don't care about
the difference.  I lost that argument, but that doesn't mean I was wrong. :)
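
The "bytes underneath, convenience wrapper on top" arrangement could be
sketched roughly as follows. This is only an illustration: the class
name TextEnvironView is hypothetical, a plain dict stands in for the raw
environment, and the default encoding and the "surrogateescape" error
handler are assumptions chosen so that undecodable bytes survive a round
trip.

```python
from collections.abc import Mapping


class TextEnvironView(Mapping):
    """Read-only str view over a bytes-to-bytes environment mapping.

    Hypothetical convenience wrapper: the raw mapping stays the source of
    truth, and decoding happens only when a caller asks for text.
    """

    def __init__(self, raw, encoding="utf-8", errors="surrogateescape"):
        self._raw = raw
        self._encoding = encoding
        self._errors = errors

    def __getitem__(self, name):
        # Encode the str key, look it up in the raw bytes mapping,
        # then decode the bytes value for the caller.
        key = name.encode(self._encoding, self._errors)
        return self._raw[key].decode(self._encoding, self._errors)

    def __iter__(self):
        # Yield the raw keys decoded to str.
        return (key.decode(self._encoding, self._errors) for key in self._raw)

    def __len__(self):
        return len(self._raw)


# A plain dict standing in for an environment as provided by the OS.
raw_environ = {b"HOME": b"/home/tres", b"ODD": b"\xff"}
text_environ = TextEnvironView(raw_environ)
```

Code that cares about the exact bytes uses raw_environ directly; code
that does not care uses text_environ and never sees the difference.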


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFKeHLg+gerLs4ltQ4RAiFjAJ9uZIkfxwh5w1aYiEdIpr+2yQ+iBwCeJiFM
eUfWBoPwyzwHThkMwd24SZE=
=lod9
-----END PGP SIGNATURE-----


