[Python-Dev] thoughts on the bytes/string discussion

Fri Jun 25 00:01:46 CEST 2010

On Thu, Jun 24, 2010 at 2:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> I think we'll avoid a lot of the confusion that was present with Python 2 by
> not making the coercions transitive.  For instance, here's something that
> would work in Python 2:
>
>   urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))
>
> And you'd get out a unicode string, except that would break the first time
> that query string (u'bar=baz') was not ASCII (but not until then!)

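Actually, that wouldn't be a problem: a non-ASCII unicode query string
combines fine with the other parts as long as those byte strings are
ASCII. The problem would be this: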

   urlunsplit(('http', 'example.com', u'/foo', 'bar=baz', ''))

(I moved the "u" prefix from bar=baz to /foo.) And this would break
when instead of baz there was some non-ASCII UTF-8, e.g.

   urlunsplit(('http', 'example.com', u'/foo', 'bar=\xe1\x88\xb4', ''))
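
A minimal sketch of this failure mode, written for Python 3, where the implicit coercion no longer exists. The helper py2_concat is hypothetical (it is not part of urlparse or any other library); it only mimics how Python 2 decoded a byte string as ASCII whenever it was combined with a unicode string.

```python
def py2_concat(a, b):
    """Join two strings the way Python 2 did when str met unicode (emulation)."""
    if isinstance(a, bytes) and isinstance(b, bytes):
        return a + b  # bytes + bytes: no coercion at all
    # Mixed case: Python 2 implicitly decoded the byte string as ASCII.
    if isinstance(a, bytes):
        a = a.decode("ascii")
    if isinstance(b, bytes):
        b = b.decode("ascii")
    return a + b


# Unicode query, ASCII byte path: works even when the query is non-ASCII.
ok = py2_concat(b"/foo?", "bar=\u1234")

# Unicode path, non-ASCII UTF-8 byte query: the implicit ASCII decode fails.
try:
    py2_concat("/foo?", b"bar=\xe1\x88\xb4")
    failed = False
except UnicodeDecodeError:
    failed = True
```

The first call succeeds because only the ASCII byte string has to be decoded. The second fails because the bytes \xe1\x88\xb4 (the UTF-8 encoding of U+1234) are not valid ASCII, which is the case that only shows up once real non-ASCII data arrives.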
-- 
--Guido van Rossum (python.org/~guido)