[Python-Dev] readd u'' literal support in 3.3?

Lennart Regebro regebro at gmail.com
Fri Dec 9 15:18:33 CET 2011


On Fri, Dec 9, 2011 at 04:34, Barry Warsaw <barry at python.org> wrote:
> Sorry, I don't understand this.  What does it mean to be "str in both
> versions"?  And why would you want that?

It means a str in both versions: a string of bytes in Python 2, and a
string of Unicode characters in Python 3. There are cases where you
want this; for example, not all libraries accept both str and unicode
under Python 2.
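
To make that concrete, here is a minimal sketch. The variable name is
made up for illustration; the point is only that an unprefixed literal
is the native str type on whichever version runs it:

```python
import sys

# An unprefixed literal is the "native" str type on each version:
# a byte string on Python 2, a Unicode string on Python 3.
native = 'Content-Type'  # hypothetical value, for illustration only

if sys.version_info[0] == 2:
    # On Python 2, str and bytes are the same type.
    assert isinstance(native, bytes)
else:
    # On Python 3, str is text and is distinct from bytes.
    assert not isinstance(native, bytes)

# Either way, the object is a "str" as far as that version is concerned.
assert type(native) is str
```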

> As for "Unicode in Python 2 and str in Python 3", unadorned strings with the
> future import in Python >= 2.6 does that just fine.

Yes, but the future import will change this for *all* strings, making
it impossible to have a string that is a "str" in both Python 2 and
Python 3. For that reason, the future import is not enough as a
solution (and, I suspect, one major reason why I haven't actually seen
anyone using it).
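
A rough sketch of the effect, under the assumption that compiling a
snippet with the unicode_literals compiler flag is a fair stand-in for
putting the future import at the top of a module (the snippet and the
names in it are made up for illustration):

```python
import __future__

# Two literals as they might appear in a module's source.
source = "plain = 'abc'\nraw = b'abc'\n"

# Compile as if the module began with
# "from __future__ import unicode_literals".
flags = __future__.unicode_literals.compiler_flag
namespace = {}
exec(compile(source, '<demo>', 'exec', flags), namespace)

# The text type: unicode on Python 2, str on Python 3.
text_type = type(b''.decode('ascii'))

# Every unprefixed literal is now text, and b'' is always bytes, so on
# Python 2 nothing is left that spells a native byte-string str literal.
assert isinstance(namespace['plain'], text_type)
assert isinstance(namespace['raw'], bytes)
```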

For most cases, using something like six's b() and u() has turned out
to be a better solution. It's uglier than having u'' support in Python
3, but has the benefit that b() also works in Python 2.5.
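
Roughly, such helpers look like the sketch below. This is a simplified
illustration of the idea, not six's actual source, and the encodings
used here are assumptions of the sketch:

```python
import sys

PY3 = sys.version_info[0] >= 3

if PY3:
    def b(s):
        # Unprefixed literals are text on Python 3; encode to get bytes.
        return s.encode('latin-1')

    def u(s):
        # Unprefixed literals are already text on Python 3.
        return s
else:
    def b(s):
        # Unprefixed literals are already byte strings on Python 2.
        return s

    def u(s):
        # Decode the byte-string literal into a unicode object.
        return s.decode('unicode_escape')

# The third case, a native str on both versions, needs no helper at all.
data = b('abc')    # bytes on both versions
text = u('abc')    # text on both versions
native = 'abc'     # str on both versions
```

Because b() and u() are ordinary function calls on unprefixed
literals, no b'' or u'' syntax is needed, which is why this approach
also runs on versions that lack one of the prefixes.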

> The
> problem comes when you aren't or can't be sure, i.e. you have objects that are
> sometimes one and sometimes the other.  Such as email headers.  In that case,
> you're kind of screwed.  Python 2's str type let you cheat, but not without
> consequences.  Those consequences are spelled "UnicodeErrors" and I'll be glad
> to be rid of them.

Me too.

