[Python-Dev] readd u'' literal support in 3.3?

Barry Warsaw barry at python.org
Fri Dec 9 04:34:08 CET 2011


On Dec 09, 2011, at 03:50 AM, Lennart Regebro wrote:

>One reason is that you need to be able to say "This should be str in
>Python 2, and binary in Python 3, that should be Unicode in Python 2
>and str in Python 3, and that over there should be str in both
>versions", and the future import doesn't support that.

Sorry, I don't understand this.  What does it mean to be "str in both
versions"?  And why would you want that?

As for "str in Python 2 and binary in Python 3", b'' prefixes do that in
Python >= 2.6 without the future import (if I take "binary" to mean the bytes
type).
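
A minimal sketch of the b'' case, assuming nothing beyond the standard
library. The names PY2 and header are made up for illustration. It should
behave the same on Python 2.6+ and Python 3, apart from whether the value
is also the native str type:

```python
import sys

# Hypothetical flag for illustration: True only on a Python 2 interpreter.
PY2 = sys.version_info[0] == 2

# The b'' prefix is accepted on Python 2.6+ and on Python 3.
header = b'Content-Type'

# On Python 2, bytes is just another name for str; on Python 3 it is a
# separate type, so the same literal is "str in 2, bytes in 3".
is_bytes = isinstance(header, bytes)
is_native_str = isinstance(header, str)
```

Here is_bytes is true on both lines of Python, while is_native_str is true
only on Python 2.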

As for "Unicode in Python 2 and str in Python 3", unadorned strings with the
future import in Python >= 2.6 do that just fine.
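
A rough sketch of the unadorned-literal case. The future import has to be
the first statement of a module, so this example compiles a tiny
hypothetical module from a string to stay self-contained. MODULE_SOURCE,
greeting, and raw are illustrative names only:

```python
# Hypothetical module source; the __future__ import is its first statement.
MODULE_SOURCE = (
    "from __future__ import unicode_literals\n"
    "greeting = 'hello'\n"   # unadorned literal
    "raw = b'hello'\n"       # explicit bytes literal
)

namespace = {}
exec(compile(MODULE_SOURCE, "<demo>", "exec"), namespace)

# The text type is unicode on Python 2 and str on Python 3; decoding an
# empty bytes object yields it without needing a u'' literal.
text_type = type(b"".decode("ascii"))

greeting_is_text = isinstance(namespace["greeting"], text_type)
raw_is_bytes = isinstance(namespace["raw"], bytes)
```

Both flags come out true on either version: the unadorned literal is text
and the b'' literal is bytes.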

One of the nice things too is that with #include <bytesobject.h> in Python >=
2.6, changing all your PyStrings to PyBytes, you can get the same behavior in
your extension modules.

You still need to be clear about what are bytes and what are strings.  The
problem comes when you aren't or can't be sure, i.e. you have objects that are
sometimes one and sometimes the other.  Such as email headers.  In that case,
you're kind of screwed.  Python 2's str type let you cheat, but not without
consequences.  Those consequences are spelled "UnicodeErrors" and I'll be glad
to be rid of them.
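
To illustrate the consequences, here is a small sketch with a made-up
header value. Mixing bytes and text triggers an implicit ASCII decode on
Python 2, which fails on non-ASCII data, whereas Python 3 refuses the mix
outright. The variable names are assumptions for the example:

```python
# Hypothetical UTF-8 encoded header, as it might arrive off the wire.
raw_header = b"Subject: caf\xc3\xa9"

# A text value, built without a u'' literal so it parses everywhere.
suffix = b" (fwd)".decode("ascii")

try:
    raw_header + suffix
    outcome = "no error"
except UnicodeDecodeError:
    # Python 2: the implicit ASCII decode of raw_header fails.
    outcome = "UnicodeDecodeError"
except TypeError:
    # Python 3: bytes and str cannot be concatenated.
    outcome = "TypeError"

# Being explicit about the encoding works on both versions.
decoded = raw_header.decode("utf-8") + suffix
```

Either way the mixed concatenation fails, and only the explicit decode
produces usable text.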

Cheers,
-Barry

