[Python-Dev] Revised PEP 349: Allow str() to return unicode strings
Phillip J. Eby
pje at telecommunity.com
Tue Aug 23 17:43:02 CEST 2005
At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> > then of course, one could change ``unicode.__str__()`` to return
> > ``self``, itself, which should work. but then, why so complicated?
>
>I think that may be the right fix.
No, it isn't. Right now str(u"x") coerces the unicode object to a string,
so changing this would be backwards-incompatible with existing programs
that rely on that coercion.
I think the new builtin is actually the right way to go for both 2.x and
3.x Pythons; i.e., text() would be a builtin in 2.x, along with a new
bytes() type, and in 3.x text() could replace the basestring, str and
unicode types.
I also think that the text() constructor should have a signature of
'text(ob,encoding="ascii")'. In the default case, strings can be returned
by text() as long as they are pure ASCII (making the code str-stable *and*
unicode-safe). In the non-default case, a unicode object should always be
returned, making the code unicode-safe but not str-stable. Allowing text()
to return non-ASCII 8-bit strings would be an obvious violation of its
name: it's for text, not bytes.
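
A minimal sketch of the text() rules described above, not a definitive
implementation. It is written for Python 3 so that it can be run today,
which forces two modelling assumptions: the 2.x 8-bit str is represented
by bytes, and the 2.x unicode type by Python 3's str. The handling of
other objects and the error raised for non-ASCII input in the default
case are also assumptions; the proposal above does not spell them out.

```python
# Hypothetical sketch of the proposed text() builtin.
# Assumption: Python 2's 8-bit ``str`` is modelled by ``bytes`` and
# Python 2's ``unicode`` by Python 3's ``str``.

def text(ob, encoding="ascii"):
    """Return ob as text, following the rules sketched in the message."""
    if isinstance(ob, str):
        # Already unicode: always acceptable as text.
        return ob
    if not isinstance(ob, bytes):
        # Assumption: any other object is converted through str().
        return str(ob)
    if encoding == "ascii":
        # Default case: a pure-ASCII 8-bit string is returned unchanged
        # (str-stable). Decoding is used only as a validity check.
        # Assumption: non-ASCII input raises UnicodeDecodeError.
        ob.decode("ascii")
        return ob
    # Non-default case: always return unicode (unicode-safe, not str-stable).
    return ob.decode(encoding)
```

Under these assumptions, text(b"abc") hands back the same 8-bit string,
while text(b"abc", encoding="utf-8") returns a unicode object even though
the input happens to be pure ASCII.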