[Python-Dev] Revised PEP 349: Allow str() to return unicode strings
Neil Schemenauer
nas at arctrix.com
Tue Aug 23 18:54:09 CEST 2005
On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote:
> At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> >> then of course, one could change ``unicode.__str__()`` to return
> >> ``self``, itself, which should work. but then, why so complicated?
> >
> >I think that may be the right fix.
>
> No, it isn't. Right now str(u"x") coerces the unicode object to a
> string, so changing this will be backwards-incompatible with any
> existing programs.
I meant that for the implementation of the PEP, changing
unicode.__str__ to return self seems to be the right fix. Whether
you believe that str() should be allowed to return unicode instances
is a different question.
> I think the new builtin is actually the right way to go for both 2.x and
> 3.x Pythons. i.e., text() would be a builtin in 2.x, along with a new
> bytes() type, and in 3.x text() could replace the basestring, str and
> unicode types.
Perhaps the critical question is what the string type in P3k will be
called. If it will be 'str' then I think the PEP makes sense. If
it will be something else, then there should be a corresponding type
slot (e.g. __text__). What method does your proposed text()
built-in call?
> I also think that the text() constructor should have a signature of
> 'text(ob,encoding="ascii")'.
I think that's a bad idea. We want to get away from ASCII and use
Unicode instead.
> In the default case, strings can be returned by text() as long as
> they are pure ASCII (making the code str-stable *and*
> unicode-safe).
I think you misunderstand the PEP. Your proposed function is
neither Unicode-safe nor str-stable, the worst of both worlds.
Passing it a unicode string that contains non-ASCII characters would
result in an exception (not Unicode-safe). Passing it a str results
in a unicode return value (not str-stable).
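
A rough sketch of the two failure modes, under one possible reading of
the proposed text(ob, encoding="ascii"). It is written as a Python 3
model so that it runs today: `bytes` stands in for the 2.x str type and
`str` stands in for 2.x unicode. The function body, including the
encode check, is an assumption made for illustration and is not spelled
out in the proposal.

```python
def text(ob, encoding="ascii"):
    """Hypothetical model of the proposed text() built-in.

    Assumption: bytes models the 2.x str type, str models 2.x unicode.
    """
    if isinstance(ob, bytes):
        # A 2.x str goes in, a unicode value comes out: not str-stable.
        return ob.decode(encoding)
    if isinstance(ob, str):
        # Assumed check: the text must be representable in `encoding`.
        # Non-ASCII input raises UnicodeEncodeError: not Unicode-safe.
        ob.encode(encoding)
        return ob
    # Anything else is converted through its normal string conversion.
    return text(str(ob), encoding)


# A 2.x str (modelled as bytes) does not come back as the same type.
result = text(b"abc")
print(type(result).__name__)

# A unicode string with a non-ASCII character raises.
try:
    text("caf\u00e9")
except UnicodeEncodeError as exc:
    print("raised:", type(exc).__name__)
```

In this model a str-stable function would hand back `bytes` when given
`bytes`, and a Unicode-safe one would accept the second input without
raising.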
Neil