[Python-Dev] Revised PEP 349: Allow str() to return unicode strings

Neil Schemenauer nas at arctrix.com
Tue Aug 23 18:54:09 CEST 2005


On Tue, Aug 23, 2005 at 11:43:02AM -0400, Phillip J. Eby wrote:
> At 09:21 AM 8/23/2005 -0600, Neil Schemenauer wrote:
> >> Then of course, one could change ``unicode.__str__()`` to return
> >> ``self``, which should work. But then, why so complicated?
> >
> >I think that may be the right fix.
> 
> No, it isn't.  Right now str(u"x") coerces the unicode object to a
> string, so changing this will be backwards-incompatible with any
> existing programs.

I meant that for the implementation of the PEP, changing
unicode.__str__ to return self seems to be the right fix.  Whether
you believe that str() should be allowed to return unicode instances
is a different question.
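
A rough sketch of what "return self" means in practice. It is a Python 3
analogy, not the 2.x implementation: Text is a hypothetical str subclass
standing in for the 2.x unicode type, and the built-in str() stands in
for the str() that the PEP would change. The point is only that when
__str__ returns self, str() passes the object through without coercing
it.

```python
class Text(str):
    """Hypothetical stand-in for the 2.x unicode type (an assumption)."""

    def __str__(self):
        # Return the object itself instead of a converted copy.
        return self


t = Text("caf\u00e9")   # contains a non-ASCII character
result = str(t)

# str() hands back whatever __str__ returned: the very same object,
# with no encoding step and no change of type.
print(result is t)
print(type(result).__name__)
```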

> I think the new builtin is actually the right way to go for both 2.x and 
> 3.x Pythons.  i.e., text() would be a builtin in 2.x, along with a new 
> bytes() type, and in 3.x text() could replace the basestring, str and 
> unicode types.

Perhaps the critical question is what the string type in P3k will be
called.  If it will be 'str' then I think the PEP makes sense.  If
it will be something else, then there should be a corresponding type
slot (e.g. __text__).  What method does your proposed text()
built-in call?

> I also think that the text() constructor should have a signature of 
> 'text(ob,encoding="ascii")'.

I think that's a bad idea.  We want to get away from ASCII and use
Unicode instead.

> In the default case, strings can be returned by text() as long as
> they are pure ASCII (making the code str-stable *and*
> unicode-safe).

I think you misunderstand the PEP.  Your proposed function is
neither Unicode-safe nor str-stable, the worst of both worlds.
Passing it a unicode string that contains non-ASCII characters would
result in an exception (not Unicode-safe).  Passing it a str results
in a unicode return value (not str-stable).

  Neil

