[Python-Dev] Revised PEP 349: Allow str() to return unicodestrings

Michael Chermside mcherm at mcherm.com
Tue Aug 23 18:11:02 CEST 2005


Neil Schemenauer wrote:
> The PEP has been rewritten based on a suggestion by Guido to change
> str() rather than adding a new built-in function.  Based on my testing, I
> believe the idea is feasible.

Fredrik Lundh replies:
> note that this breaks chapter 3 of the tutorial:
>
> http://docs.python.org/tut/node5.html#SECTION005130000000000000000
>
> where str() is first introduced.

It's hardly "introduced"... the only bit I found reads:

   ... When a Unicode string is printed, written to a file, or converted
   with str(), conversion takes place using this default encoding.

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position
   0-2: ordinal not in range(128)

   To convert a Unicode string into an 8-bit string using a specific encoding,
   Unicode objects provide an encode() method that takes one argument, the
   name of the encoding. Lowercase names for encodings are preferred.

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

I think that if we just took out the example of str() usage and replaced
it with a sentence or two that DID introduce the (revised) str() function,
it ought to work. In particular, it could mention that you can call str()
on any object, which isn't stated here at all.

-- Michael Chermside




More information about the Python-Dev mailing list