str() should convert ANY object to a string without EXCEPTIONS !

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Sun Sep 28 04:38:00 EDT 2008


On Sat, 27 Sep 2008 22:37:09 -0700, est wrote:

> >>> str(u'\ue863')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\ue863' in position 0: ordinal not in range(128)
> 
> FAIL.

What result did you expect?


[...]
> The problem is, why the f**k set ASCII encoding to range(128) ????????
> while str() is internally byte array it should be handled in range(256)
> !!!!!!!!!!


To quote Terry Pratchett:

    "What sort of person," said Salzella patiently, "sits down and
    *writes* a maniacal laugh? And all those exclamation marks, you
    notice? Five? A sure sign of someone who wears his underpants
    on his head." -- (Terry Pratchett, Maskerade)



In any case, even if the ASCII encoding used all 256 possible bytes, you 
still have a problem. Your unicode string is a single character with 
ordinal value 59491:

>>> ord(u'\ue863')
59491
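
Latin-1 is an encoding that really does use all 256 byte values, and it has exactly this problem. Here is a minimal sketch of that; the variable names are mine, and it is written so that it should behave the same on Python 2.6+ and Python 3:

```python
# Latin-1 maps code points 0..255 one-to-one onto the 256 byte values.
ch = u'\ue863'

# U+00E9 is below 256, so Latin-1 can encode it as a single byte.
fits = u'\xe9'.encode('latin-1')

# U+E863 (ordinal 59491) is far outside range(256), so this must fail.
try:
    ch.encode('latin-1')
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
```

So widening the codec from range(128) to range(256) only moves the point at which the error appears.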

You can't fit 59491 (or more) distinct characters into 256 byte values, so
obviously some Unicode characters aren't going to fit into a byte string
without some sort of multi-byte encoding. You show that yourself:

u'\ue863'.encode('mbcs')  # Windows only

But of course 'mbcs' is only one possible encoding. There are others. 
Python refuses to guess which encoding you want. Here's another:

u'\ue863'.encode('utf-8')
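
To see why guessing would be a bad idea, here is a small sketch comparing what different choices produce for the same character. The 'utf-16-be' codec and the 'replace' and 'backslashreplace' error handlers are my additions for illustration, not something from your post; the code should run unchanged on Python 2.6+ and Python 3:

```python
ch = u'\ue863'

# Same character, two encodings, two different byte sequences.
utf8 = ch.encode('utf-8')        # three bytes
utf16 = ch.encode('utf-16-be')   # two bytes, big-endian, no BOM

# Decoding only round-trips if you name the same encoding again.
assert utf8.decode('utf-8') == ch
assert utf16.decode('utf-16-be') == ch

# If you insist on ASCII, you have to say what should happen to
# characters it cannot represent.
replaced = ch.encode('ascii', 'replace')           # lossy: a question mark
escaped = ch.encode('ascii', 'backslashreplace')   # keeps the code point as text
```

Each of these is a defensible answer to "convert this to bytes", which is why the choice is left to the caller rather than made silently.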




-- 
Steven


