str() should convert ANY object to a string without EXCEPTIONS !

Lie Lie.1296 at gmail.com
Sun Sep 28 07:04:10 EDT 2008


On Sep 28, 4:21 pm, est <electronix... at gmail.com> wrote:
> On Sep 28, 4:38 pm, Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:
> > On Sat, 27 Sep 2008 22:37:09 -0700, est wrote:
> > > >>> str(u'\ue863')
> > > Traceback (most recent call last):
> > >   File "<stdin>", line 1, in <module>
> > > UnicodeEncodeError: 'ascii' codec can't encode character u'\ue863' in
> > > position 0: ordinal not in range(128)
>
> > > FAIL.
>
> > What result did you expect?
>
> > [...]
>
> > > The problem is, why the f**k set ASCII encoding to range(128) ????????
> > > while str() is internally byte array it should be handled in range(256)
> > > !!!!!!!!!!
>
> > To quote Terry Pratchett:
>
> >     "What sort of person," said Salzella patiently, "sits down and
> >     *writes* a maniacal laugh? And all those exclamation marks, you
> >     notice? Five? A sure sign of someone who wears his underpants
> >     on his head." -- (Terry Pratchett, Maskerade)
>
> > In any case, even if the ASCII encoding used all 256 possible bytes, you
> > still have a problem. Your unicode string is a single character with
> > ordinal value 59491:
>
> > >>> ord(u'\ue863')
>
> > 59491
>
> > You can't fit 59491 (or more) characters into 256, so obviously some
> > unicode chars aren't going to fit into ASCII without some sort of
> > encoding. You show that yourself:
>
> > u'\ue863'.encode('mbcs')  # Windows only
>
> > But of course 'mbcs' is only one possible encoding. There are others.
> > Python refuses to guess which encoding you want. Here's another:
>
> > u'\ue863'.encode('utf-8')
>
> > --
> > Steven
>
> OK, I am tired of arguing these things since python 3.0 fixed it
> somehow.

I'm against saying that Python 3.0 fixed it. Python 3.0's str is Unicode and
its default encoding is UTF-8, and that is why your problem magically disappears.
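A quick way to see this, as a sketch assuming a Python 3 interpreter (the
variable names are only for illustration):

```python
import sys

# In Python 3 every str is already Unicode, so str() on a str has
# nothing to encode and therefore nothing to fail on.
s = "\ue863"                 # the character from the thread
same = str(s)                # returns the text unchanged, no codec involved

# The default used when an encoding is not named explicitly.
default = sys.getdefaultencoding()

# Turning the text into bytes is a separate step that names (or defaults to) a codec.
encoded = s.encode()         # uses UTF-8 in Python 3
```

So nothing was made to fit into range(256); the conversion that used to fail
simply no longer happens inside str().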

> Can anyone tell me how to customize a default encoding, let's say
> 'ansi' which handles range(256) ?

Python used to expose sys.setdefaultencoding, but that feature was an
accident. sys.setdefaultencoding was intended for testing purposes,
while the developers hadn't yet decided what to use as the default
encoding (what use is a default when you can change it?). It is now
deleted from the sys module at startup (by site.py), so programmers
should encode strings explicitly if they want something other than
the default encoding (ASCII).
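
Encoding by hand could look roughly like the sketch below. It is written in
Python 3 syntax, and to_bytes is a hypothetical helper made up for this
example, not something in the standard library. I use 'latin-1' as a stand-in
for an "ansi"-style codec because it maps ordinals 0..255 to one byte each.

```python
# Sketch only: to_bytes is a hypothetical helper, not a stdlib function.
def to_bytes(text, encoding, errors="strict"):
    """Encode text with an explicitly named codec instead of a default."""
    return text.encode(encoding, errors)

ch = "\ue863"  # the character from the thread, ordinal 59491

utf8_bytes = to_bytes(ch, "utf-8")          # UTF-8 can encode this character
latin1_bytes = to_bytes("\xe9", "latin-1")  # ordinal 233 fits in one byte

# A 256-value codec still cannot hold ordinal 59491 under the strict policy.
try:
    to_bytes(ch, "latin-1")
    fits_in_latin1 = True
except UnicodeEncodeError:
    fits_in_latin1 = False

# Choosing an error policy makes it succeed, but the result is lossy.
replaced = to_bytes(ch, "latin-1", errors="replace")
```

Note that even a codec covering all of range(256) rejects u'\ue863' unless you
accept a lossy error policy such as 'replace', which is the point Steven made
above.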


