[Tutor] ASCII and Unicode

Willi Richert w.richert@gmx.net
Tue Nov 5 09:12:02 2002


On Thursday 24 October 2002 20:37, Jens Kubieziel wrote:
> On Thu, Oct 24, 2002 at 05:46:18PM +0200, François Granger wrote:
> > on 24/10/02 15:59, Jens Kubieziel at maillist@kuwest.de wrote:
> > > I'm trying to get through the Python Tutorial and have some Problems
> > > with Unicode-Strings. It is suggested to use
> > >
> > >>>> u"äöü"
> > >
> > > to print out Unicode-values. IDLE says here
> > >    UnicodeError: ASCII encoding error: ordinal not in range(128)
> > > I'm working with Python 2.1.3 and IDLE 0.8. How can I solve this?
> >
> > Does this help ?
> >
> > http://www.reportlab.com/i18n/python_unicode_tutorial.html
>
> Nope. It only describes the Windows way. I work with a Debian
> Linux box.

If you print this Unicode string, Python tries to convert it to ASCII
because the default encoding in your site.py is set to "ascii".
Unfortunately, the umlaut characters lie beyond the 128 ASCII code points
(0-127) in your character table:
>>> ord("ä"), ord("ö"), ord("ü")
(228, 246, 252)
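
If you want to check a longer string for such characters, something like the
following might help. It is only a sketch, written for a current Python 3
interpreter rather than the Python 2.1 discussed in this thread, and the helper
name non_ascii_chars is my own invention:

```python
# Sketch for a current Python 3 interpreter (an assumption; the session
# above is Python 2.1). non_ascii_chars is a hypothetical helper name.

def non_ascii_chars(text):
    """Return (character, code point) pairs for everything above 127."""
    return [(ch, ord(ch)) for ch in text if ord(ch) > 127]

# The umlauts are written as escapes so the source file itself stays ASCII.
sample = u"Gr\xfc\xdfe, \xe4\xf6\xfc"

for ch, code in non_ascii_chars(sample):
    print("code point %d does not fit into ASCII" % code)
```

Every code point it reports is one that the "ascii" codec will refuse to
encode.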

So Python does not know how to convert them from Unicode to ASCII when it
feeds them to print:
>>> print u"öäü"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: ASCII encoding error: ordinal not in range(128)

Now you have two possibilities:

1) Convert them explicitly on every use:
>>> print u"öäü".encode("iso8859-15")
öäü
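
To compare the two codecs side by side, a small helper along these lines could
be used. Again this is just a sketch for a current Python 3 interpreter (not
the 2.1 from this thread), and encode_or_none is a name I made up:

```python
# Sketch for a current Python 3 interpreter (an assumption; the thread
# is about Python 2.1). encode_or_none is a hypothetical helper name.

def encode_or_none(text, encoding):
    """Encode text, returning None when the codec cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeError:
        # UnicodeEncodeError is a subclass of UnicodeError.
        return None

# The umlauts are written as escapes so the source file itself stays ASCII.
text = u"\xf6\xe4\xfc"

print(encode_or_none(text, "ascii"))       # ASCII cannot represent these
print(encode_or_none(text, "iso8859-15"))  # one byte per character
```

The "ascii" attempt fails for the same reason the print above fails, while
"iso8859-15" has a byte for each of the three characters.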

Converting explicitly on every use is very awkward, so there is another
possibility:

2) Change your site.py (on my host: /usr/lib/python/site.py):
from
encoding = "ascii"
to
encoding = "iso8859-15"

This can cause problems if other people share this Python environment with
you and expect ASCII to be the default encoding.

Unfortunately, one of the Python developers thought it might be "useful"
to delete the function sys.setdefaultencoding(), with which you could
otherwise set the default encoding individually in your own app without
having to change your Python core files.
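
You can check what your own interpreter reports with a few lines like these.
This is a sketch only; the variable names are mine. On the setup described
here I would expect "ascii" and no setter, and as far as I know Python 3 has
no sys.setdefaultencoding() at all:

```python
import sys

# Sketch: inspect the interpreter's default encoding and whether the
# setter survived start-up. The variable names are hypothetical.
default_encoding = sys.getdefaultencoding()
setter_available = hasattr(sys, "setdefaultencoding")

print("default encoding: %s" % default_encoding)
print("sys.setdefaultencoding available: %s" % setter_available)
```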

There is one workaround for the missing sys.setdefaultencoding(). Because
sitecustomize is imported before setdefaultencoding() is deleted, you can add
the following at the end of that file (create it if it does not exist):

import sys
setdefaultencoding = sys.setdefaultencoding

Now you can individually change your default encoding in every app:

import sitecustomize
import sys
sitecustomize.setdefaultencoding("iso8859-15")

That's it.

Have fun,
wr

PS: Does anybody know why sys.setdefaultencoding is deleted at the end of
site.py?