unicode to ascii converting

Bernhard Herzog bh at intevation.de
Fri Aug 6 14:31:04 EDT 2004


Peter Wilkinson <pwilkinson at videotron.ca> writes:

> It would be good to find out _why_ this happens in the first place. I
> will keep do a little searching on this for a few days.

Most likely because that file contains characters that are not in the
ASCII character set. ASCII covers only the first 128 code points of
Unicode, so any character outside that range cannot be encoded.  E.g.

>>> u"ä".encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
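
To see which characters are causing the error, something like the
following could help. It is only a rough sketch, written so that it runs
on Python 3; find_non_ascii and the sample string are names made up for
this illustration, not anything from the original poster's code.

```python
def find_non_ascii(text):
    """Return (index, character, code point) for each non-ASCII character."""
    # ASCII ends at code point 127; anything above it cannot be
    # encoded by the 'ascii' codec.
    return [(i, ch, "U+%04X" % ord(ch))
            for i, ch in enumerate(text)
            if ord(ch) > 127]

# Hypothetical sample text; in practice this would be the decoded
# contents of the file in question.
sample = u"Gr\xfc\xdfe aus K\xf6ln"
for index, char, codepoint in find_non_ascii(sample):
    print(index, repr(char), codepoint)
```

The reported positions are indexes into the decoded text, the same kind
of position that the UnicodeEncodeError message shows.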


If it's OK to lose information, you could pass the errors argument to
.encode, like

>>> u"ä".encode("ascii", "ignore")
''

or

>>> u"ä".encode("ascii", "replace")
'?'
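
If losing the character entirely is too much, there are a few gentler
options. The sketch below assumes Python 3 (where .encode returns bytes)
and uses a helper named to_ascii that is invented here for illustration.
It decomposes accented letters with unicodedata.normalize before
dropping what is left, and also shows two other standard error handlers.

```python
import unicodedata

def to_ascii(text):
    """Strip accents where possible, then drop whatever is still non-ASCII."""
    # NFKD splits a character such as a-umlaut into the base letter 'a'
    # followed by a combining diaeresis; "ignore" then drops the
    # combining mark and keeps the base letter.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore")

print(to_ascii(u"\xe4"))

# Keep the information in escaped form instead of discarding it.
print(u"\xe4".encode("ascii", "xmlcharrefreplace"))
print(u"\xe4".encode("ascii", "backslashreplace"))
```

Note that this is still lossy: characters without a decomposition, such
as the German sharp s, simply disappear with to_ascii.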


   Bernhard

-- 
Intevation GmbH                                 http://intevation.de/
Skencil                                http://sketch.sourceforge.net/
Thuban                                  http://thuban.intevation.org/


