unicode to ascii converting
Peter Wilkinson
pwilkinson at videotron.ca
Fri Aug 6 15:35:31 EDT 2004
Well, this is interestingly annoying:
u"ä".encode("ascii", "ignore") -> ''   # works just fine
but the way I have actually written it:
aa = "ä"
aa.encode("ascii", "ignore") ->
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
So I am guessing that I don't understand something about the syntax.
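A likely explanation, as a sketch: in Python 2, "ä" without the u prefix is a byte string, and calling .encode() on a byte string makes Python first decode it to unicode with the default ascii codec. That implicit decode is what raises the UnicodeDecodeError, before the "ignore" handler ever gets used. Decoding explicitly first avoids it. The latin-1 codec below is an assumption based on the 0xe4 byte in the traceback; the real encoding of the data may differ.

```python
# The byte from the traceback. 0xe4 is a-umlaut in Latin-1
# (assumed encoding, not confirmed anywhere in this thread).
raw = b"\xe4"

# Step 1: bytes -> unicode, naming the encoding explicitly
# instead of letting Python fall back to the ascii codec.
text = raw.decode("latin-1")

# Step 2: unicode -> ASCII bytes, dropping what does not fit.
ascii_only = text.encode("ascii", "ignore")
```

Here ascii_only ends up empty, the same result as the u"ä" version above.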
Peter
At 02:31 PM 8/6/2004, Bernhard Herzog wrote:
>Peter Wilkinson <pwilkinson at videotron.ca> writes:
>
> > It would be good to find out _why_ this happens in the first place. I
> > will keep doing a little searching on this for a few days.
>
>Most likely because you have characters in that file that are not in the
>ASCII character set. ASCII is, after all, only a very small subset of
>Unicode. E.g.
>
> >>> u"ä".encode("ascii")
>Traceback (most recent call last):
> File "<stdin>", line 1, in ?
>UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
>
>
>If it's OK to lose information, you could use the errors argument to
>.encode, like
>
> >>> u"ä".encode("ascii", "ignore")
>''
>
>or
>
> >>> u"ä".encode("ascii", "replace")
>'?'
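The handlers quoted above can be compared side by side on a longer string. This is only a sketch: to_ascii and the sample text are hypothetical names made up for illustration, not anything from the original code.

```python
def to_ascii(text, errors="strict"):
    """Encode a unicode string to ASCII bytes with the given error handler.

    Hypothetical helper; it only wraps the built-in .encode() call.
    """
    return text.encode("ascii", errors)


sample = u"Bj\xf6rk"  # made-up sample text; \xf6 is o-umlaut

dropped = to_ascii(sample, "ignore")   # non-ASCII characters are removed
marked = to_ascii(sample, "replace")   # each one becomes "?"

try:
    to_ascii(sample)                   # default "strict" handler raises
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Both "ignore" and "replace" lose information, so they only make sense when the ASCII output does not need to be converted back.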
>
>
> Bernhard
>
>--
>Intevation GmbH http://intevation.de/
>Skencil http://sketch.sourceforge.net/
>Thuban http://thuban.intevation.org/