[Tutor] ascii codec cannot encode character
Peter Otten
__peter__ at web.de
Fri Jan 28 09:43:55 CET 2011
Alex Hall wrote:
> Hello again:
> I have never seen this message before. I am pulling xml from a site's
> api and printing it, testing the wrapper I am writing for the api. I
> have never seen this error until just now, in the twelfth result of my
> search:
> UnicodeEncodeError: 'ASCII' codec can't encode character u'\u2019' in
> position 42: ordinal not in range(128)
>
> I tried making the strings Unicode by saying something like
> self.title=unicode(data.find("title").text)
> but the same error appeared. I found the manual chapter on this, but I
> am not sure I want to ignore since I do not know what this character
> (or others) might mean in the string. I am not clear on what 'replace'
> will do. Any suggestions?
You get a UnicodeEncodeError if you print a unicode string containing
non-ascii characters and Python cannot determine the target's encoding:
$ cat tmp.py
# -*- coding: utf-8 -*-
print u'äöü'
$ python tmp.py
äöü
$ python tmp.py > tmp.txt
Traceback (most recent call last):
  File "tmp.py", line 2, in <module>
    print u'äöü'
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
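The error occurs because, when stdout is redirected to a file or pipe,
sys.stdout.encoding is None and Python 2 falls back to the ascii codec to
convert unicode into bytes. On a terminal the encoding is taken from the
locale, which is why the first run works.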
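
As for what 'replace' and 'ignore' do: here is a minimal sketch, assuming a
sample string of my own (not your actual API data) that contains the same
u'\u2019' character, a right single quotation mark. Encoding explicitly
reproduces the failure and shows the effect of each error handler:

```python
# -*- coding: utf-8 -*-
# Sample text invented for illustration; u'\u2019' is a typographic apostrophe.
text = u"It\u2019s"

# The default error handler is 'strict', which raises on non-ascii characters.
try:
    text.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# 'replace' substitutes a question mark for each character it cannot encode.
replaced = text.encode("ascii", "replace")   # b"It?s"

# 'ignore' silently drops each character it cannot encode.
ignored = text.encode("ascii", "ignore")     # b"Its"

# UTF-8 can represent every character, so nothing is lost.
utf8 = text.encode("utf-8")                  # b"It\xe2\x80\x99s"
```

Both 'replace' and 'ignore' lose information, so choosing an encoding that
covers all characters is usually the safer route.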
One approach to tackle this is to check sys.stdout's encoding, and if it's
unknown (None) wrap it in a codecs.StreamWriter that can handle all
characters that may occur. UTF-8 is usually a good choice, but other codecs
are possible.
$ cat tmp2.py
# -*- coding: utf-8 -*-
import sys
if sys.stdout.encoding is None:
    import codecs
    Writer = codecs.getwriter("utf-8")
    sys.stdout = Writer(sys.stdout)
print u'äöü'
$ python tmp2.py
äöü
$ python tmp2.py > tmp.txt
$ cat tmp.txt
äöü
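
To see what such a writer does without redirecting stdout, here is a small
sketch that wraps an in-memory byte buffer instead of sys.stdout. The buffer
and variable names are only for illustration:

```python
# -*- coding: utf-8 -*-
import codecs
import io

# A byte buffer stands in for a redirected stdout.
utf8_buffer = io.BytesIO()
utf8_writer = codecs.getwriter("utf-8")(utf8_buffer)

# The writer encodes unicode to UTF-8 bytes before passing it on.
utf8_writer.write(u"\u00e4\u00f6\u00fc\n")   # the string u'äöü' plus newline
utf8_bytes = utf8_buffer.getvalue()

# The writer class also accepts an error handler, here with a codec
# that cannot represent the umlauts.
ascii_buffer = io.BytesIO()
ascii_writer = codecs.getwriter("ascii")(ascii_buffer, errors="replace")
ascii_writer.write(u"\u00e4\u00f6\u00fc\n")
ascii_bytes = ascii_buffer.getvalue()
```

With UTF-8 each umlaut becomes two bytes; with ascii and errors="replace"
each one becomes a question mark.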