[Tutor] UnicodeDecodeError

Kent Johnson kent37 at tds.net
Wed Feb 23 13:21:40 CET 2005


Michael Lange wrote:
> Hello list,
> 
> I've encountered an (at least for me) weird error in the project I'm working on (see the traceback below).
> Unfortunately there are several of my application's modules involved, so I cannot post all of my code here.
> I hope I can explain in plain words what I'm doing.
> 
> The line in the traceback that seems to cause the problems:
> 
>     if self.nextfile == _('No destination file selected'):
> 
> "self.nextfile" is a variable that contains either the path to a file (the destination file for sound recording)
> or the gettext string you see above.
> For some reason I get the error below when "self.nextfile" contains a special character *and* the gettext string
> holds the German translation, which contains a special character, too (of course '\xe4' as 23rd character).
> It looks like there are no problems when I remove the German translation or when there are no special characters
> in the filename, but as soon as I have special characters on both sides of the comparison the error occurs.

This is a part of Python that still confuses me. I think what is happening is:
- self.nextfile is sometimes a Unicode string (when it includes special characters)
- the gettext string is a byte string
- to compare the two, Python promotes the byte string to Unicode by decoding it with the system
default encoding, which is generally 'ascii'
- the gettext string includes non-ASCII characters, so the 'ascii' codec raises an exception.
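A minimal sketch of the promotion step described above. Python 2 performs the decode implicitly during
the comparison; here it is done explicitly so the failure is visible on its own. The translated string
and the file name are assumptions for illustration (a latin-1 byte string with 0xe4 at index 22, matching
the traceback), not values taken from the real application, and promote() is a hypothetical helper.

```python
# Assumed stand-in for the translated gettext result: a latin-1 encoded byte
# string whose byte at index 22 is 0xe4 (not taken from the real catalog).
translated = b'Keine Zieldatei ausgew\xe4hlt'

# Assumed Unicode file name containing a non-ASCII character.
nextfile = u'/tmp/aufnahme_\xe4.wav'


def promote(byte_string, encoding='ascii'):
    """Decode a byte string the way Python 2 does implicitly before
    comparing it with a Unicode string (default encoding: 'ascii')."""
    return byte_string.decode(encoding)


try:
    promote(translated) == nextfile
except UnicodeDecodeError as exc:
    error = exc
else:
    error = None

# Prints the same message as the last line of the traceback.
print(error)
```

Passing 'latin-1' as the second argument to promote() makes the decode succeed, which is the idea behind
the first suggestion below.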

I don't know what the best solution is. Two possibilities (substitute the encoding your translations
actually use for latin-1):
- decode the gettext string, e.g.
   if self.nextfile == _('No destination file selected').decode('latin-1'):

- set your default encoding to latin-1. (This solution is frowned on by the Python-Unicode
cognoscenti, and it makes your programs non-portable.) Do this by creating a file
site-packages/sitecustomize.py containing the lines
   import sys
   sys.setdefaultencoding('latin-1')
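The first possibility can also be applied once, at the point where byte strings enter the program, so
that every later comparison is Unicode against Unicode. This is only a sketch: to_unicode(),
no_destination_selected() and NO_FILE are hypothetical names, the translated string is an assumed
example, and latin-1 stands in for whatever encoding the translations really use. The bytes name
needs Python 2.6 or later; on older versions use str instead.

```python
def to_unicode(value, encoding='latin-1'):
    """Return value as Unicode text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Assumed example of the translated marker string, decoded once up front.
NO_FILE = to_unicode(b'Keine Zieldatei ausgew\xe4hlt')


def no_destination_selected(nextfile):
    """Compare as Unicode on both sides, so no implicit ASCII decode happens."""
    return to_unicode(nextfile) == NO_FILE


print(no_destination_selected(u'/tmp/aufnahme_\xe4.wav'))          # False
print(no_destination_selected(b'Keine Zieldatei ausgew\xe4hlt'))   # True
```

Unlike changing the default encoding, this keeps the choice of encoding inside the program, so it
behaves the same on every machine.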

Kent

> 
> ######################################################################
> Error: 1
> UnicodeDecodeError Exception in Tk callback
>   Function: <bound method Snackrecorder.start of <snackrecorder.Snackrecorder instance at 0xb77fe24c>> (type: <type 'instancemethod'>)
>   Args: ()
> Traceback (innermost last):
>   File "/usr/lib/python2.3/site-packages/Pmw/Pmw_1_2/lib/PmwBase.py", line 1747, in __call__
>     return apply(self.func, args)
>   File "/usr/local/share/phonoripper/snackrecorder.py", line 305, in start
>     if self.nextfile == _('No destination file selected'):
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 22: ordinal not in range(128)
> 
> ######################################################################
> 
> I've looked into PmwBase.py, but I couldn't figure out what's going on, so I hope that someone
> here can give me a hint.
> 
> Thanks in advance and best regards
> 
> Michael



