UnicodeDecodeError, how to elegantly deal with this?

Jorgen Bodde jorgen.maillist at gmail.com
Tue Aug 5 06:37:21 EDT 2008


Hi John,

> If you don't want to be bothered with "unicode problems":
> (1) Don't create a "unicode problem" when one doesn't exist.
> (2) Don't bother other people with *your* "unicode problems".

Well, I guess you misunderstood what I meant. I am a simple developer
getting a string from the file system that happens to be in some kind
of encoding. It is a total mystery to me why it crashes on that. That
is what I meant by not wanting to be bothered with it: I don't see any
obvious reason for the crash. It is not that I am too lazy to deal
with it; it simply seems strange to me.

> In this case, less is more; remove the u prefix in the line
>    filemask = u"%file%"

OK, thanks. I thought that making it unicode, because it is a search
string used in a UTF-8 encoded replacement, would solve it.
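
For the archive, here is a minimal sketch of what seems to be going on,
as far as I understand it now. When a unicode string and a byte string
are mixed in Python 2, the byte string is decoded implicitly with the
'ascii' codec, which fails on any byte above 0x7f. The sketch is
written in Python 3 syntax, where that decode has to be spelled out.
The file name and template are made up for illustration, and UTF-8 is
an assumption based on John's remark about my file system.

```python
# Hypothetical file name as raw bytes. 0xC2 0xB4 is the UTF-8
# encoding of U+00B4 (acute accent); the real name is not known.
raw_name = b"song\xc2\xb4.mp3"

# Hypothetical text template containing the %file% mask.
template = "play %file%"

# This is the decode that Python 2 performs implicitly when a
# unicode string meets a byte string: 'ascii' cannot handle 0xC2.
try:
    raw_name.decode("ascii")
    implicit_ok = True
except UnicodeDecodeError as exc:
    implicit_ok = False
    error_text = str(exc)

# Decoding explicitly with the assumed file system encoding avoids
# the error, and the replace then works on text only.
name = raw_name.decode("utf-8")
result = template.replace("%file%", name)
```

So the two ways out appear to be: keep everything as byte strings
(drop the u prefix), or decode the file name explicitly before mixing
it with unicode strings.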

> Long Path:
> Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
> an instrument of Satan. Read this:
>    http://www.amk.ca/python/howto/unicode

I never said that I have an attitude towards unicode; I simply
misunderstood its inner workings. Thanks for the link, I will look at
it.

PS: sorry for the direct mail. I can't get used to one mailing list
always replying to the list and the other replying to the user by
default ;-)

With regards,
- Jorgen

On Tue, Aug 5, 2008 at 11:00 AM, John Machin <sjmachin at lexicon.net> wrote:
> On Aug 5, 4:23 am, "Jorgen Bodde" <jorgen.maill... at gmail.com> wrote:
>> Hi All,
>>
>> I am relatively new to python unicode pains and I would like to have
>> some advice. I have this snippet of code:
>
>>         thefile = args["file"]
>>         filemask = u"%file%"
>>         therep = arg.replace(filemask, thefile)       ##### error here
>
>
>> It crashes on this:
>>
>> 20:03:49:   File
>> "D:\backup\important\src\airs\webserver\webdispatch.py", line 117, in
>> playFile     therep = arg.replace(filemask, thefile)
>>
>> 20:03:49: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in
>> position 93: ordinal not in range(128)
>>
>> 20:03:49: Unhandled Error: <type 'exceptions.UnicodeDecodeError'>:
>> 'ascii' codec can't decode byte 0xc2 in position 93: ordinal not in
>> range(128)
>>
>> It chokes on a ` character in a file name. I read this file from disk,
>> and I would like to play it. However in the replace action it cannot
>> translate this character. How can I transparently deal with this issue
>> because in my eyes it is simply replacing a string with a string, and
>> I do not want to be bothered with unicode problems. I am not sure in
>> which encoding it is in, but I am not experienced enough to see how I
>> can solve this
>
> If you don't want to be bothered with "unicode problems":
> (1) Don't create a "unicode problem" when one doesn't exist.
> (2) Don't bother other people with *your* "unicode problems".
>
>>
>> Can anybody guide me to an elegant solution?
>>
>
> Short path:
> In this case, less is more; remove the u prefix in the line
>    filemask = u"%file%"
>
> Long Path:
> Ignorance is not bliss. Lose the attitude. Unicode is your friend, not
> an instrument of Satan. Read this:
>    http://www.amk.ca/python/howto/unicode
>
> By the way, how one's filesystem encodes file names can be a good
> thing to know; in your case it appears to be UTF-8.
>
> HTH,
> John


