Unicode problems, yet again

"Martin v. Löwis" martin at v.loewis.de
Sun Apr 24 12:18:34 EDT 2005


Ivan Voras wrote:
> Sorry, that was just steam venting from my ears - I often get bitten by
> the "ordinal not in range(128)" error. :)

I think I'm glad to hear that. Errors should never pass silently, unless
explicitly silenced. When you get that error, it means there is a bug in
your code (just like a ValueError, a TypeError, or an IndexError). The
best way to deal with them is to fix them.

Now, the troubling part is clearly that you are getting *bitten* by
this specific error, and often. I presume you also get other kinds of
errors often, but they don't bite :-) This suggests that you should
really try to understand what the error message is trying to tell you,
and what precisely the underlying error is.

For other errors, you have already come to an understanding of what they
mean: NameError, ah, there must be a typo. AttributeError on None, ah,
forgot to check for a None result somewhere. ordinal not in range(128),
hmm, let's try different variations of the code and see which ones
work. This is going to continue biting you until you really understand
what it means.
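
To make the message concrete, here is a minimal sketch of what it reports:
some byte with a value of 128 or above was handed to the ASCII codec, which
only covers 0..127. The sketch uses Python 3 syntax, where the conversion has
to be requested explicitly; the sample name, the UTF-8 encoding, and the
helper explain() are assumptions made for illustration, not anything taken
from your code.

```python
# Assumption: the name and the choice of UTF-8 are purely illustrative.
name = "L\u00f6wis"          # text containing one non-ASCII character
raw = name.encode("utf-8")   # the same name as bytes, as a socket would deliver it


def explain(data: bytes) -> str:
    """Decode bytes as ASCII, or describe the first byte that does not fit."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte outside the ASCII range
        return "byte 0x%02x at position %d: %s" % (
            data[exc.start], exc.start, exc.reason)


message = explain(raw)
print(message)
```

The position in the report points at the data that was never decoded
properly, which is usually a better starting point than trying variations
of the code.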

The most "sane" mental model (and architecture) is one where you always
have Unicode strings in your code, and decode/encode only at system
interfaces (sockets, databases, ...). It turns out that the database
you use already follows this strategy (i.e. it decodes for you), so
you now only need to design the other interfaces so it is clear when
you have Unicode characters and when you have bytes.
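
A rough sketch of that architecture, in Python 3 syntax: bytes are decoded
as soon as they enter the program, everything in between works on text, and
text is encoded again only on the way out. The functions from_wire(),
to_wire() and greet() are hypothetical names, and UTF-8 is an assumed wire
encoding; substitute whatever your interfaces actually use.

```python
def from_wire(raw: bytes, encoding: str = "utf-8") -> str:
    """Input boundary: turn incoming bytes into text exactly once."""
    return raw.decode(encoding)


def to_wire(text: str, encoding: str = "utf-8") -> bytes:
    """Output boundary: turn text into bytes just before sending."""
    return text.encode(encoding)


def greet(name: str) -> str:
    """Application logic: sees only text, never bytes."""
    return "Hello, " + name + "!"


incoming = b"L\xc3\xb6wis"   # assumed sample input, UTF-8 encoded
reply = to_wire(greet(from_wire(incoming)))
print(reply)
```

With the conversions confined to the two boundary functions, it is always
clear which values are characters and which are bytes.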

Regards,
Martin


