Strange problems with encoding

Martin v. Löwis martin at v.loewis.de
Thu Nov 6 15:09:04 EST 2003


Rudy Schockaert <rudy.schockaert at pandoraSTOPSPAM.be> writes:

> I wasn't even aware there are two camps. What would be the reasons not
> to use setdefaultencoding? 

You lose portability (more correctly: you get a false sense of
portability). If you write an application that requires the default
encoding to be FOO-1, the application may work fine on system A and
fail on system B. Telling the operator of system B to change her
default encoding may break a different application on system B, as B
has BAR-2 as its default encoding; changing it to FOO-1 would break
applications that require it to be BAR-2.

IOW, if you require conversions between Unicode and byte strings,
explicitly do them in your code. Explicit is better than implicit.
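For example, here is a minimal sketch of what the explicit route looks
like (the string literal and the choice of UTF-8 are only placeholders):

    text = u'caf\xe9'              # Unicode object inside the application

    # Explicit: name the encoding at the boundary where bytes are needed.
    raw = text.encode('utf-8')     # unicode -> byte string
    back = raw.decode('utf-8')     # byte string -> unicode
    assert back == text

    # Implicit: passing `text` to a byte-oriented API without .encode()
    # silently falls back to the process-wide default encoding, so the
    # same code can work on one system and raise UnicodeError on another.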

> As I configured it now, it uses the system's locale to set the
> encoding. I'm using the same machine to retrieve data, manipulate it
> and store it in a database (on the same machine).  I would like to
> understand what could be wrong in this case.

If the next user logs in on the same system, and has a different
locale set, that user will misinterpret the data you have created.

> What I mean is that I encode the data when I store it in the DB and
> decode it when I retrieve the data from the DB. I do this because
> SQLObject doesn't support binary data. As long as the result that
> comes back out is exactly the same as it was when it went in, I don't
> care.

Then you should *define* an encoding that your application uses,
e.g. UTF-8, and use that encoding throughout wherever required,
instead of asking the administrator to change a system setting.
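A rough sketch of that approach (the SQLObject details are omitted, and
the helper names and sample value are made up for illustration):

    APP_ENCODING = 'utf-8'         # one encoding, fixed by the application

    def to_db(value):
        # unicode -> byte string stored in the database
        return value.encode(APP_ENCODING)

    def from_db(raw):
        # byte string read from the database -> unicode
        return raw.decode(APP_ENCODING)

    # Round-trips exactly, regardless of any system or locale setting:
    assert from_db(to_db(u'R\xfcdy')) == u'R\xfcdy'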

Regards,
Martin



