[Mailman-i18n] sync_members, list_members problem

Martin v. Löwis loewis at informatik.hu-berlin.de
Mon Dec 2 17:01:55 2002


>I've implemented a fix to first call sys.getdefaultencoding() and then
>actually print ustr.encode(enc, 'replace') to stdout.  That fixed the
>problem for me, but is it the right solution?

If this is for printing to the terminal, using 'replace' seems fine. 
You could try to find out the user's preferred encoding, and use that
instead of the system encoding, but that is difficult to do, and 
perhaps overkill.
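
A minimal sketch of the approach from the quoted message, assuming the
text is already a Unicode string. The helper name encode_for_terminal
is hypothetical and not part of Mailman.

```python
import sys

def encode_for_terminal(ustr, encoding=None):
    """Return ustr as bytes; unencodable characters become '?'."""
    # Fall back to the interpreter's default encoding when none is given.
    enc = encoding or sys.getdefaultencoding()
    # 'replace' substitutes '?' instead of raising UnicodeEncodeError.
    return ustr.encode(enc, 'replace')

# With an ASCII-only target, the o-umlaut is replaced by '?'.
ascii_bytes = encode_for_terminal(u"L\xf6wis", "ascii")
```

Passing the encoding explicitly keeps the choice visible at the call
site, rather than hidden in a process-wide setting.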

>On a related note, and I know this has been discussed on python-dev,
>is it really not possible to call sys.setdefaultencoding() anywhere in
>Python except by hacking site.py? 

Correct.

>This seems quite shortsighted (maybe I should raise this on
>python-dev).  

I believe this has already proven to be quite farsighted.

>Say my terminal can handle iso-8859-1.  I'd like to be
>able to set the default encoding to that somewhere that's user-specific,
>say PYTHONSTARTUP if I'm running interactively.  But that seems
>impossible

It's not impossible. Just be explicit about your encodings, and
use .encode whenever you want to convert Unicode strings to byte
strings. Explicit is better than implicit.

If you know that *your terminal* is Latin-1, and you want to output
Unicode to the terminal, just do

sys.stdout = codecs.getwriter("iso-8859-1")(sys.stdout)

With that, you get transparent conversion to Latin-1 when printing
to the terminal, without having to touch the default encoding.
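
To illustrate what such a wrapper does, here is a small sketch that
uses an in-memory byte stream as a stand-in for the terminal (an
assumption made so the result can be inspected). On Python 3,
sys.stdout is already a text stream, so there one would wrap
sys.stdout.buffer rather than sys.stdout.

```python
import codecs
import io

# Stand-in for a byte-oriented terminal stream (assumption for the demo).
fake_terminal = io.BytesIO()

# getwriter() returns a StreamWriter class; calling it wraps the stream.
latin1_out = codecs.getwriter("iso-8859-1")(fake_terminal)

# Unicode goes in, Latin-1 bytes come out; the default encoding is untouched.
latin1_out.write(u"L\xf6wis\n")

terminal_bytes = fake_terminal.getvalue()
```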

There is a patch, python.org/sf/612627, which does this automatically,
i.e. it allows printing all Unicode that the terminal supports.

>and I don't think hacking site.py is the appropriate
>response.  What if someone else at my site uses a different terminal
>that can't print iso-8859-1?

Correct. You should never change the default encoding; I consider it
a flaw that you can. Even within a single process, changing the
default encoding is wrong: what if you are printing both to stdout
and to an XML file, which has its standard UTF-8 encoding? The XML
file will become silently wrong, as you have changed the default 
encoding. Likewise, if you write to a socket, you might silently
write incorrect data, as the wire protocol surely uses an encoding
different from the default encoding.
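
The two-destination case can be sketched as follows, with one explicit
writer per stream. Both streams are in-memory stand-ins (an assumption
for the demo), and the XML content is made up for illustration.

```python
import codecs
import io

terminal = io.BytesIO()   # stand-in for a Latin-1 terminal
xml_file = io.BytesIO()   # stand-in for an XML file on disk

# Each destination gets its own encoding; no global setting is involved.
term_out = codecs.getwriter("iso-8859-1")(terminal)
xml_out = codecs.getwriter("utf-8")(xml_file)

name = u"L\xf6wis"
term_out.write(name + u"\n")
xml_out.write(u"<author>" + name + u"</author>")

term_bytes = terminal.getvalue()
xml_bytes = xml_file.getvalue()
```

The same Unicode string ends up as one byte (0xF6) on the terminal and
as two bytes (0xC3 0xB6) in the XML data, which is what each consumer
expects.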

So getting ASCII encoding errors is a good thing; it tells
you that you forgot to use a proper StreamWriter, or forgot to invoke
.encode explicitly.
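
As a small illustration of that failure mode, the sketch below encodes
to ASCII in strict mode (the default for .encode) and then makes the
conversion explicit. The choice of UTF-8 is only an example.

```python
text = u"L\xf6wis"

try:
    # Strict ASCII encoding cannot represent the o-umlaut.
    text.encode("ascii")
    failed = False
except UnicodeEncodeError:
    # The error points at the spot where an encoding decision is missing.
    failed = True

# The fix is to name the encoding the consumer actually expects.
explicit = text.encode("utf-8")
```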

HTH,
Martin








