[Python-Dev] File encodings

M.-A. Lemburg mal at egenix.com
Tue Nov 30 09:52:34 CET 2004


Gustavo Niemeyer wrote:
> Greetings,
> 
> Today, while trying to internationalize a program I'm working on,
> I found an interesting side-effect of how we're dealing with
> encoding of unicode strings while being written to files.
> 
> Suppose the following example:
> 
>   # -*- encoding: iso-8859-1 -*-
>   print u"á"
> 
> This will correctly print the string 'á', as expected. Now, what
> surprises me, is that the following code won't work in an equivalent
> way (unless using sys.setdefaultencoding()):
> 
>   # -*- encoding: iso-8859-1 -*-
>   import sys
>   sys.stdout.write(u"á\n")
> 
> This will raise the following error:
> 
>   Traceback (most recent call last):
>     File "asd.py", line 3, in ?
>       sys.stdout.write(u"á")
>   UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1'
>                       in position 0: ordinal not in range(128)
> 
> This difference may become a really annoying problem when trying to
> internationalize programs, since it's usual to see third-party code
> dealing with sys.stdout, instead of using 'print'. The standard
> optparse module, for instance, has a reference to sys.stdout which
> is used in the default --help handling mechanism.

You are mixing up two different things here:

The source encoding is meant for the parser. It defines how
the Unicode literals in the source file are converted into
Unicode objects, and nothing more.

The encoding used on the stdout stream doesn't have anything
to do with the source code encoding and has to be handled
differently.
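
To illustrate the distinction, here is a minimal sketch (the
variable names are mine, not from the thread). It mimics what the
parser does with the literal under an iso-8859-1 source encoding and
then shows that the bytes written depend only on the output encoding
you choose:

```python
# What the parser does with the literal u"á" in an iso-8859-1 source
# file: the single byte 0xE1 is decoded into one Unicode character.
text = b"\xe1".decode("iso-8859-1")

# The same Unicode object yields different bytes depending on the
# encoding chosen at output time.
latin1_bytes = text.encode("iso-8859-1")   # one byte
utf8_bytes = text.encode("utf-8")          # two bytes

# ASCII cannot represent the character at all, which is the error
# shown in the traceback above.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```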

The idiom presented by Bob is the right way to go: wrap
sys.stdout with a StreamWriter from the codecs module.
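
A rough sketch of that kind of wrapping, not necessarily the exact
code Bob posted: codecs.getwriter() returns a StreamWriter class
for the named encoding, which you instantiate around a byte stream.
I use an io.BytesIO object as a stand-in for sys.stdout so the
result can be inspected; the choice of iso-8859-1 is an assumption
and should match whatever the terminal or file actually expects.

```python
import codecs
import io

# Stand-in for a byte-oriented stream such as sys.stdout.
raw = io.BytesIO()

# Wrap the byte stream; the wrapper encodes Unicode on every write().
writer = codecs.getwriter("iso-8859-1")(raw)
writer.write(u"\xe1\n")

# The underlying stream only ever sees encoded bytes.
encoded = raw.getvalue()
```

In a real program you would pass sys.stdout in place of raw and
then write Unicode strings to the wrapper.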

Using sys.setdefaultencoding() is *not* the right solution
to the problem.

In general, when writing programs that are targeted for
i18n, you should use Unicode for all text data and
convert between Unicode and 8-bit strings only at the I/O or UI layer.

The various wrappers in the codecs module make this
rather easy.
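
As a small sketch of that layering (stream contents and encodings
are made up for the example): decode on the way in with
codecs.getreader(), work on Unicode only, and encode on the way
out with codecs.getwriter().

```python
import codecs
import io

# Hypothetical input: ISO-8859-1 encoded bytes arriving from outside.
incoming = io.BytesIO(b"caf\xe9\n")
outgoing = io.BytesIO()

# The I/O layer: the only places where encodings are mentioned.
reader = codecs.getreader("iso-8859-1")(incoming)
writer = codecs.getwriter("utf-8")(outgoing)

# The application logic sees and produces Unicode only.
for line in reader:
    writer.write(line.upper())

result = outgoing.getvalue()
```

The processing loop never needs to know which encodings are in
use, so changing the output encoding is a one-line change at the
boundary.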

-- 
Marc-Andre Lemburg
eGenix.com
