trouble w/ unicode file

Serge Orlov sombDELETE at pobox.ru
Sun Jan 25 03:54:17 EST 2004


"Guilherme Salgado" <salgado at freeshell.org> wrote in message news:mailman.752.1074995324.12720.python-list at python.org...
> Hi there,
>
> I have a python source file encoded in unicode(utf-8) with some
> iso8859-1 strings. I've encoded this file as utf-8 in the hope that
> python will understand these strings as unicode (<type 'unicode'>)
> strings without the need to use unicode() or u"" on these strings. But
> this didn't happen.

You hoped, but you forgot to pray <wink>. Why do you think Python
should behave this way? There is an (experimental?) option, -U, that
forces all string literals to be unicode. Obviously, if you use this option,
your sources won't be easily distributable to other people, since they
would have to run Python with -U as well.

C:\Python23>python -U
Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> type('a')
<type 'unicode'>
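
Without -U, the usual way to get a unicode literal is to mark it
explicitly with u"". Here is a minimal sketch of the difference; the
variable names are made up for illustration, and it is written so that it
also runs on later Pythons where every literal is already unicode:

```python
# type(u'') is the unicode text type: `unicode` on Python 2, `str` on Python 3.
text_type = type(u'')

plain = 'a'    # a byte string on Python 2 unless -U is given
marked = u'a'  # unicode whether or not -U is given

print(type(plain))
print(type(marked))
print(isinstance(marked, text_type))
```

Only the u"" literal is guaranteed to be unicode on every interpreter,
which is why it stays distributable where -U does not.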

> Am I expecting something that really shouldn't happen, or do we have a bug?

We have a bug here as well, but it is in your code. The declared coding
must be the same as the actual encoding of your source file. bar.py must be:
#-*- coding: latin-1 -*-
x = 'ééééáááááííí'
print x, type(x)
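
To see why the declaration has to match the bytes on disk, here is a
rough sketch. It is not your bar.py: the names are invented for
illustration, and the accented characters are written as escapes so that
the sketch itself does not depend on any source encoding.

```python
# Three accented characters (e-acute, a-acute, i-acute) as escapes.
text = u'\xe9\xe1\xed'

latin1_bytes = text.encode('latin-1')  # one byte per character
utf8_bytes = text.encode('utf-8')      # two bytes per character here

# Declared coding matches the bytes: the text round-trips unchanged.
matched = latin1_bytes.decode('latin-1')

# UTF-8 bytes read as latin-1: no error, but every character is wrong.
mojibake = utf8_bytes.decode('latin-1')

# latin-1 bytes read as UTF-8: the decoder rejects them outright.
try:
    latin1_bytes.decode('utf-8')
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(matched == text)
print(len(text), len(mojibake))
print(mismatch_failed)
```

Either kind of mismatch gives you something other than the text you
typed, so the coding line and the editor's save encoding have to agree.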

-- Serge.




