Unicode

Vlastimil Brom vlastimil.brom at gmail.com
Mon Dec 17 05:02:11 EST 2012


2012/12/17 Anatoli Hristov <tolidtm at gmail.com>:
> What happens when you do use UTF-8?
This is the result when I encode the string:
 " étroits, en utilisant un portable extrêmement puissant—le plus
 petit et le plus léger des HP EliteBook pleine puissance—avec un
 écran de diagonale 31,75 cm (12,5 pouces), idéal pour le
 professionnel ultra-mobile.
 "
 No accents
>

Hi,
if you only see encoding problems when printing results to your
terminal, its settings or Unicode capability might be the cause.
However, if you also get badly encoded items in the database, you are
likely using an inappropriate encoding in some step.

You seem to be doing something like the following (explicitly or
partly implicitly, based on your system defaults):

>>> print u"étroits, en utilisant un portable extrêmement puissant".encode("utf-8").decode("windows-1252")
Ã©troits, en utilisant un portable extrÃªmement puissant
>>>

i.e. encoding a text using utf-8 and then handling it as windows-1252
(or taking an already encoded text and decoding it with the
inappropriate ANSI encoding).
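
If that is indeed what happens, the damage can usually be undone by
reversing the two steps. The following is only a minimal sketch under
that assumption; the variable names are made up for illustration, the
sample string is shortened, and the code is written so that it also runs
on Python 3. Where exactly the wrong decoding happens in your own code
remains to be found.

```python
# Sample text written with escapes so the source file encoding does not matter:
# u"\xe9" is e-acute, u"\xea" is e-circumflex.
text = u"\xe9troits, extr\xeamement"

# The suspected mistake: encode as UTF-8, then decode the bytes as windows-1252.
# Each accented letter becomes two characters (A-tilde plus another symbol).
garbled = text.encode("utf-8").decode("windows-1252")

# Reversing the two steps recovers the original text, as long as every
# garbled character can be encoded back to windows-1252.
repaired = garbled.encode("windows-1252").decode("utf-8")

print(repaired == text)
```

The proper fix is still to decode the incoming bytes with the encoding
they were actually written in, and to encode only once, when writing to
the database or the terminal.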

hth,
   vbr


