trouble w/ unicode file

"Martin v. Löwis" martin at v.loewis.de
Sun Jan 25 13:25:32 EST 2004


Serge Orlov wrote:

> Sorry, I was confused by your words "with some iso8859-1 strings".
> I thought you were using simple (unaware of encodings) editor and
> just added #-*- coding: utf-8 -*- with hope that it will work. You're
> right the coding should stay utf-8. After that you have two options:
> either use -U option or put u before every string.

There is a third option: Programmatically convert the strings to
Unicode, e.g.

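# -*- coding: utf-8 -*-
# A plain string literal holds the raw bytes as they appear in the source file.
s = "ééééáááááííí"
# Decode those bytes explicitly, naming the encoding the file was saved in.
s = unicode(s, 'utf-8')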

This assumes that you know the source encoding at the point of
conversion.
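
To illustrate why the encoding has to be known at that point, here is a
minimal sketch that is not part of the original example. It spells the bytes
out as escapes so the result does not depend on how the file is saved, and it
uses .decode() on the assumption that it does the same job as unicode(s, enc)
here. The names raw, text and wrong are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch: the same bytes decoded with two different encodings.

# UTF-8 bytes for the three characters e-acute, a-acute, i-acute
# (two bytes per character).
raw = b"\xc3\xa9\xc3\xa1\xc3\xad"

# Correct assumption: the bytes are UTF-8, so we get three characters back.
text = raw.decode("utf-8")

# Wrong assumption: treating the same bytes as ISO 8859-1 does not fail,
# it silently maps every byte to its own character.
wrong = raw.decode("iso8859-1")

print(len(text))   # number of characters under the correct assumption
print(len(wrong))  # number of characters under the wrong assumption
```

Decoding with the wrong encoding raises no error in this case, so the mistake
only shows up later as garbled output.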

Regards,
Martin



