Replace accented chars with unaccented ones

Jeff Epler jepler at unpythonic.net
Tue Mar 16 09:00:36 EST 2004


On Tue, Mar 16, 2004 at 08:26:08AM +0100, Nicolas Bouillon wrote:
> Thank you both for your answers. They both work very well.
> 
> At first I believed it didn't work, because the mistake I made was to 
> forget the "u" on the string: u"é". Because my file was already UTF-8 
> encoded (# -*- coding: UTF-8 -*-), I thought the "u" was not necessary... 
> I was wrong.

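When there are non-unicode string literals in a file, they are simply
byte sequences; the coding declaration does not turn them into unicode
strings.  Take this program, for instance, where the UTF-8 encoded "é"
ends up as a two-byte string: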

# -*- coding: utf-8 -*-
s = "é"
print len(s), repr(s)

$ python bytestr.py
2 '\xc3\xa9'
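
For comparison, here is a rough sketch of the same distinction between
text and bytes. It assumes a Python 3 interpreter, which is not what the
program above was run with: there an unprefixed literal is already a
unicode string (what u"é" gives you in Python 2), and the byte sequence
only appears once you encode it. The names text and raw are just
illustrative.

```python
# -*- coding: utf-8 -*-
# Assumption: Python 3, where "..." is a unicode string and b"..." is bytes.

text = "é"                  # one character (code point U+00E9)
raw = text.encode("utf-8")  # the same character as UTF-8 bytes

print(len(text), repr(text))  # 1 'é'
print(len(raw), repr(raw))    # 2 b'\xc3\xa9'

# Decoding the bytes with the right codec gives the character back.
print(raw.decode("utf-8") == text)  # True
```

The two bytes are the same '\xc3\xa9' shown in the output above; only
the unicode string has a length of 1, which is what per-character
replacement needs.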

Jeff



