Spanish Accents

Peter Otten __peter__ at web.de
Thu Dec 22 11:42:09 EST 2011


Stan Iverson wrote:

> On Thu, Dec 22, 2011 at 11:30 AM, Rami Chowdhury
> <rami.chowdhury at gmail.com> wrote:
> 
>> Could you try using the 'open' function from the 'codecs' module?
>>
> 
> I believe this is what you meant:
> 
> file = codecs.open(p + "2.txt", "r", "utf-8")
> for line in file:
>   print line
> 
> but got this error:
> 
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-2:
> invalid data
>       args = ('utf8', '\xe1 intentado para ellos bastante sabios para
> discernir lo obvio. Tales perso', 0, 3, 'invalid data')
> 
> which is the letter á (a with accent).

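Then the file is probably not UTF-8 at all. In UTF-8, á is the two bytes
"\xc3\xa1", while a leading "\xe1" announces a three-byte sequence, which is
why the error covers positions 0-2. A single "\xe1" byte for á points to
ISO-8859-1, ISO-8859-15, or cp1252: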

>>> print "\xe1".decode("iso-8859-1")
á
>>> print "\xe1".decode("iso-8859-15")
á
>>> print "\xe1".decode("cp1252")
á
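
All three agree on "\xe1", so that byte alone cannot tell them apart. As a
rough sketch (the helper name decode_all is made up for illustration, and the
two probe bytes are just examples), the snippet below decodes the same bytes
with each candidate to show where they differ. It should run unchanged on
Python 2.6+ and Python 3:

```python
# Candidate single-byte encodings suggested above.
candidates = ["iso-8859-1", "iso-8859-15", "cp1252"]

def decode_all(raw):
    # Hypothetical helper: map each candidate encoding to the text it
    # produces for the same raw bytes.
    return dict((name, raw.decode(name)) for name in candidates)

accent = decode_all(b"\xe1")   # the accented a in all three encodings
byte_a4 = decode_all(b"\xa4")  # euro sign only in ISO-8859-15
byte_80 = decode_all(b"\x80")  # euro sign only in cp1252

for name in candidates:
    print("%-12s %r %r %r" % (name, accent[name], byte_a4[name], byte_80[name]))
```

If the text contains euro signs or "curly" quotes, checking how those come
out is one way to choose between the three.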

Try codecs.open() with one of these encodings in place of "utf-8", for
example codecs.open(p + "2.txt", "r", "iso-8859-1").
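
A self-contained sketch of that suggestion follows. It makes some assumptions
that are not from the thread: a temporary file stands in for the real
"2.txt", the sample bytes are just the start of the line shown in the
traceback, and io.open() is used instead of codecs.open() because it takes
the same kind of encoding argument and behaves the same on Python 2.6+ and
Python 3. The helper name read_lines is made up for illustration.

```python
import io
import os
import tempfile

# Sample data (assumption): the start of the line from the traceback,
# stored as raw ISO-8859-1 bytes.
raw = b"\xe1 intentado para ellos"

# Temporary file standing in for the real "2.txt".
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(raw)

def read_lines(path, encoding):
    # Decode the file while reading it, as codecs.open(path, "r", encoding)
    # would.
    with io.open(path, "r", encoding=encoding) as f:
        return [line for line in f]

# Reading as UTF-8 fails, as in the original report.
try:
    read_lines(path, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Reading as ISO-8859-1 yields proper unicode text.
lines = read_lines(path, "iso-8859-1")
os.remove(path)

print(utf8_ok)
print(repr(lines))
```

Note that ISO-8859-1 assigns a character to every byte value, so decoding
with it never raises an error. A clean read therefore does not prove the
guess was right; it is still worth eyeballing the accented characters in the
result.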




