Character encoding & the copyright symbol

Philip Semanchuk philip at semanchuk.com
Thu Aug 6 12:31:27 EDT 2009


On Aug 6, 2009, at 12:14 PM, Robert Dailey wrote:

> Hello,
>
> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
>
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
>
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!

If the file is defined as ASCII and it contains 0xa9, then either the
file was written incorrectly or you were told the wrong encoding. ASCII
runs from 0x00 to 0x7f, so it has no such character.

The copyright symbol is 0xa9 in both ISO-8859-1 and windows-1252, and
since you're on Windows the latter is the likelier bet.
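
To make that concrete, here is a minimal sketch that decodes the same
bytes three ways. The sample bytes are made up for illustration; they
are not taken from your file.

```python
# Hypothetical sample: ASCII text plus the single byte 0xa9.
raw = b"Copyright \xa9 2009"

# 0xa9 is above 0x7f, so a strict ASCII decode rejects it.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Both of these encodings map 0xa9 to U+00A9 (the copyright sign).
as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("windows-1252")

print(ascii_ok)                 # whether the ASCII decode succeeded
print(as_latin1 == as_cp1252)   # whether the two decodes agree
print(ascii(as_cp1252))         # escaped form, safe on any console
```

I print with ascii() here so the example itself can't trip over a
console that lacks the character.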

http://en.wikipedia.org/wiki/Ascii
http://en.wikipedia.org/wiki/Iso-8859-1
http://en.wikipedia.org/wiki/Windows-1252


The bottom line is that your file is not ASCII. Try specifying
windows-1252 as the encoding. Without seeing your code I can't tell
you where you need to specify it, but the Python docs should help you
out.
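
As a rough sketch of what that might look like, assuming the file
really is windows-1252 and that you read it with the built-in open():
the temporary file below is a stand-in I made up so the example runs on
its own. Note also that the traceback you quoted is a
UnicodeEncodeError, which Python raises when encoding text for output,
so the console's encoding may matter too. That part is a guess on my
side, and the last few lines are just one way to cope with it.

```python
import os
import sys
import tempfile

# Create a stand-in file containing the byte 0xa9 (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"Copyright \xa9 2009\n")

# Tell open() how the bytes are encoded instead of relying on the default.
with open(path, encoding="windows-1252") as f:
    text = f.read()
os.remove(path)

# If the console encoding cannot represent a character, printing raises
# UnicodeEncodeError. Round-tripping with errors="replace" swaps such
# characters for "?" rather than failing.
out_enc = sys.stdout.encoding or "ascii"
safe = text.encode(out_enc, errors="replace").decode(out_enc)
print(safe, end="")
```

If the console does support the character, the round trip leaves the
text unchanged and the symbol prints as is.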


HTH
Philip



