Character encoding & the copyright symbol

Albert Hopkins marduk at letterboxes.org
Thu Aug 6 12:45:43 EDT 2009


On Thu, 2009-08-06 at 09:14 -0700, Robert Dailey wrote:
> Hello,
> 
> I'm loading a file via open() in Python 3.1 and I'm getting the
> following error when I try to print the contents of the file that I
> obtained through a call to read():
> 
> UnicodeEncodeError: 'charmap' codec can't encode character '\xa9' in
> position 1650: character maps to <undefined>
> 
> The file is defined as ASCII and the copyright symbol shows up just
> fine in Notepad++. However, Python will not print this symbol. How can
> I get this to work? And no, I won't replace it with "(c)". Thanks!

It's not actually ASCII but Windows-1252 extended ASCII-like.  So with
that information you can do either of 2 things: You can open it in text
mode and specify the encoding:

>>> fp = open(filename, 'r', encoding='windows-1252')
>>> s = fp.read()
>>> print(s)

or you can open it in binary mode and decode it later:

>>> fp = open(filename, 'rb')
>>> b = fp.read()
>>> print(str(b, encoding='windows-1252'))

Or you may be able to set the default encoding to windows-1252 but I
don't know how to do that (in Windows).

p.s.

Next time it might be helpful to paste a code snippet else we have to
make assumptions about what you are actually doing.




More information about the Python-list mailing list