Encoding conundrum

Daniel Klein danielkleinad at gmail.com
Tue Nov 20 16:49:48 EST 2012


With the assistance of this group I now understand Unicode encoding
issues much better, especially when handling special characters that are
outside of the ASCII range. I've got my application working perfectly now
:-)

However, I am still confused as to why I can only use one specific encoding.

I've done some research and it appears that I should be able to use any of
the following codecs with the code points '\xfc' (chr(252)), '\xfd'
(chr(253)), and '\xfe' (chr(254)):

ISO-8859-1   [ note that I'm using this codec on my Linux box ]
cp1252
cp437
latin1
utf-8

If I'm not mistaken, all of these codecs can handle the complete 8-bit
character set.
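
One way to check that assumption is to try encoding each of the three
characters with each codec and see which combinations succeed. This is only
a minimal sketch; the helper name try_encode is made up for illustration.

```python
# Try encoding each of the three characters with each codec listed above.
chars = ['\xfc', '\xfd', '\xfe']
codecs_to_try = ['iso-8859-1', 'cp1252', 'cp437', 'latin1', 'utf-8']


def try_encode(ch, codec):
    """Hypothetical helper: return the encoded bytes, or None if the
    codec has no mapping for this character."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


# Map each codec name to the list of results for the three characters.
results = {
    codec: [try_encode(ch, codec) for ch in chars]
    for codec in codecs_to_try
}

for codec, encoded in results.items():
    print(codec, encoded)
```

A None entry means that codec cannot represent the character at all, which
is a different situation from the character simply having another byte value.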

However, on Windows 7, I am only able to use 'cp437' to display (print)
data with those characters in Python. If I use any other encoding, Windows
laughs at me with this error message:

  File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\xfd' in
position 3: character maps to <undefined>
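
The same kind of error can be reproduced without printing anything, by
encoding explicitly with cp437. The sketch below assumes a made-up sample
string, 'abc\xfd', chosen so that the offending character sits at position 3
as in the traceback; the function name encode_for_console is hypothetical.

```python
# Hypothetical sample string: '\xfd' is at index 3, as in the traceback.
text = 'abc\xfd'


def encode_for_console(s, codec='cp437', errors='strict'):
    """Encode s the way a console using `codec` would have to."""
    return s.encode(codec, errors)


# Strict encoding raises UnicodeEncodeError if the codec has no mapping.
try:
    encode_for_console(text)
    failure = None
except UnicodeEncodeError as exc:
    failure = exc

print(failure)

# With errors='replace', unmappable characters become '?' instead of raising.
safe = encode_for_console(text, errors='replace')
print(safe)
```

Using errors='replace' loses information, so it is only a way to keep output
from crashing, not a fix for the underlying mismatch.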

Furthermore I get this from IDLE:

>>> import locale
>>> locale.getdefaultlocale()
('en_US', 'cp1252')

I also get 'cp1252' when running the same script from a Windows command
prompt.

So there is a contradiction between the error message, which points at
cp437.py, and the default encoding, which is reported as cp1252.
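
To compare the different encodings in play, something like the following
could be run both from IDLE and from the command prompt. It is a rough
sketch that only uses standard library calls; the function name
report_encodings is made up, and the values will differ from one
environment to another.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python reports for this environment."""
    return {
        # Encoding that print() uses when writing to standard output.
        'stdout': sys.stdout.encoding,
        # Encoding the locale module prefers for text data.
        'preferred': locale.getpreferredencoding(False),
        # Encoding used to convert file names.
        'filesystem': sys.getfilesystemencoding(),
    }


report = report_encodings()
for name, value in report.items():
    print(name, value)
```

If 'stdout' and 'preferred' differ, print() and the locale module are not
describing the same thing, which may account for the mismatch.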

Why am I restricted to using just that one codec? Is this a Windows or
Python restriction? Please enlighten me.

Dan