Printing characters outside of the ASCII range
wxjmfauth at gmail.com
Sat Nov 10 05:09:25 EST 2012
On Friday, 9 November 2012 18:17:54 UTC+1, danielk wrote:
> I'm converting an application to Python 3. The app works fine on Python 2.
>
> Simply put, this simple one-liner:
>
> print(chr(254))
>
> errors out with:
>
> Traceback (most recent call last):
>   File "D:\home\python\tst.py", line 1, in <module>
>     print(chr(254))
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
>
> I'm using this character as a delimiter in my application.
>
> What do I have to do to convert this string so that it does not error out?
-----
There is nothing wrong with using the character at
code point 0xfe in the cp437 encoding as
a delimiter.
If it comes from a byte string, you should
decode it properly:
>>> b'=\xfe=\xfe='.decode('cp437')
'=■=■='
or you can use the Unicode equivalent directly:
>>> '=\u25a0=\u25a0='
'=■=■='
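To make the difference concrete, here is a minimal sketch (not part of the original application). It assumes the delimiter really is the cp437 byte 0xFE; the names DELIM_BYTE, delim and can_encode are made up for the illustration. In Python 3, chr(254) is the Unicode code point U+00FE, which cp437 has no byte for, while the decoded delimiter is U+25A0, which cp437 can encode.

```python
# Hypothetical names for illustration; only the cp437 codec comes from the thread.
DELIM_BYTE = b'\xfe'                 # the delimiter as stored in cp437 data
delim = DELIM_BYTE.decode('cp437')   # the same delimiter as a Python 3 str


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# chr(254) is U+00FE, which is absent from cp437 (this is the reported error).
print(can_encode(chr(254), 'cp437'))
# The decoded delimiter is U+25A0 and maps back to the byte 0xFE.
print(can_encode(delim, 'cp437'))

# Joining and splitting records works on the str delimiter as usual.
record = delim.join(['a', 'b', 'c'])
fields = record.split(delim)
```

Defining the delimiter once by decoding the byte keeps the rest of the code working with str only.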
That's for "input". For "output" see:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/c29f2f7f5a4962e8#
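Independently of that thread, the "output" side can be tried without a real console. The following is only a sketch under the assumption that the console stream encodes with cp437 and strict error handling, as in the traceback above; render_for_console is a hypothetical helper, not something from the linked discussion.

```python
import io


def render_for_console(text, encoding='cp437'):
    """Return the bytes a text stream with this encoding would emit for print(text)."""
    buffer = io.BytesIO()
    # newline='\n' keeps the line ending untranslated so the result is predictable.
    stream = io.TextIOWrapper(buffer, encoding=encoding,
                              errors='strict', newline='\n')
    print(text, file=stream)
    stream.flush()
    data = buffer.getvalue()
    stream.detach()  # leave the buffer alone when the wrapper goes away
    return data


# U+25A0 comes out as the single byte 0xFE under cp437.
print(render_for_console('=\u25a0=\u25a0='))
```

Passing chr(254) instead raises the same UnicodeEncodeError as in the original report, because U+00FE has no cp437 byte.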
The choice of that character as a delimiter is not wrong.
It is a little unfortunate, though, because in Unicode it
sits fairly high in the table, at U+25A0 in the
Geometric Shapes block:
>>> import fourbiunicode as fu
>>> fu.UnicodeBlock('\u25a0')
'Geometric Shapes'
>>>
>>> fu.UnicodeBlock(b'\xfe'.decode('cp437'))
'Geometric Shapes'
(Another form of explanation)
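The fourbiunicode module used above is not part of the standard library. A rough equivalent with the standard unicodedata module is sketched below; it reports the character name rather than the block name, since the standard library has no block lookup. The describe helper is a made-up name for the illustration.

```python
import unicodedata


def describe(ch):
    """Return (code point as hex, Unicode character name) for one character."""
    return hex(ord(ch)), unicodedata.name(ch)


# The cp437 byte 0xFE decodes to U+25A0.
print(describe(b'\xfe'.decode('cp437')))   # ('0x25a0', 'BLACK SQUARE')
# chr(254) is a different character altogether.
print(describe(chr(254)))                  # ('0xfe', 'LATIN SMALL LETTER THORN')
```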
jmf