Printing characters outside of the ASCII range

Sat Nov 10 05:09:25 EST 2012

On Friday, 9 November 2012 18:17:54 UTC+1, danielk wrote:
> I'm converting an application to Python 3. The app works fine on Python 2.
>
> Simply put, this simple one-liner:
>
> print(chr(254))
>
> errors out with:
>
> Traceback (most recent call last):
>   File "D:\home\python\tst.py", line 1, in <module>
>     print(chr(254))
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfe' in position 0: character maps to <undefined>
>
> I'm using this character as a delimiter in my application.
>
> What do I have to do to convert this string so that it does not error out?

-----

There is nothing wrong with using the character at
code point 0xfe in the cp437 encoding as a delimiter.

If it comes from a byte string, decode it with
the right codec:

>>> b'=\xfe=\xfe='.decode('cp437')
'=■=■='

or you can use the Unicode equivalent directly:

>>> '=\u25a0=\u25a0='
'=■=■='
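
To see why the original print(chr(254)) fails, it may help to compare the two characters involved. This is only a minimal sketch using the standard library; the variable names are illustrative and not from the original post.

```python
# In Python 3, chr(254) is the code point U+00FE, not the cp437 byte 0xFE.
thorn = chr(254)
# The cp437 byte 0xFE decodes to a different code point, U+25A0.
square = b'\xfe'.decode('cp437')

assert thorn == '\u00fe'
assert square == '\u25a0'

# U+25A0 round-trips through cp437.
assert square.encode('cp437') == b'\xfe'

# U+00FE has no cp437 mapping, which is what the traceback reports.
try:
    thorn.encode('cp437')
    encodable = True
except UnicodeEncodeError:
    encodable = False
```

So the delimiter that worked as a byte on Python 2 corresponds to '\u25a0' on Python 3, not to chr(254).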

That covers "input". For "output", see:
http://groups.google.com/group/comp.lang.python/browse_thread/thread/c29f2f7f5a4962e8#
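
On the output side, one possible approach (an assumption here, not necessarily what the linked thread recommends) is to encode explicitly and pick an error handler, so that an unmappable character is escaped rather than raising. The helper name to_console_bytes is hypothetical.

```python
import sys

def to_console_bytes(text, encoding=None, errors='backslashreplace'):
    """Encode text for the console without raising on unmappable characters."""
    # Fall back to UTF-8 if stdout reports no encoding (e.g. when redirected).
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'utf-8'
    return text.encode(encoding, errors=errors)

# The cp437 delimiter U+25A0 encodes cleanly to the byte 0xFE.
print(to_console_bytes('=\u25a0=\u25a0=', 'cp437'))
# U+00FE is not in cp437, so it is escaped instead of raising.
print(to_console_bytes(chr(254), 'cp437'))
```

The resulting bytes could then be written with sys.stdout.buffer.write() if the raw cp437 bytes are what the console expects.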

The choice of that character as a delimiter is not wrong.
It is just a little unfortunate, because it sits high in
the Unicode table (U+25A0).

>>> import fourbiunicode as fu
>>> fu.UnicodeBlock('\u25a0')
'Geometric Shapes'
>>>
>>> fu.UnicodeBlock(b'\xfe'.decode('cp437'))
'Geometric Shapes'
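
The fourbiunicode module used above is not part of the standard library and is not assumed to be available here. As a rough stand-in, the standard unicodedata module can at least report the character's name and code point (it does not report block names):

```python
import unicodedata

# Decode the cp437 delimiter byte and inspect the resulting character.
delimiter = b'\xfe'.decode('cp437')
name = unicodedata.name(delimiter)
codepoint = 'U+{:04X}'.format(ord(delimiter))
print(codepoint, name)
```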

(Another form of explanation)
jmf