Umlauts, encodings, sitecustomize.py

Jeff Epler jepler at unpythonic.net
Tue Nov 9 11:53:06 EST 2004


No matter what you do, you won't change this behavior:
>>> chr(0x84)
'\x84'

str.__repr__ always escapes characters in the range 0..31 and 127..255,
no matter what the locale is.
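
For anyone reading this on Python 3, here is a minimal sketch of the same effect. It assumes that bytes plays the role of the Python 2 str; the variable names are only illustrative.

```python
# Python 3 sketch: bytes stands in for the Python 2 byte-string str.
single = bytes([0x84])

# repr() escapes the non-ASCII byte instead of showing a glyph.
shown = repr(single)
print(shown)

# The DEL byte (127) is escaped as well, while printable ASCII is left alone.
mixed = repr(b"a\x7f")
print(mixed)
```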

>>> print chr(0x84)
behaves differently: it writes that raw byte to standard output,
followed by a newline.  What you then see depends on how your terminal
interprets the byte.
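
A rough Python 3 sketch of what print does with the byte string here. The helper write_line is hypothetical, and an in-memory stream stands in for standard output so the bytes written can be inspected; in a real program sys.stdout.buffer would be the usual target.

```python
import io

def write_line(stream, data):
    """Write the raw bytes unchanged, followed by a newline byte."""
    stream.write(data + b"\n")

# io.BytesIO is a stand-in for sys.stdout.buffer (an assumption made
# so that the output can be examined afterwards).
sink = io.BytesIO()
write_line(sink, bytes([0x84]))
written = sink.getvalue()
print(repr(written))
```

No escaping happens on the way out: the stream receives the byte 0x84 itself, and only repr() turns it into the four characters \x84.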

Note that chr(0x84) is *not* a-umlaut in iso-8859-1; that is
chr(0xe4).  You may be using one of these DOS (OEM) code pages, which
the Windows console typically uses and which all map byte 0x84 to U+00E4:
    cp437.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp775.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp850.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp852.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp857.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp861.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
    cp865.py:       0x0084: 0x00e4, # LATIN SMALL LETTER A WITH DIAERESIS
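
The mappings listed above can be checked with the standard codecs. This is a small Python 3 sketch, not part of the original session; the variable names are illustrative.

```python
# Decode the single byte 0x84 under each code page listed above.
raw = b"\x84"
code_pages = ["cp437", "cp775", "cp850", "cp852", "cp857", "cp861", "cp865"]

decoded = {name: raw.decode(name) for name in code_pages}
for name, ch in decoded.items():
    print(name, hex(ord(ch)))

# In iso-8859-1 the a-umlaut is byte 0xe4, and byte 0x84 is a control character.
latin1_bytes = "\u00e4".encode("iso-8859-1")
latin1_char = raw.decode("iso-8859-1")
print(latin1_bytes, hex(ord(latin1_char)))
```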

Jeff