How do I display unicode value stored in a string variable using ord()

Jerry Hill malaclypse2 at gmail.com
Fri Aug 17 14:21:34 EDT 2012


On Fri, Aug 17, 2012 at 1:49 PM,  <wxjmfauth at gmail.com> wrote:
> The character '…', Unicode name 'HORIZONTAL ELLIPSIS',
> is one of these characters existing in the cp1252, mac-roman
> coding schemes and not in iso-8859-1 (latin-1) and obviously
> not in ascii. It causes Py3.3 to work a few 100% slower
> than Py<3.3 versions due to the flexible string representation
> (ascii/latin-1/ucs-2/ucs-4) (I found cases up to 1000%).
>
>>>> '…'.encode('cp1252')
> b'\x85'
>>>> '…'.encode('mac-roman')
> b'\xc9'
>>>> '…'.encode('iso-8859-1') # latin-1
> Traceback (most recent call last):
>   File "<eta last command>", line 1, in <module>
> UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026'
> in position 0: ordinal not in range(256)
>
> If one could neglect this (typographically important) glyph, what
> to say about the characters of the European scripts (languages)
> present in cp1252 or in mac-roman but not in latin-1 (eg. the
> French script/language)?

So... python should change the longstanding definition of the latin-1
character set?  This isn't some sort of python limitation, it's just
the reality of legacy encodings that actually exist in the real world.


> Very nice. Python 2 was built for ascii user, now Python 3 is
> *optimized* for, let say, ascii user!
>
> The future is bright for Python. French users are better
> served with Apple or MS products, simply because these
> corporates know you can not write French with iso-8859-1.
>
> PS When "TeX" moved from the ascii encoding to iso-8859-1
> and the so called Cork encoding, "they" know this and provided
> all the complementary packages to circumvent this. It was
> in 199? (Python was not even born).
>
> Ditto for the foundries (Adobe, Linotype, ...)


I don't understand what any of this has to do with Python.  Just
output your text in UTF-8 like any civilized person in the 21st
century, and none of that is a problem at all.  Python make that easy.
 It also makes it easy to interoperate with older encodings if you
have to.

-- 
Jerry



More information about the Python-list mailing list