How do I display unicode value stored in a string variable using ord()

wxjmfauth at gmail.com wxjmfauth at gmail.com
Fri Aug 17 14:45:02 EDT 2012


On Friday, 17 August 2012 20:21:34 UTC+2, Jerry Hill wrote:
> On Fri, Aug 17, 2012 at 1:49 PM,  <wxjmfauth at gmail.com> wrote:
> > The character '…', Unicode name 'HORIZONTAL ELLIPSIS',
> > is one of these characters existing in the cp1252, mac-roman
> > coding schemes and not in iso-8859-1 (latin-1) and obviously
> > not in ascii. It causes Py3.3 to work a few 100% slower
> > than Py<3.3 versions due to the flexible string representation
> > (ascii/latin-1/ucs-2/ucs-4) (I found cases up to 1000%).
> >
> >>>> '…'.encode('cp1252')
> > b'\x85'
> >>>> '…'.encode('mac-roman')
> > b'\xc9'
> >>>> '…'.encode('iso-8859-1') # latin-1
> > Traceback (most recent call last):
> >   File "<eta last command>", line 1, in <module>
> > UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026'
> > in position 0: ordinal not in range(256)
> >
> > If one could neglect this (typographically important) glyph, what
> > to say about the characters of the European scripts (languages)
> > present in cp1252 or in mac-roman but not in latin-1 (eg. the
> > French script/language)?
>
> So... Python should change the longstanding definition of the latin-1
> character set?  This isn't some sort of Python limitation, it's just
> the reality of legacy encodings that actually exist in the real world.
>
> > Very nice. Python 2 was built for ASCII users; now Python 3 is
> > *optimized* for, let's say, ASCII users!
> >
> > The future is bright for Python. French users are better
> > served with Apple or MS products, simply because these
> > corporations know you cannot write French with iso-8859-1.
> >
> > PS When "TeX" moved from the ascii encoding to iso-8859-1
> > and the so-called Cork encoding, "they" knew this and provided
> > all the complementary packages to circumvent this. It was
> > in 199? (Python was not even born).
> >
> > Ditto for the foundries (Adobe, Linotype, ...)
>
> I don't understand what any of this has to do with Python.  Just
> output your text in UTF-8 like any civilized person in the 21st
> century, and none of that is a problem at all.  Python makes that easy.
> It also makes it easy to interoperate with older encodings if you
> have to.

Sorry, you missed the point.

My comment had nothing to do with the source code encoding,
with how a Python "string" is written in the source, or with
the display of a Python 3 <str>.
I wrote about the *internal* Python "coding", the
way Python keeps "strings" in memory. See PEP 393.
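
A minimal sketch of what that internal representation looks like from
the outside, assuming CPython 3.3 or later (other implementations may
store strings differently). It uses sys.getsizeof(), which reports the
whole object including its header, so the per-character width is
estimated from the difference between two string lengths. The helper
name bytes_per_char is hypothetical and not part of any library.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per code point from the size difference of two strings.

    Hypothetical helper: subtracting two sizes cancels the fixed object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample character for each storage width described in PEP 393.
samples = {
    "ascii   U+0061": "a",
    "latin-1 U+00E9": "\u00e9",
    "BMP     U+2026": "\u2026",      # HORIZONTAL ELLIPSIS
    "astral  U+1F600": "\U0001F600",
}

widths = {label: bytes_per_char(ch) for label, ch in samples.items()}
for label, width in widths.items():
    print(f"{label}: {width} byte(s) per character")

# Appending a single non-latin-1 character widens the whole string,
# because every character in a str shares one storage width.
ascii_text = "a" * 1000
widened = ascii_text + "\u2026"
print(sys.getsizeof(ascii_text), sys.getsizeof(widened))
```

On CPython 3.3+ this should report widths of 1, 1, 2 and 4 bytes, and
the second size printed should be roughly twice the first. It only
illustrates the memory layout; it does not measure the speed
differences discussed above.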

jmf


