How do I display unicode value stored in a string variable using ord()

rusi rustompmody at gmail.com
Mon Aug 20 02:13:24 EDT 2012


On Aug 19, 11:11 pm, wxjmfa... at gmail.com wrote:
> On Sunday, 19 August 2012 19:48:06 UTC+2, Paul Rubin wrote:
>
> > But they are not ascii pages, they are (as stated) MOSTLY ascii.
> > E.g. the characters are 99% ascii but 1% non-ascii, so 393 chooses
> > a much more memory-expensive encoding than UTF-8.
>
> Well, it seems some software producers know what they
> are doing.
>
> >>> '€'.encode('cp1252')
> b'\x80'
> >>> '€'.encode('mac-roman')
> b'\xdb'
> >>> '€'.encode('iso-8859-1')
>
> Traceback (most recent call last):
>   File "<eta last command>", line 1, in <module>
> UnicodeEncodeError: 'latin-1' codec can't encode character '\u20ac'
> in position 0: ordinal not in range(256)

<facetious>
You want the Euro-sign in iso-8859-1??
I object. I want the rupee sign ( ₹ ) http://en.wikipedia.org/wiki/Indian_rupee_sign

And while we are at it, why not move it (both?) into ASCII?
</facetious>

The problems are:
1. We don't really understand what you are objecting to.
2. UTF-8, like Huffman coding, is a prefix code:
http://en.wikipedia.org/wiki/Prefix_code#Prefix_codes_in_use_today
Like Huffman coding, it compresses based on a statistical argument:
the characters assumed to be most frequent get the shortest codes.
3. Unlike Huffman coding, the statistics are very political: "Is the
Euro more important or Chinese ideograms?" depends on whom you ask.
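
To make points 2 and 3 concrete, here is a minimal sketch (Python 3) that
prints how many bytes UTF-8 spends on a few characters and checks that the
resulting byte sequences form a prefix code. The names samples, utf8_len
and is_prefix_free are made up for this illustration, and the choice of
sample characters is arbitrary.

```python
# Hypothetical sample set: one character from each UTF-8 length class,
# plus the two currency signs mentioned in this thread.
samples = {
    "A": "ASCII letter",
    "é": "Latin-1 letter",
    "€": "euro sign",
    "₹": "Indian rupee sign",
    "中": "CJK ideograph",
    "😀": "emoji outside the BMP",
}


def utf8_len(ch):
    """Number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


def is_prefix_free(codewords):
    """True if no codeword is a proper prefix of another codeword."""
    return not any(
        a != b and b.startswith(a) for a in codewords for b in codewords
    )


for ch, description in samples.items():
    encoded = ch.encode("utf-8")
    print("U+{:04X} {:<22} {} byte(s) {!r}".format(
        ord(ch), description, len(encoded), encoded))

codewords = [ch.encode("utf-8") for ch in samples]
print("prefix-free:", is_prefix_free(codewords))
```

ASCII gets one byte, the euro and the rupee sign get three each, and so
does the Chinese ideograph. The "statistics" are baked into the code point
assignments rather than measured from any particular text, which is where
the politics come in.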


