How do I display unicode value stored in a string variable using ord()

Roy Smith roy at panix.com
Sun Aug 19 19:24:30 EDT 2012


In article <mailman.3531.1345416176.4697.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> Really, the only viable alternative to PEP 393 is a fixed 32-bit
> representation - it's the only way that's guaranteed to provide
> equivalent semantics. The new storage format is guaranteed to take no
> more memory than that, and provide equivalent functionality.
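For anyone who wants to see the storage claim in the quote above for themselves, here is a rough sketch. It assumes CPython 3.3 or later (where PEP 393 is implemented) and relies on sys.getsizeof, whose numbers are implementation details rather than language guarantees. The helper name bytes_per_char is made up for this illustration.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate per-character storage by comparing two strings of different length.

    Subtracting the sizes cancels the fixed object header, leaving only the
    cost of the n extra characters. CPython-specific; other implementations
    may report something different.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

# One sample from each width class PEP 393 describes:
# ASCII/Latin-1, the rest of the BMP, and beyond the BMP.
for ch in ("a", "\u20ac", "\U0001F600"):
    print(f"U+{ord(ch):04X}: about {bytes_per_char(ch)} byte(s) per character")
```

On such an interpreter, the widest case should come out at 4 bytes per character, which is the fixed 32-bit figure the quote uses as its upper bound.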

In the primordial days of computing, using 8 bits to store a character 
was a profligate waste of memory.  What on earth did people need with 
TWO cases of the alphabet (not to mention all sorts of weird 
punctuation)?  Eventually, memory became cheap enough that the 
convenience of using one character per byte (not to mention 8-bit bytes) 
outweighed the costs.  And crazy things like sixbit and rad-50 got swept 
into the dustbin of history.

So it may be with utf-8 someday.

Clearly, the world has moved to a 32-bit character set.  Not all parts 
of the world know that yet, or are willing to admit it, but that doesn't 
negate the fact that it's true.  Equally clearly, the concept of one 
character per fixed-size storage unit (the way one character per byte 
used to work) is a big win.  The obvious conclusion is that 
eventually, when memory gets cheap enough, we'll all be doing utf-32 and 
all this transcoding nonsense will look as antiquated as rad-50 does 
today.
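
To put rough numbers on the trade-off, here is a small sketch comparing encoded sizes. The sample characters and the helper name encoded_sizes are picked purely for illustration; "utf-32-le" is used so that no byte-order mark inflates the count.

```python
def encoded_sizes(ch):
    """Return (code point, UTF-8 byte count, UTF-32 byte count) for one character."""
    # "utf-32-le" avoids the 4-byte byte-order mark that plain "utf-32" prepends.
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-32-le"))

# An ASCII letter, a Latin-1 letter, a BMP symbol, and a non-BMP emoji.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    code_point, utf8_len, utf32_len = encoded_sizes(ch)
    print(f"U+{code_point:04X}: utf-8 = {utf8_len} bytes, utf-32 = {utf32_len} bytes")
```

The utf-8 column varies with the code point while the utf-32 column never does, which is the whole appeal: indexing becomes simple arithmetic, paid for in memory.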


