How do I display unicode value stored in a string variable using ord()

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Aug 19 02:35:11 EDT 2012


On Sat, 18 Aug 2012 19:34:50 +0100, MRAB wrote:

> "a" will be stored as 1 byte/codepoint.
> 
> Adding "é", it will still be stored as 1 byte/codepoint.

Wrong. It will be 2 bytes, just like it already is in Python 3.2.

I don't know where people are getting this myth that PEP 393 uses Latin-1 
internally, it does not. Read the PEP, it explicitly states that 1-byte 
formats are only used for ASCII strings.


> Adding "€", it will still be stored as 2 bytes/codepoint.

That is correct.
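
Claims about per-code-point storage can be checked directly on an interpreter that implements PEP 393 (CPython 3.3 or later). The following is a minimal sketch, not an official API: the helper name bytes_per_codepoint is made up for illustration, and it assumes CPython, where sys.getsizeof reports the string object's own size, so that the fixed header cancels out when two lengths are subtracted.

```python
import sys

def bytes_per_codepoint(ch, n=1000):
    """Estimate bytes stored per code point for a string made only of `ch`.

    Hypothetical helper: compares the sizes of two strings that differ
    by exactly one code point, so the fixed object header cancels out.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# The three characters discussed above, written as escapes:
# "a" (U+0061), e-acute (U+00E9), euro sign (U+20AC).
for ch in ("a", "\xe9", "\u20ac"):
    print("U+%04X" % ord(ch), bytes_per_codepoint(ch))
```

On CPython 3.3 and later this reports 1 for U+0061, 1 for U+00E9 and 2 for U+20AC: the 1-byte kind described in PEP 393 covers every code point up to U+00FF, and pure-ASCII strings differ only in using a smaller object header. Earlier versions and other implementations may report different numbers.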



-- 
Steven


