How do I display unicode value stored in a string variable using ord()

Ian Kelly ian.g.kelly at gmail.com
Sun Aug 19 14:46:27 EDT 2012


On Sun, Aug 19, 2012 at 11:50 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> Note that this only describes the structure of "compact" string
> objects, which I have to admit I do not fully understand from the PEP.
>  The wording suggests that it only uses the PyASCIIObject structure,
> not the derived structures.  It then says that for compact ASCII
> strings "the UTF-8 data, the UTF-8 length and the wstr length are the
> same as the length of the ASCII data."  But these fields are part of
> the PyCompactUnicodeObject structure, not the base PyASCIIObject
> structure, so they would not exist if only PyASCIIObject were used.
> It would also imply that compact non-ASCII strings are stored
> internally as UTF-8, which would be surprising.

Oh, now I get it.  I had missed the part where it says "character data
immediately follow the base structure".  And the bit about the "UTF-8
data, the UTF-8 length and the wstr length" is not describing the
contents of those fields, but rather where that information can
alternatively be found, since the fields don't exist for compact ASCII
strings.
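
For what it's worth, here is a rough way to observe this from Python
rather than from the C headers.  It is only a sketch, and it assumes
CPython 3.3 or later (where PEP 393 is implemented) and that
sys.getsizeof() reports the structure plus the character data that
follows it.  The helper names per_char_bytes and header_overhead are
made up for this illustration.

```python
import sys


def per_char_bytes(ch):
    """Bytes added to the object by one more copy of the character `ch`."""
    return sys.getsizeof(ch * 101) - sys.getsizeof(ch * 100)


def header_overhead(ch):
    """Size of a 100-character string minus its character data.

    What remains is the fixed structure plus the terminating null
    character (assumption: CPython's compact string layout).
    """
    return sys.getsizeof(ch * 100) - 100 * per_char_bytes(ch)


if __name__ == "__main__":
    samples = [
        ("ASCII", "a"),
        ("Latin-1", "\xe9"),
        ("BMP", "\u20ac"),
        ("non-BMP", "\U0001F600"),
    ]
    for label, ch in samples:
        print(label, per_char_bytes(ch), header_overhead(ch))
```

On such a build I would expect the per-character widths to come out as
1, 1, 2 and 4 bytes, and the ASCII row to show a smaller overhead than
the others, which is consistent with ASCII strings using only the base
PyASCIIObject structure.  The exact overhead numbers vary between
CPython versions and platforms, so I wouldn't rely on them.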


