How do I display unicode value stored in a string variable using ord()

Paul Rubin no.email at nospam.invalid
Sun Aug 19 04:11:56 EDT 2012


Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
>     result = text[end:]

If end is not near the end of the original string, then this is O(N)
even with a fixed-width representation, because of the character copying.

If it is near the end, then by knowing where the string data area
ends, I think it should be possible to scan backwards from
the end, recognizing which bytes can be the beginning of a code point and
counting off the appropriate number.  This is O(1) if "near the end"
means "within a constant number of characters".
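
A rough sketch of that backward scan, assuming the string is stored
as valid UTF-8 (the name suffix_start is made up for illustration and
is not anything in CPython):

```python
def suffix_start(data: bytes, k: int) -> int:
    """Return the byte offset where the last k code points of data begin.

    Assumes data is valid UTF-8. Work done is proportional to k, not to
    len(data), since each code point is at most 4 bytes long.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    pos = len(data)
    remaining = k
    while remaining > 0:
        if pos == 0:
            raise IndexError("data holds fewer than k code points")
        pos -= 1
        # Continuation bytes have the form 0b10xxxxxx; any other byte
        # is the first byte of a code point.
        while pos > 0 and (data[pos] & 0xC0) == 0x80:
            pos -= 1
        remaining -= 1
    return pos


text = "naïve café ☃ 𝄞"
encoded = text.encode("utf-8")
start = suffix_start(encoded, 3)
tail = encoded[start:].decode("utf-8")  # same characters as text[-3:]
```

The slice encoded[start:] still copies the tail, but the tail has
constant size when k is constant, so the whole operation stays O(1)
in the length of the string.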

> You could say "Screw the full Unicode standard, who needs more than 64K 

No, if you're claiming the language supports Unicode, it should be
the whole standard.

> You could do what Python 3.2 narrow builds do: use UTF-16 and leave it
> up to the individual programmer to track character boundaries,

I'm surprised the Python 3 implementers even considered that approach,
much less went ahead with it.  It's obviously wrong.
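
To illustrate the problem with a small sketch: a character outside
the Basic Multilingual Plane takes two UTF-16 code units (a surrogate
pair), so a string type that indexes by code unit sees it as two
items.  The snippet below assumes Python 3.3 or later, where len()
counts code points, and inspects the UTF-16 encoding by hand to show
what a narrow build exposes:

```python
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

units = ch.encode("utf-16-le")
n_units = len(units) // 2  # number of 16-bit code units

# Split the encoding into its two 16-bit halves.
hi = int.from_bytes(units[0:2], "little")
lo = int.from_bytes(units[2:4], "little")

# Both halves fall in the surrogate ranges, so neither is a character
# on its own.
is_pair = 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF
```

Here n_units is 2 while the string holds one code point, and slicing
between the two units would leave a lone surrogate.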

> You could add a whole lot more heavyweight infrastructure to strings,
> turn them into suped-up ropes-on-steroids.

I'm not persuaded that PEP 393 isn't even worse.


