How do I display unicode value stored in a string variable using ord()

Paul Rubin no.email at nospam.invalid
Sun Aug 19 04:44:44 EDT 2012


Chris Angelico <rosuav at gmail.com> writes:
> And of course, taking the *entire* rest of the string isn't the only
> thing you do. What if you want to take the next six characters after
> that index? That would be constant time with a fixed-width storage
> format.

How often is this an issue in practice?
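
For concreteness, here is a minimal Python sketch of the trade-off in the quoted text. It works on raw encoded bytes, not on Python's own str internals. The helper names (slice_utf32, slice_utf8) and the sample text are made up for illustration:

```python
def slice_utf32(buf: bytes, start: int, count: int) -> str:
    """Take `count` code points at index `start` from UTF-32-LE bytes."""
    # Fixed width: every code point is 4 bytes, so the offset is plain arithmetic.
    return buf[4 * start : 4 * (start + count)].decode("utf-32-le")


def slice_utf8(buf: bytes, start: int, count: int) -> str:
    """Take `count` code points at index `start` from UTF-8 bytes."""

    def advance(pos: int, n: int) -> int:
        # Variable width: step over n code points one at a time, skipping
        # continuation bytes (those of the form 0b10xxxxxx).
        while n > 0 and pos < len(buf):
            pos += 1
            while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
                pos += 1
            n -= 1
        return pos

    begin = advance(0, start)    # linear scan from the start of the buffer
    end = advance(begin, count)  # then a short scan over the slice itself
    return buf[begin:end].decode("utf-8")


text = "naïve café, 日本語のテキスト"
assert slice_utf32(text.encode("utf-32-le"), 12, 6) == text[12:18]
assert slice_utf8(text.encode("utf-8"), 12, 6) == text[12:18]
```

Both give the same six characters. The difference is only in how the starting offset is found: a multiplication for the fixed-width buffer, a scan proportional to the index for UTF-8.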

I wonder how other languages deal with this.  The examples I can think
of are poor role models:

1. C/C++ - Unicode-impaired, other than a wchar_t type

2. Java - bogus UCS-2-like(?) representation for historical reasons.
   Also has some modified UTF-8 for reasons that made no sense and
   that I don't remember

3. Haskell - basic string type is a linked list of code points.
   "hello" is five list nodes.  New Data.Text library (much more
   efficient) uses something like ropes, I think, with UTF-16 underneath.

4. Erlang - I think like Haskell.  Efficiently handles byte blocks.

5. Perl 6 -- ???

6. Ruby - ??? (but probably quite slow like the rest of Ruby)

7. Objective C -- ???

8, 9 ...  (any other important ones?)
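
As a rough illustration of why a 16-bit representation like the one in item 2 is awkward, this Python sketch counts UTF-16 code units for a string containing one character outside the Basic Multilingual Plane. It only inspects the encoded bytes and is not a claim about how any particular runtime stores strings:

```python
s = "a\U0001F600b"  # 'a', one code point above U+FFFF, 'b'

# Each UTF-16 code unit is 2 bytes; use the LE codec so no BOM is added.
utf16_units = len(s.encode("utf-16-le")) // 2

assert len(s) == 3        # three code points
assert utf16_units == 4   # but four 16-bit units: the middle one is a surrogate pair
```

So an index counted in 16-bit units and an index counted in code points disagree as soon as a surrogate pair appears.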


