[I18n-sig] Unicode surrogates: just say no!

Tom Emerson tree@basistech.com
Thu, 28 Jun 2001 14:00:47 -0400


Paul Prescod writes:
> "Martin v. Loewis" wrote:
[snip]
> > ... Furthermore, people
> > using non-BMP characters in source are probably not very interested in
> > counting the characters: They want to display them. For just
> > displaying them, you need to represent them, and you need the fonts.
> > String manipulation is less important.
> 
> What are the chances that anybody is in this situation in the near
> future? Can you even display these characters on Windows? Does Tk
> support them? And if so, on what platforms? What about the Java APIs?
> (once again, these are real, not rhetorical questions)

I can't speak for the characters in plane 1, but fonts are already
available for the characters in plane 2 for those who need them.

Also, plane 14 contains code-points that *would* be used for both
display and text processing applications.

Finally, I would expect that those using the ideographs in plane 2 care
less about display than about being able to encode and manipulate the
data. Either the characters are used in names which must be put into
databases, or they are being used to encode historical documents for
searching. While display is important, I strongly suggest that the
ability to display them does not outweigh the ability to work with
strings containing them.

    -tree

-- 
Tom Emerson                                          Basis Technology Corp.
Sr. Sinostringologist                              http://www.basistech.com
  "Beware the lollipop of mediocrity: lick it once and you suck forever"