[I18n-sig] Big5 Codecs

Frank J.S. Chen frank63@ms5.hinet.net
Wed, 1 Nov 2000 20:16:21 -0000


> We're talking across purposes here. There are codepoints in Big 5
> that, in the Unicode mapping table, map to U+FFFD. For example, what
> does your table map Big 5 0xA2CE to in Unicode? The problem is that
> 0xA4CA and 0xA2CE can map to U+5345. But if you see U+5345 which of
> these Big 5 code points do you map to?

I hadn't considered that so far.
0xA4CA and 0xA2CE have "the same form but a different typeface"
in Chinese and are unified in the Unicode Han character set. In fact, they
have an identical meaning. So no matter which code point the dictionary
uses when converting from Unicode, things will not go badly wrong. But we
still need a strategy to filter out such duplicates for completeness.
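
One possible strategy, as a minimal sketch only: invert the decoding
dictionary, record every Unicode code point that is reached from more than
one Big5 code point, and pick one Big5 code point deterministically for
encoding. The names DECODE_TABLE and build_encode_table are hypothetical,
the table holds only the two code points discussed above, and preferring
0xA4CA over 0xA2CE is an assumption made for illustration, not a settled
decision.

```python
# Tiny excerpt of a Big5 -> Unicode decoding table. Only the two code
# points from this thread are listed; a real table has thousands of entries.
DECODE_TABLE = {
    0xA4CA: 0x5345,
    0xA2CE: 0x5345,
}


def build_encode_table(decode_table, preferred=()):
    """Invert decode_table and report Unicode values with several sources.

    Returns (encode_table, duplicates). When several Big5 code points
    decode to the same Unicode value, the lowest Big5 code point is used
    for encoding unless another one is listed in `preferred` (an assumed
    tie-breaking rule).
    """
    preferred = set(preferred)
    encode_table = {}
    duplicates = {}
    for big5 in sorted(decode_table):
        ucs = decode_table[big5]
        if ucs not in encode_table:
            encode_table[ucs] = big5
            continue
        # Second or later Big5 code point for this Unicode value.
        duplicates.setdefault(ucs, [encode_table[ucs]]).append(big5)
        if big5 in preferred and encode_table[ucs] not in preferred:
            encode_table[ucs] = big5
    return encode_table, duplicates


encode_table, duplicates = build_encode_table(DECODE_TABLE, preferred={0xA4CA})
for ucs, sources in duplicates.items():
    print("U+%04X <- %s, encodes as 0x%04X"
          % (ucs, ", ".join("0x%04X" % b for b in sources), encode_table[ucs]))
```

Decoding stays lossless in the Big5 -> Unicode direction; only the reverse
direction has to choose, and the duplicates dictionary lists exactly the
code points that will not round-trip.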

For now I use the Unicode mapping table, not vendor implementations.

----------------------------------------------------------------------------
Chen Chien-Hsun
Taipei,Taiwan,R.O.C.