[I18n-sig] Codecs for Big Five and GB 2312

M.-A. Lemburg mal@lemburg.com
Fri, 27 Oct 2000 10:02:11 +0200


Tamito KAJIYAMA wrote:
> 
> Tom Emerson <tree@basistech.com> writes:
> | I need codecs for transcoding to and from Big Five and GB 2312: has
> | anyone written these yet? If not, I'll do it, but I would rather not
> | duplicate the work.
> 
> I maintain a codecs package named JapaneseCodecs, which
> contains codecs for two Japanese encodings, EUC-JP and Shift
> JIS.  These two encodings and Big5 are all 8-bit multibyte
> encodings, so you may use my codecs as a starting point for
> implementing a Big5 codec.  The JapaneseCodecs package is
> available at:
> 
> http://pseudo.grad.sccs.chukyo-u.ac.jp/~kajiyama/python/
> 
> For personal use I also wrote a preliminary codec for a subset
> of ISO 2022 (or, strictly speaking, a subset of the Emacs/MULE
> internal encoding, which is in turn an extension of ISO 2022).
> Currently the codec can handle text that contains Japanese,
> Thai, and Vietnamese characters.  The codec was written without
> regard for efficiency, but it works.  Since GB 2312 is an
> encoding based on ISO 2022, the codec may be a starting point,
> too.  All that needs to be done to handle GB 2312 is to add a
> character mapping and the escape sequences that designate the
> character set.  If you are interested, the codec is
> available at:
> 
> http://pseudo.grad.sccs.chukyo-u.ac.jp/~kajiyama/python/iso_2022_7bit.py.gz
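
To make the "character mapping plus escape sequences" idea concrete,
here is a minimal sketch of a 7-bit ISO 2022 style decoder for
GB 2312.  It is not taken from Tamito's iso_2022_7bit codec.  The
function name decode_iso2022_gb is made up for illustration, and the
sketch assumes a Python 3 interpreter whose built-in "gb2312"
(EUC-CN) codec can stand in for the character mapping table.  Only
two designations are handled: ESC ( B for ASCII and ESC $ A for
GB 2312.

```python
ESC = 0x1B

# Escape sequences that designate a character set (assumed subset):
# ESC ( B selects ASCII, ESC $ A selects GB 2312 as a two-byte set.
DESIGNATIONS = {
    b"\x1b(B": "ascii",
    b"\x1b$A": "gb2312",
}


def decode_iso2022_gb(data: bytes) -> str:
    """Decode a 7-bit byte string that switches between ASCII and GB 2312."""
    charset = "ascii"  # ISO 2022 7-bit text starts out in ASCII
    out = []
    i = 0
    while i < len(data):
        if data[i] == ESC:
            seq = data[i:i + 3]
            if seq not in DESIGNATIONS:
                raise ValueError("unsupported escape sequence %r" % seq)
            charset = DESIGNATIONS[seq]
            i += 3
        elif charset == "gb2312" and 0x21 <= data[i] <= 0x7E:
            pair = data[i:i + 2]
            if len(pair) < 2:
                raise ValueError("truncated two-byte character")
            # Setting the high bit on both bytes gives the EUC-CN form,
            # so the built-in codec can serve as the mapping table.
            out.append(bytes(b | 0x80 for b in pair).decode("gb2312"))
            i += 2
        else:
            # ASCII text, or a control character or space in either mode.
            out.append(data[i:i + 1].decode("ascii"))
            i += 1
    return "".join(out)
```

A quick way to try it is to strip the high bit from EUC-CN bytes and
wrap them in the two escape sequences:

    raw = bytes(b & 0x7F for b in "中文".encode("gb2312"))
    decode_iso2022_gb(b"Hi \x1b$A" + raw + b"\x1b(B!")

A real codec would also need an encoder, incremental decoding, and
registration through the codecs module, all of which are left out
here.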

Andy, I think you ought to put these links on the i18n-sig web
page... If someone finds the time, it would be worthwhile to
start a topic guide for Unicode that also includes all these
valuable resources.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/