[I18n-sig] JapaneseCodecs 1.4.8 released

Walter Dörwald walter@livinglogic.de
Fri, 06 Sep 2002 12:24:53 +0200


Martin v. Loewis wrote:

> Tamito KAJIYAMA <kajiyama@grad.sccs.chukyo-u.ac.jp> writes:
> [...]
> 
>>Sorry, I'm not sure I've got the picture of what transliteration
>>support would do.  Transliteration support is meant to solve
>>interoperability problems due to differences among vendor-
>>specific mappings, right?
> 
> 
> No. In general, transliteration adds one-way mappings, to allow
> mapping a larger subset of Unicode to the target encoding. For example,
> "ö" is not supported in ASCII, but a common transliteration (for
> German) is to write "oe". So, u"\u00f6".encode("ascii") raises a
> UnicodeError, where u"\u00f6".encode("ascii//translit-german") might
> return "oe" (this is not implemented in Python).

But it's simple to implement as a PEP 293 error handling callback:

# -*- coding: iso-8859-1 -*-
import codecs

# Maps the ordinals of the unencodable characters to their transliterations.
translit_german_map = {
    ord(u"ö"): u"oe",
    ord(u"ä"): u"ae",
    ord(u"ü"): u"ue",
    ord(u"ß"): u"ss",
}

def translit_german(exc):
    # Replace the offending substring with its transliteration and
    # continue encoding after it.
    if isinstance(exc, UnicodeEncodeError):
        return (exc.object[exc.start:exc.end].translate(translit_german_map),
                exc.end)
    else:
        raise TypeError("Don't know how to handle %r" % exc)

codecs.register_error("translit-german", translit_german)

u"-ä-ö-ü-ß-".encode("ascii", "translit-german")

Could transliteration for the JapaneseCodecs be handled
in a similar way?
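
For instance, a sketch along the same lines (the handler name, the mapping
and the codec name below are only assumptions for illustration) could map
characters from vendor-specific tables onto their standard JIS counterparts:

import codecs

# Hypothetical mapping: characters some vendor tables use where the
# standard JIS mapping expects a different Unicode character.
translit_jis_map = {
    0xff5e: u"\u301c",   # FULLWIDTH TILDE -> WAVE DASH
    0xff0d: u"\u2212",   # FULLWIDTH HYPHEN-MINUS -> MINUS SIGN
}

def translit_jis(exc):
    # Replace the offending substring and continue encoding after it.
    if isinstance(exc, UnicodeEncodeError):
        return (exc.object[exc.start:exc.end].translate(translit_jis_map),
                exc.end)
    else:
        raise TypeError("Don't know how to handle %r" % exc)

codecs.register_error("translit-jis", translit_jis)

# e.g. u"\uff5e".encode("euc-jp", "translit-jis"), assuming a euc-jp codec
# whose mapping contains WAVE DASH but not FULLWIDTH TILDE.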

Bye,
    Walter Dörwald