[I18n-sig] JapaneseCodecs 1.4.8 released
Walter Dörwald
walter@livinglogic.de
Fri, 06 Sep 2002 12:24:53 +0200
Martin v. Loewis wrote:
> Tamito KAJIYAMA <kajiyama@grad.sccs.chukyo-u.ac.jp> writes:
> [...]
>
>>Sorry, I'm not sure I've got the picture of what transliteration
>>support would do. Transliteration support is meant to solve
>>interoperability problems due to differences among vendor-
>>specific mappings, right?
>
>
> No. In general, transliteration adds one-way mappings, to allow
> mapping a larger subset of Unicode to the target mapping. For example,
> "ö" is not supported in ASCII, but a common transliteration (for
> German) is to write "oe". So, u"\u00f6".encode("ascii") raises a
> UnicodeError, where u"\u00f6".encode("ascii//translit-german") might
> return "oe" (this is not implemented in Python).
But it's simple to implement as a PEP 293 error handling callback:
# -*- coding: iso-8859-1 -*-
import codecs

# One-way German transliterations, keyed by code point.
translit_german_map = {
    ord(u"ö"): u"oe",
    ord(u"ä"): u"ae",
    ord(u"ü"): u"ue",
    ord(u"ß"): u"ss",
}

def translit_german(exc):
    if isinstance(exc, UnicodeEncodeError):
        # Replace the unencodable slice with its transliteration
        # and resume encoding after it.
        return (exc.object[exc.start:exc.end].translate(translit_german_map),
                exc.end)
    else:
        raise TypeError("Don't know how to handle %r" % exc)

codecs.register_error("translit-german", translit_german)

u"-ä-ö-ü-ß-".encode("ascii", "translit-german")  # -> "-ae-oe-ue-ss-"
Could transliteration for the JapaneseCodecs be handled
in a similar way?
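
A minimal sketch of what that might look like (the handler name
"translit-japanese" and the three example mappings below are made up
for illustration; they are not part of JapaneseCodecs):

```python
# -*- coding: utf-8 -*-
import codecs

# Hypothetical table: a few characters with obvious one-way ASCII
# equivalents. A real table would have to cover every character the
# handler is expected to see, since unmapped characters pass through
# translate() unchanged and would fail to encode again.
translit_japanese_map = {
    0xFF21: u"A",  # FULLWIDTH LATIN CAPITAL LETTER A
    0x3000: u" ",  # IDEOGRAPHIC SPACE
    0x30FC: u"-",  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
}

def translit_japanese(exc):
    if isinstance(exc, UnicodeEncodeError):
        # Same pattern as translit_german: substitute the mapped
        # replacement and continue after the offending slice.
        return (exc.object[exc.start:exc.end].translate(translit_japanese_map),
                exc.end)
    raise TypeError("Don't know how to handle %r" % exc)

codecs.register_error("translit-japanese", translit_japanese)

u"\uFF21\u3000B".encode("ascii", "translit-japanese")
```

The codec machinery does the hard part: the handler only sees the
characters the target encoding cannot represent, so the same callback
works unchanged for "ascii", "euc-jp", or any other codec.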
Bye,
Walter Dörwald