[Python-Dev] Re: gettext in the standard library

Barry A. Warsaw bwarsaw@beopen.com
Fri, 18 Aug 2000 17:49:23 -0400 (EDT)


>>>>> "M" == M  <mal@lemburg.com> writes:

    M> I know that gettext is a standard, but from a technology POV I
    M> would have implemented this as a codec which can then be plugged
    M> wherever l10n is needed, since strings have the new .encode()
    M> method which could just as well be used to convert the string
    M> not only into a different encoding, but also into a different
    M> language.  Anyway, just a thought...

That might be cool to play with, but I haven't done anything with
Python's Unicode stuff (and painfully little with gettext too) so
right now I don't see how they'd fit together.  My gut reaction is
that gettext could be the lower level interface to
string.encode(language).
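
As a minimal sketch of that layering (not an existing API): assume a
hypothetical encode_language() helper sitting on top of a gettext-style
catalog.  DictTranslations and CATALOGS are invented here to stand in
for a catalog loaded from a real .mo file; only
gettext.NullTranslations comes from the present-day standard library.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog standing in for a loaded .mo file."""

    def __init__(self, messages):
        super().__init__()
        self._messages = dict(messages)

    def gettext(self, message):
        # Fall back to the original text when no translation is known.
        return self._messages.get(message, message)


# Hypothetical registry mapping a language code to its catalog.
CATALOGS = {"de": DictTranslations({"Hello": "Hallo"})}


def encode_language(text, language):
    """Codec-like front end: gettext does the actual lookup underneath."""
    translations = CATALOGS.get(language, gettext.NullTranslations())
    return translations.gettext(text)
```

Here encode_language("Hello", "de") looks the string up in the German
catalog, while an unknown language or an unknown string passes through
unchanged, which mirrors what gettext does for missing translations.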

    M> What I'm missing in your doc-string is a reference as to how
    M> well gettext works together with Unicode. After all, i18n is
    M> among other things about international character sets.
    M> Have you done any experiments in this area?

No, but I've thought about it, and I don't think the answer is good.
The GNU gettext functions take and return char*'s, which probably
isn't very compatible with Unicode.  _gettext therefore takes and
returns PyStringObjects.

We could do better with the pure-Python implementation, and that might
be a good reason to forgo any performance gains or platform-dependent
benefits you'd get with _gettext.  Of course the trick is using the
Unicode-unaware tools to build .mo files containing Unicode strings.
I don't track GNU gettext development closely enough to know whether
they are addressing Unicode issues or not.
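
To illustrate the .mo side, here is a rough sketch, assuming a
present-day Python 3 and its standard gettext module.  The hypothetical
build_mo() helper writes a minimal little-endian .mo image by hand,
stores every string as UTF-8, and declares that charset in the metadata
entry so that the pure-Python reader can decode the catalog to Unicode.
It leaves out plural forms, contexts and the hash table.

```python
import gettext
import io
import struct


def build_mo(catalog, charset="UTF-8"):
    """Hypothetical helper: pack a {msgid: msgstr} dict into .mo bytes."""
    # The empty msgid carries the metadata, including the charset.
    entries = {"": "Content-Type: text/plain; charset=%s\n" % charset}
    entries.update(catalog)
    # The format expects msgids in sorted order.
    pairs = sorted(
        (k.encode(charset), v.encode(charset)) for k, v in entries.items()
    )
    count = len(pairs)
    ids_table = 7 * 4                    # right after the 7-word header
    strs_table = ids_table + 8 * count   # one (length, offset) pair each
    data_start = strs_table + 8 * count

    index = []
    blob = b""
    # All msgids first, then all msgstrs, each NUL-terminated.
    for column in (0, 1):
        for pair in pairs:
            data = pair[column]
            index.append((len(data), data_start + len(blob)))
            blob += data + b"\0"

    # magic, version, count, msgid table, msgstr table, hash size/offset
    out = struct.pack("<7I", 0x950412DE, 0, count,
                      ids_table, strs_table, 0, 0)
    for length, offset in index:
        out += struct.pack("<2I", length, offset)
    return out + blob


mo_bytes = build_mo({"Hello": "Gr\u00fc\u00df Gott"})
translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))
greeting = translations.gettext("Hello")
```

The reader picks the charset up from the Content-Type line of the
metadata entry, so greeting comes back as a Unicode string rather than
as raw bytes.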

-Barry