Hello gettext

James T. Dennis jadestar at idiom.com
Mon May 14 21:04:20 EDT 2007


James T. Dennis <jadestar at idiom.com> wrote:

 ... just to follow up on my own posting --- as gauche as that is:

> You'd think that using things like gettext would be easy.  Superficially
> it seems well documented in the Library Reference(*).  However, it can
> be surprisingly difficult to get the external details right.

>        * http://docs.python.org/lib/node738.html 

> Here's what I finally came up with as the simplest instructions, suitable
> for an "overview of Python programming" class:

> Start with the venerable "Hello, World!" program ... slightly modified
> to make it ever-so-slightly more "functional:"



>        #!/usr/bin/env python
>        import sys

>        def hello(s="World"):
>            print "Hello,", s

>        if __name__ == "__main__":
>            args = sys.argv[1:]
>            if len(args):
>                for each in args:
>                    hello(each)
>            else:
>                hello()

> ... and add gettext support (and a little os.path handling on the
> assumption that our message object files will not be readily
> installable into the system /usr/share/locale tree):

>        #!/usr/bin/env python
>        import sys, os, gettext

>        _ = gettext.lgettext
>        mydir = os.path.realpath(os.path.dirname(sys.argv[0]))
>        localedir = os.path.join(mydir, "locale")
>        gettext.bindtextdomain('HelloPython', localedir)
>        gettext.textdomain('HelloPython')

>        def hello(s=_("World")):
>            print _("Hello,"), s

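 Turns out this particular version is a Bad Idea(TM) if you ever
 try to import this into another script and use it after changing
 your os.environ['LANG'] value.  The default argument _("World") is
 evaluated only once, when the def statement runs, so it stays in
 whatever language was in effect at import time.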

 I mentioned in another message awhile back that I have an aversion
 to using defaulted arguments other than by setting them as "None"
 and I hesitated this time and then thought: "Oh, it's fine in this
 case!"
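
 The pitfall can be reproduced without gettext at all.  The sketch below
 is written for Python 3 (the scripts in this post are Python 2), and
 CURRENT_LANG, WORDS and translate_world() are made-up stand-ins for a
 real translation lookup, not part of any library:

```python
# Hypothetical stand-in for a translation lookup; a real program would
# call gettext here. The table and names are invented for illustration.
CURRENT_LANG = "en"
WORDS = {"en": "World", "es": "Mundo"}

def translate_world():
    # Reads the global language setting each time it is called.
    return WORDS[CURRENT_LANG]

def hello_eager(s=translate_world()):   # default computed once, at def time
    return "Hello, " + s

def hello_lazy(s=None):                 # default computed on every call
    if s is None:
        s = translate_world()
    return "Hello, " + s

CURRENT_LANG = "es"                     # "switch language" after the defs
print(hello_eager())  # still uses the value captured at definition time
print(hello_lazy())   # picks up the new language
```

 The None-default form is the one that keeps working after the language
 setting changes.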

 Here's my updated version of this script:

 -----------------------------------------------------------------------

#!/usr/bin/env python
import gettext, os, sys

_ = gettext.lgettext
i18ndomain = 'HelloPython'
mydir = os.path.realpath(os.path.dirname(sys.argv[0]))
localedir = os.path.join(mydir, "locale")
gettext.install(i18ndomain, localedir=None, unicode=1)
gettext.bindtextdomain(i18ndomain, localedir)
gettext.textdomain(i18ndomain)

def hello(s=None):
    """Print "Hello, World" (or its equivalent in any supported language):

       Examples:
          >>> os.environ['LANG']=''
          >>> hello()
          Hello, World
          >>> os.environ['LANG']='es_ES'
          >>> hello()
          Hola, Mundo
          >>> os.environ['LANG']='fr_FR'
          >>> hello()
          Bonjour, Monde

    """
    if s is None:
        s = _("World")
    print _("Hello,"), s

def test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    args = sys.argv[1:]
    if 'PYDOCTEST' in os.environ and os.environ['PYDOCTEST']:
        test()
    elif len(args):
        for each in args:
            hello(each)
    else:
        hello()

 -----------------------------------------------------------------------

 ... now with doctest support. :)

>        if __name__ == "__main__":
>            args = sys.argv[1:]
>            if len(args):
>                for each in args:
>                    hello(each)
>            else:
>                hello()


> Note that I've only added five lines, the two modules to my import
> line, and wrapped two strings with the conventional _() function.

> This part is easy, and well-documented.

> Running pygettext or GNU xgettext (-L or --language=Python) is
> also easy and gives us a file like:


>        # SOME DESCRIPTIVE TITLE.
>        # Copyright (C) YEAR ORGANIZATION
>        # FIRST AUTHOR <EMAIL at ADDRESS>, YEAR.
>        #
>        msgid ""
>        msgstr ""
>        "Project-Id-Version: PACKAGE VERSION\n"
>        "POT-Creation-Date: 2007-05-14 12:19+PDT\n"
>        "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
>        "Last-Translator: FULL NAME <EMAIL at ADDRESS>\n"
>        "Language-Team: LANGUAGE <LL at li.org>\n"
>        "MIME-Version: 1.0\n"
>        "Content-Type: text/plain; charset=CHARSET\n"
>        "Content-Transfer-Encoding: ENCODING\n"
>        "Generated-By: pygettext.py 1.5\n"


>        #: HelloWorld.py:10
>        msgid "World"
>        msgstr ""

>        #: HelloWorld.py:11
>        msgid "Hello,"
>        msgstr ""

> ... I suppose I should add the appropriate magic package name,
> version, author and other values to my source.  Anyone remember
> where those are documented?  Does pygettext extract them from the
> sources and insert them into the .pot?

> Anyway, I minimally have to change one line thus:

>        "Content-Type: text/plain; charset=utf-8\n"

> ... and I suppose there are other ways to do this more properly.
> (Documented where?)

> I did find that I could either change that in the .pot file or
> in the individual .po files.  However, if I failed to change it
> then my translations would NOT work and would throw an exception.

> (Where is the setting to force the _() function to fail gracefully
> --- falling back to no-translation and NEVER raise exceptions?
> I seem to recall there is one somewhere --- but I just spent all
> evening reading the docs and various Google hits to get this far; so
> please excuse me if it's a blur right now).

> Now we just copy these templates to individual .po files and
> make our LC_MESSAGES directories:

>        mkdir locale && mv HelloPython.pot locale
>        cd locale

>        for i in es_ES fr_FR # ...
>            do
>                cp HelloPython.pot HelloPython_$i.po
>                mkdir -p $i/LC_MESSAGES
>            done

> ... and finally we can work on the translations.

> We edit each of the _*.po files inserting "Hola" and "Bonjour" and
> "Mundo" and "Monde" in the appropriate places.  And then process
> these into .mo files and move them into place as follows:

>        for f in *_*.po; do
>                lang=${f#*_}
>                msgfmt -o ./${lang%.po}/LC_MESSAGES/HelloPython.mo "$f"
>                done

> ... in other words HelloPython_es_ES.po is written to 
> ./es_ES/LC_MESSAGES/HelloPython.mo, etc.
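
 The same name mapping can be sketched in Python, which may be handy for
 checking the layout before running msgfmt.  The helper name
 mo_path_for() is hypothetical, and the sketch assumes Python 3 and the
 HelloPython_<lang>.po naming used above:

```python
import os

def mo_path_for(po_name, domain="HelloPython"):
    """Map e.g. 'HelloPython_es_ES.po' to 'es_ES/LC_MESSAGES/HelloPython.mo'."""
    stem, ext = os.path.splitext(po_name)
    prefix = domain + "_"
    if ext != ".po" or not stem.startswith(prefix):
        raise ValueError("unexpected catalog name: %r" % po_name)
    lang = stem[len(prefix):]            # what remains is the locale name
    return os.path.join(lang, "LC_MESSAGES", domain + ".mo")

for name in ("HelloPython_es_ES.po", "HelloPython_fr_FR.po"):
    print(name, "->", mo_path_for(name))
```

 This only computes paths; compiling the catalog is still msgfmt's job.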

> This last part was the hardest to get right. 

> To test this we simply run:

>        $HELLO_PATH/HelloPython.py
>        Hello, World

>        export LANG=es_ES
>        $HELLO_PATH/HelloPython.py
>        Hola, Mundo

>        export LANG=fr_FR
>        $HELLO_PATH/HelloPython.py
>        Bonjour, Monde

>        export LANG=zh_ZH
>        $HELLO_PATH/HelloPython.py
>        Hello, World

> ... and we find that our Spanish and French translations work. (With
> apologies if my translations are technically wrong).

> Of course I realize this only barely scratches the surface of I18n and
> L10n issues.  Also I don't know, offhand, how much effort would be
> required to make even this trivial example work on an MS Windows box.
> It would be nice to find a document that would cover the topic in more
> detail while still giving a sufficiently clear and concise set of examples
> that one could follow them without getting hung up on something stupid
> like: "Gee! You have to create $LANG/LC_MESSAGES/ directories and put
> the .mo files thereunder; Python won't find them directly
> under $LANG nor under LC_MESSAGES/$LANG" ... and "Gee!  For reasons
> I don't yet understand you need to call both the .bindtextdomain() AND
> the .textdomain() functions."  ... and even "Hmmm ... seems that we
> don't need to import locale and call locale.setlocale() despite what
> some examples in Google seem to suggest"(*)

>        * http://www.pixelbeat.org/programming/i18n.html
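
 The directory-layout rule can be checked with gettext.find(), which
 reports the catalog path gettext would use.  This is a sketch assuming
 Python 3; it creates empty placeholder files in a temporary directory,
 which is enough for find() because it only tests whether the file
 exists:

```python
import gettext
import os
import tempfile

root = tempfile.mkdtemp()   # throwaway locale directory for the experiment

def touch(*parts):
    """Create an empty file under root and return its path."""
    path = os.path.join(root, *parts)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "wb").close()   # empty placeholder, not a real catalog
    return path

# Wrong layout: the .mo file sits directly under the language directory.
touch("fr_FR", "HelloPython.mo")
wrong = gettext.find("HelloPython", root, languages=["fr_FR"])

# Expected layout: <localedir>/<lang>/LC_MESSAGES/<domain>.mo
right_path = touch("es_ES", "LC_MESSAGES", "HelloPython.mo")
right = gettext.find("HelloPython", root, languages=["es_ES"])

print(wrong)   # None: nothing found for fr_FR
print(right)   # path of the es_ES catalog
```

 Calling find() on a real locale tree is a quick way to see whether a
 catalog will be picked up before debugging the translations themselves.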

> (So, when do you need that and when is gettext.install() really
> useful?)

> (I gather that the setlocale() stuff is not for simple string
> translations but for things like numeric string formatting 
> with "%d" % ... for example).
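
 On that last point: as far as I know the plain % operator ignores the
 locale entirely, and locale-aware number formatting goes through the
 locale module instead.  A minimal sketch, assuming Python 3 and using
 only the "C" locale because other locale names depend on what is
 installed on the machine:

```python
import locale

# The "C" locale is always available; names such as "de_DE.UTF-8" are
# system-dependent and may raise locale.Error if not installed.
locale.setlocale(locale.LC_NUMERIC, "C")

plain = "%d" % 1234567                   # plain %-formatting, not locale-aware
grouped = locale.format_string("%d", 1234567, grouping=True)
print(plain, grouped)
```

 Under the "C" locale both strings are the same because it defines no
 digit grouping; under a locale that defines a thousands separator only
 the second one would change.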

-- 
Jim Dennis,
Starshine: Signed, Sealed, Delivered



