[Mailman-i18n] More problems ;)

Martin von Loewis loewis@informatik.hu-berlin.de
Wed, 25 Jul 2001 09:25:57 +0200 (MEST)


> I encountered at least one other problem with a construction of a line:
> 
> #: Mailman/HTMLFormatter.py:78
> msgid "<em>(%(num_concealed)d private member%(plu)s not shown)</em>"
> msgstr "<em>(%(num_concealed)d priv&eacute; lid/leden niet getoond)</em>"
> 
> The Dutch plural of member ('lid') is not created by adding an 's' (it's
> "leden".) In fact, it's generally not true that "s" makes the best plurals
> ("en" would be better, in Dutch, though that wouldn't fix it in this case.)
> 
> Barry, shall I just rewrite such cases to make them more easily
> translatable?

Getting this right is non-trivial: a number of languages
(e.g. Russian) have more than two forms that depend on number, e.g. one
for 1, one for 2-4, and one for 5 or more; sometimes the forms repeat
according to a formula, e.g. 101 might require the same grammatical
form as 1. The gettext manual gives this Polish example:

     In Polish we use e.g. plik (file) this way:
          1 plik
          2,3,4 pliki
          5-21 pliko'w
          22-24 pliki
          25-31 pliko'w
     and so on (o' means 8859-2 oacute which should be rather okreska,
     similar to aogonek).
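
As a rough Python sketch of what that table implies, the selection
could be written as below. The names polish_form, FORMS and count_files
are made up for illustration, and the rule is my reading of the table
above (1, then numbers ending in 2-4 except the teens, then everything
else), not code taken from gettext.

```python
# Hypothetical helper names; the rule is inferred from the table above.
FORMS = ("plik", "pliki", "plików")


def polish_form(n):
    """Return the index (0, 1 or 2) of the Polish form to use for count n."""
    if n == 1:
        return 0
    # Numbers ending in 2, 3 or 4 take "pliki", except 12, 13 and 14.
    if 2 <= n % 10 <= 4 and not 10 <= n % 100 < 20:
        return 1
    return 2


def count_files(n):
    """Format a count together with the matching form of 'plik'."""
    return "%d %s" % (n, FORMS[polish_form(n)])
```

Note that a single "add a suffix" placeholder such as %(plu)s cannot
express this, since the form depends on arithmetic on the number.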

The new GNU gettext makes an attempt to solve this; it introduces a
function, ngettext, which lets you write

          printf (ngettext ("%d file removed", "%d files removed", n), n);

These two strings are the English fallbacks in case no translation is
available.
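
For comparison, here is a minimal Python sketch of the same call and
its fallback behaviour. It assumes a Python whose standard gettext
module provides an ngettext method on its translation classes (later
releases do); removed_message is a made-up name, and NullTranslations
stands in for "no catalog available".

```python
import gettext

# NullTranslations has no catalog, so it always uses the English fallbacks.
_no_catalog = gettext.NullTranslations()


def removed_message(n):
    """Pick the singular or plural English string for n, then fill in n."""
    template = _no_catalog.ngettext("%d file removed", "%d files removed", n)
    return template % n
```

With no translation present, the singular string is chosen only for
n == 1; every other count, including 0, gets the plural string.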

In addition, the header of the PO file gets an additional field
Plural-Forms, which may read

     Plural-Forms: nplurals=2; plural=n == 1 ? 0 : 1;

Other definitions may read

          Plural-Forms: nplurals=2; plural=n>1;

e.g. for French (0 and 1 are "singular"),

          Plural-Forms: nplurals=3; \
              plural=n%10==1 && n%100!=11 ? 0 : \
                     n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;

e.g. for Czech, Russian, Slovak, Ukrainian, etc.; see the manual of
gettext 0.10.38 for further elaboration.
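
To make the three expressions above concrete, they can be transcribed
into Python roughly as follows. This is only a hand translation of the
C-style expressions for illustration; the function names are my own,
and gettext itself evaluates the expression from the PO header rather
than anything like this.

```python
def plural_two_forms(n):
    """plural=n == 1 ? 0 : 1  (only 1 is singular)."""
    return 0 if n == 1 else 1


def plural_french(n):
    """plural=n>1  (0 and 1 are both singular)."""
    return int(n > 1)


def plural_three_forms(n):
    """The three-form expression quoted above, clause by clause."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2
```

Under the last rule 101 yields index 0, the same form as 1, while 11
yields index 2, which is the kind of repetition mentioned earlier.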

Finally, the actual msgstrs for the plural forms are listed using an
indexed syntax, i.e.

     msgid UNTRANSLATED-STRING-SINGULAR
     msgid_plural UNTRANSLATED-STRING-PLURAL
     msgstr[0] TRANSLATED-STRING-CASE-0
     ...
     msgstr[N] TRANSLATED-STRING-CASE-N
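
Applied to the Dutch case quoted at the top, a lookup over such indexed
entries might be sketched like this. CATALOG, plural_index and this
ngettext are hypothetical stand-ins, not Mailman or Python gettext
code; the msgids are a rewritten form of the HTMLFormatter.py string,
and the Dutch strings and the two-form rule are assumptions made for
the example.

```python
# Hypothetical in-memory catalog: (msgid, msgid_plural) -> [msgstr[0], msgstr[1]]
CATALOG = {
    ("%d private member not shown", "%d private members not shown"): [
        "%d privé lid niet getoond",
        "%d privé leden niet getoond",
    ],
}


def plural_index(n):
    """Assumed header: Plural-Forms: nplurals=2; plural=n == 1 ? 0 : 1;"""
    return 0 if n == 1 else 1


def ngettext(singular, plural, n):
    """Return the translated form for n, or the English fallback."""
    forms = CATALOG.get((singular, plural))
    if forms is None:
        # No translation: fall back to the untranslated English strings.
        return singular if n == 1 else plural
    return forms[plural_index(n)]
```

The translator then writes "lid" and "leden" as whole words in separate
entries, and the program never has to glue a suffix onto a noun.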

People have varying opinions about the complexity of this
technology. I'm not proposing that Python gettext support the same
notation; I'm just pointing out that plurals are more difficult than
you may think.

Regards,
Martin