[I18n-sig] Re: Patch 101320: doc strings

Martin von Loewis loewis@informatik.hu-berlin.de
Mon, 4 Sep 2000 15:29:25 +0200 (MET DST)


[François Pinard]
> People might fear that the POT file is too time consuming to load all
> at once.  If this is the case, then the problem lies in the implementation
> of the `gettext' interface.  I repeated all along that it should be lazy
> evaluated, exactly to avoid that an insufficient implementation becomes
> an excuse to split a textual domain in many smaller ones.

I have started translating the Python doc strings into German, and
have covered about 30% so far. Using the Python 2 gettext.py, I did not
experience any noticeable delay in loading the mo file on my 300MHz
machine. While I agree that lazy loading may become necessary, I think
it is OK to implement the feature when the problem actually arises.
I'm pretty certain you can implement lazy access without changing the
existing API.
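
A minimal sketch of that idea in present-day Python, assuming the
catalog can be produced by some zero-argument loader. The names
LazyTranslations and load_catalog are made up for illustration and are
not part of gettext.py; only the gettext() method name mirrors the
existing interface.

```python
class LazyTranslations:
    """Same gettext() call as before; the catalog is loaded on first use."""

    def __init__(self, loader):
        self._loader = loader      # hypothetical: callable returning a dict
        self._catalog = None       # nothing is read until a lookup happens

    def gettext(self, message):
        if self._catalog is None:
            self._catalog = self._loader()
        # Fall back to the untranslated message, as gettext does.
        return self._catalog.get(message, message)


load_calls = []                    # records how often the loader runs


def load_catalog():
    """Stand-in for parsing an mo file (assumption for this sketch)."""
    load_calls.append(1)
    return {"file": "Datei", "print": "drucken"}


translations = LazyTranslations(load_catalog)
```

Callers keep writing translations.gettext("file"); the cost of loading
is paid only by programs that actually look something up.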

> People might fear that the PO file would take too much memory.  On
> modern systems, there is no problem `mmap'ing a file, as virtual
> address space is more than enough to hold even big translation
> files.  The Python difficulty, here, is that it is (nicely) portable
> to some less capable systems, where `mmap' has no equivalent.

The Python 2 mmap works on Unix and Win32. It probably is the best
solution if available.
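
As a rough illustration, here is how a lookup over a mapped mo file
might look in present-day Python. It assumes the usual little-endian mo
header (magic, revision, string count, offsets of the two index tables)
and originals sorted bytewise; MappedCatalog, build_mo and demo are
hypothetical helpers written only for this sketch.

```python
import mmap
import os
import struct
import tempfile

MAGIC = 0x950412DE  # little-endian mo magic number


def build_mo(catalog):
    """Build a tiny mo image without a hash table (for the demo only)."""
    items = sorted((k.encode(), v.encode()) for k, v in catalog.items())
    n = len(items)
    pos = 28 + 16 * n                      # strings follow both index tables
    tables, blob = [], b""
    for column in (0, 1):                  # 0 = originals, 1 = translations
        for pair in items:
            s = pair[column]
            tables.append(struct.pack("<2I", len(s), pos))
            blob += s + b"\0"
            pos += len(s) + 1
    header = struct.pack("<7I", MAGIC, 0, n, 28, 28 + 8 * n, 0, 0)
    return header + b"".join(tables) + blob


class MappedCatalog:
    """Looks messages up directly in the mapping; nothing is copied up front."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        magic, _rev, self._n, self._orig, self._trans = struct.unpack_from(
            "<5I", self._map, 0)
        if magic != MAGIC:
            raise ValueError("not a little-endian mo file")

    def _string(self, table, index):
        length, offset = struct.unpack_from("<2I", self._map, table + 8 * index)
        return self._map[offset:offset + length]

    def gettext(self, message):
        key = message.encode()
        lo, hi = 0, self._n
        while lo < hi:                     # binary search over sorted originals
            mid = (lo + hi) // 2
            candidate = self._string(self._orig, mid)
            if candidate == key:
                return self._string(self._trans, mid).decode()
            if candidate < key:
                lo = mid + 1
            else:
                hi = mid
        return message

    def close(self):
        self._map.close()
        self._file.close()


def demo():
    fd, path = tempfile.mkstemp(suffix=".mo")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(build_mo({"file": "Datei", "print": "drucken"}))
        catalog = MappedCatalog(path)
        try:
            return catalog.gettext("file"), catalog.gettext("missing")
        finally:
            catalog.close()
    finally:
        os.remove(path)
```

Only the pages that a lookup touches need to be brought into memory,
which is the attraction of mapping over reading the whole file.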

> In my opinion, the solution might then be for these systems to load
> the MO hash tables only, and then retrieve messages from disk.

If you load the hash tables, does this give enough information so that
you need only two seek(2) calls on average? If so, it would probably
be good if there were a) documentation for the hash table format,
and/or b) an implementation of it in Python.
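
To make the question concrete, here is a sketch of such a reader. It is
an assumption-laden illustration, not a description of the actual format:
the PJW-style hash and the probing step are written from my
understanding of GNU gettext and should be checked against its sources.
SeekingCatalog, build_mo, hashpjw and probe are hypothetical names. The
demo builds its own file, so it is at least self-consistent.

```python
import io
import struct

MAGIC = 0x950412DE  # little-endian mo magic number


def hashpjw(data):
    """32-bit PJW-style hash (assumed to resemble the one gettext uses)."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & 0xFFFFFFFF
        g = h & 0xF0000000
        if g:
            h ^= g >> 24
            h ^= g
    return h


def probe(h, size):
    """Yield slot numbers by double hashing; size should be a prime > 2."""
    idx, incr = h % size, 1 + h % (size - 2)
    while True:
        yield idx
        idx = (idx + incr) % size


def build_mo(catalog, hash_size=7):
    """Build a tiny mo image with a hash table (hash_size > len(catalog))."""
    items = sorted((k.encode(), v.encode()) for k, v in catalog.items())
    n = len(items)
    hash_off = 28 + 16 * n
    pos = hash_off + 4 * hash_size
    tables, blob = [], b""
    for column in (0, 1):                  # 0 = originals, 1 = translations
        for pair in items:
            s = pair[column]
            tables.append(struct.pack("<2I", len(s), pos))
            blob += s + b"\0"
            pos += len(s) + 1
    slots = [0] * hash_size                # 0 = empty, else string index + 1
    for i, (key, _) in enumerate(items):
        for idx in probe(hashpjw(key), hash_size):
            if slots[idx] == 0:
                slots[idx] = i + 1
                break
    header = struct.pack("<7I", MAGIC, 0, n, 28, 28 + 8 * n,
                         hash_size, hash_off)
    return (header + b"".join(tables)
            + struct.pack("<%dI" % hash_size, *slots) + blob)


class SeekingCatalog:
    """Keeps only the index and hash tables in memory; strings stay on disk."""

    def __init__(self, fp):
        self.fp, self.seeks = fp, 0
        magic, _rev, n, orig, trans, hsize, hoff = struct.unpack(
            "<7I", fp.read(28))
        if magic != MAGIC:
            raise ValueError("not a little-endian mo file")
        fp.seek(orig)
        self.otab = list(struct.iter_unpack("<2I", fp.read(8 * n)))
        fp.seek(trans)
        self.ttab = list(struct.iter_unpack("<2I", fp.read(8 * n)))
        fp.seek(hoff)
        self.slots = struct.unpack("<%dI" % hsize, fp.read(4 * hsize))

    def _read(self, length, offset):
        self.fp.seek(offset)
        self.seeks += 1                    # count per-lookup seeks only
        return self.fp.read(length)

    def gettext(self, message):
        key = message.encode()
        for idx in probe(hashpjw(key), len(self.slots)):
            slot = self.slots[idx]
            if slot == 0:                  # empty slot: message not present
                return message
            entry = self.otab[slot - 1]
            # Comparing lengths first avoids a seek on most collisions.
            if entry[0] == len(key) and self._read(*entry) == key:
                return self._read(*self.ttab[slot - 1]).decode()


catalog = SeekingCatalog(io.BytesIO(build_mo(
    {"file": "Datei", "folder": "Ordner", "print": "drucken"})))
```

In this sketch a successful lookup costs one seek to verify the
original and one to fetch the translation, unless two colliding
originals happen to have the same length.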

> The last fear might be that the POT file might be too big for
> translators to handle.

That indeed is my concern. The largest catalog so far was Lynx
(AFAICT), with 1100 messages. I guess gcc might also be pretty large.

> One of the goals of the Translation Project has been to promote a
> clean separation of responsibilities between software maintainers
> and national translators, as software maintainers spontaneously have
> a wide variety of (often contradictory) opinions about how (and even
> when!) translators should work :-).  It is a difficult aspect of the
> overall thing, in fact.

I think for the Python docstring catalog, we can give some guidance -
perhaps by not shipping everything at once, but asking translators to
complete the most interesting parts first (like the docstrings for
the builtin core functions).

I'm certain it will take some time to get translations back, so if
we want to have something in the next release (after 2.0), we should
start today.

Regards,
Martin