n00b question on spacing

Dave Angel davea at davea.name
Sat Jun 22 19:56:21 EDT 2013


On 06/22/2013 07:37 PM, Chris Angelico wrote:
> On Sun, Jun 23, 2013 at 9:28 AM, Dave Angel <davea at davea.name> wrote:
>> On 06/22/2013 07:12 PM, Chris Angelico wrote:
>>>
>>> On Sun, Jun 23, 2013 at 1:24 AM, Rick Johnson
>>> <rantingrickjohnson at gmail.com> wrote:
>>>>
>>>>     _fmtstr = "Item wrote to MongoDB database {0}, {1}"
>>>>     msg = _fmtstr.format(_arg1, _arg2)
>>>
>>>
>>> As a general rule, I don't like separating format strings and their
>>> arguments. That's one of the more annoying costs of i18n. Keep them in
>>> a single expression if you possibly can.
>>>
>>
>> On the contrary, i18n should be done with config files.  The format string

**as specified in the physical program**

>> is the key to the actual string which is located in the file/dict.
>> Otherwise you're shipping separate source files for each language -- blecch.

What I was trying to say is that the programmereze format string in the 
code is replaced at runtime by the French format string in the config file.

>
> The simplest way to translate is to localize the format string; that's
> the point of .format()'s named argument system (since it lets you
> localize in a way that reorders the placeholders). What that does is
> it puts the format string away in a config file, while the replaceable
> parts are here in the source. That's why I say that's a cost of i18n -
> it's a penalty that has to be paid in order to move text strings away.



Certainly the reorderability of the format string is significant.  Not
only can the placeholders be reordered, but a value may also appear more
than once if needed.  (What's missing is decent handling of such things
as singular/plural, where one (or a few) words of the format string need
a different version per language, chosen by whether a value is
exactly 1.)
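
To make the reordering and repetition concrete, here is a minimal sketch. The first format string is the one from Rick's example; the reordered string, the argument values, and the pluralize() helper are made up for illustration. The helper only shows the "exactly 1" test by hand, and it would not cover languages with more than two plural forms.

```python
# Format string from the example earlier in the thread.
fmt_default = "Item wrote to MongoDB database {0}, {1}"

# Hypothetical "translated" string: placeholders reordered, {0} used twice.
fmt_reordered = "{1} / {0}: item wrote to MongoDB database {0}"

# Illustrative argument values (assumptions, not from the thread).
args = ("shop", "orders")

msg_default = fmt_default.format(*args)
msg_reordered = fmt_reordered.format(*args)


def pluralize(count, singular, plural):
    """Pick one of two templates by testing whether count is exactly 1."""
    template = singular if count == 1 else plural
    return template.format(count=count)


one = pluralize(1, "{count} item wrote", "{count} items wrote")
many = pluralize(3, "{count} item wrote", "{count} items wrote")
```

The same arguments feed both strings, so only the string has to change per language; the plural choice, by contrast, needs extra logic outside the format string.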

But the language is missing the indirection I described.  So you have to
use a wrapper (a function or whatever) to look up the actual format
string in the config file.  My point is that by making that file
equivalent to a dict, you get an executable program in "programmereze"
before creating any config files, and it can still handle any real
language with one config file per language.
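
A rough sketch of the wrapper I have in mind, assuming the config file has already been loaded into a plain dict. The names make_translator and FRENCH, the argument values, and the French wording are all invented for illustration; a missing key simply falls back to the string in the source.

```python
# Stand-in for one language's config file, already loaded into a dict.
# Keys are the "programmereze" strings exactly as written in the source code.
FRENCH = {
    "Item wrote to MongoDB database {0}, {1}":
        "Article inscrit dans la base MongoDB {0}, {1}",
}


def make_translator(catalog):
    """Return a lookup function; an unknown key falls back to the key itself."""
    def translate(fmtstr):
        return catalog.get(fmtstr, fmtstr)
    return translate


# No config file yet: the program still runs, speaking programmereze.
_ = make_translator({})
untranslated = _("Item wrote to MongoDB database {0}, {1}").format("shop", "orders")

# With a language file loaded, the same source line produces the translation.
_ = make_translator(FRENCH)
translated = _("Item wrote to MongoDB database {0}, {1}").format("shop", "orders")
```

Because the key is the readable string itself, the source line still says what it means even before any translation exists.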

This is much preferable to the usual numeric lookup, where somebody 
specifies the 17th format string to be used at this place in the code. 
Even when you use C++ names, they're still only a crude approximation to 
the real purpose of the string.

>
>> The program that's intended to be internationalized is written using
>> "programmereze" strings.  That's a strange inhuman language that's only
>> approximately comprehensible by the developer and close associates. Then
>> that gets translated into a bunch of language-specific config files, with
>> English probably being one of them.
>
> Heh. That's one way of looking at it... I don't really know what
> language we speak; at what point is it deemed a separate dialect, and
> at what point a unique language? Hmmm.
>
> ChrisA
>


-- 
DaveA



More information about the Python-list mailing list