From al at inventwithpython.com Wed Jan 21 09:03:41 2015
From: al at inventwithpython.com (Al Sweigart)
Date: Wed, 21 Jan 2015 21:03:41 +1300
Subject: [Idle-dev] I18N prep work - convert strings to use format()
Message-ID:

I'd like to change the old-style string formatting from '%s %s' %
(foo, bar) to the format() string method version. This is not just a
cosmetic change, but would be part of the prep work for I18N support
for IDLE.

For some languages, the order that variables would need to be
interpolated into strings would not be the same. For example, in
English you could have:

color = 'black'
animal = 'cat'
"The %s %s's toy." % (color, animal)

While in Spanish, this would be:

color = 'negro'
animal = 'gato'
"El juguete del %s %s." % (animal, color)

However, this defeats translating the "The %s %s's toy." string for
I18N. Using the format() string method fixes this: "The {color}
{animal}'s toy." could be translated as "El juguete del {animal}
{color}."

I'd like to go through the strings in IDLE and convert them to use
format(). This wouldn't be all of the strings: only those that are
user-facing and have multiple %s (or other) conversion specifiers
would need to be translated.

This is a change that touches on a lot of files, so I wanted to see if
anyone could foresee issues with this change before I start on it.

-Al

From taleinat at gmail.com Wed Jan 21 16:51:15 2015
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 21 Jan 2015 17:51:15 +0200
Subject: [Idle-dev] I18N prep work - convert strings to use format()
In-Reply-To:
References:
Message-ID:

Hi Al,

If i18n is your goal, are you sure that string.format() is the way to go?

string.format()'s syntax was not defined with this use-case in mind.
For example, it supports a lot of things which are not relevant for
translation, which would leave a lot of room for user errors of the
sort that use valid string.format() syntax but produce unwanted
results. A more significant drawback is that learning the syntax is
unnecessarily difficult for someone who'd just want to make a
translation.

If i18n is the goal, I strongly suggest using a tool that was made for
the purpose.

- Tal Einat

From tjreedy at udel.edu Wed Jan 21 22:15:42 2015
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 21 Jan 2015 16:15:42 -0500
Subject: [Idle-dev] I18N prep work - convert strings to use format()
In-Reply-To:
References:
Message-ID:

On 1/21/2015 10:51 AM, Tal Einat wrote:
> On Wed, Jan 21, 2015 at 10:03 AM, Al Sweigart wrote:
>>
>> I'd like to change the old-style string formatting from '%s %s' %
>> (foo, bar) to the format() string method version. This is not just a
>> cosmetic change, but would be part of the prep work for I18N support
>> for IDLE.

Since I currently regard internationalization as pie in the sky, I would
tend to see such changes as cosmetic. If I were editing format strings
anyway, I might then make the change simply because I prefer it.

>> For some languages, the order that variables would need to be
>> interpolated into strings would not be the same. For example, in
>> English you could have:
>>
>> color = 'black'
>> animal = 'cat'
>> "The %s %s's toy." % (color, animal)
>>
>> While in Spanish, this would be:
>>
>> color = 'negro'
>> animal = 'gato'
>> "El juguete del %s %s." % (animal, color)

This can also be done with % formats.

>>> "The %(color)s %(anim)s's toy." % {'color': 'black', 'anim': 'cat'}
"The black cat's toy."
>>> "El juguete del %(anim)s %(color)s." % {'color': 'negro', 'anim': 'gato'}
'El juguete del gato negro.'

That said, I prefer .format, and also note that the Idle replacement
strings (as opposed to the format) should not need translation.

>> However, this defeats translating the "The %s %s's toy." string for
>> I18N. Using the format() string method fixes this: "The {color}
>> {animal}'s toy." could be translated as "El juguete del {animal}
>> {color}."
>>
>> I'd like to go through the strings in IDLE and convert them to use
>> format(). This wouldn't be all of the strings: only those that are
>> user-facing and have multiple %s (or other) conversion specifiers
>> would need to be translated.

The only examples I can think of where this applies are warning messages
such as

warning = ('\n Warning: configHandler.py - IdleConf.GetOption -\n'
           ' invalid %r value for configuration option %r\n'
           ' from section %r: %r' %
           (type, option, section,
            self.userCfg[configType].Get(section, option, raw=raw)))

If I18n did happen, these would be a low priority, as I18n advocates have
said menus first, then configuration dialogs next. Unlike exception
messages, which would not be translated, such messages should hopefully
never be seen, and in a classroom, an instructor would need to interpret
them anyway and say what to do.

When I work on confighandler (which definitely needs work) and review its
warning messages, I should like to revise this one (and others like it) to
specify the file with the problem, so one would know where to make a fix.
For this one, the file depends on the user config directory and the config
type. (Telling users where the message comes from is pretty useless.) I
might rewrite as

warning = ("Config Warning: user configuration file {fname}, "
           "section [{sname}], option {oname} must be type {tname}. "
           "Entry {uval} is not valid; using backup value instead."
           .format(fname=, sname=section, oname=option,
                   tname=type, uval=...))

>> This is a change that touches on a lot of files, so I wanted to see if
>> anyone could foresee issues with this change before I start on it.

Not useful now in itself.

> If i18n is your goal, are you sure that string.format() is the way to go?

Either kind of format can be translated if one uses field names. If one
replaces by position, .format is better since one can switch from '{0} {1}'
to '{1} {0}' without changing the .format call. With 4 or 5 fields, as in
the real example above, I like field names. For code that should never be
run, the run time cost is irrelevant.

--
Terry Jan Reedy

From al at inventwithpython.com Thu Jan 22 01:46:01 2015
From: al at inventwithpython.com (Al Sweigart)
Date: Thu, 22 Jan 2015 13:46:01 +1300
Subject: [Idle-dev] I18N prep work - convert strings to use format()
In-Reply-To:
References:
Message-ID:

Just for some clarifications, the conversion to .format() would not do
the I18N translation itself. It's simply to set up the underlying
strings in a way that makes it possible for the translated strings to
have the interpolated parts in a different order.
The actual translation stuff will be handled by the gettext module in the
standard way. The translator does not have to be a software developer
and won't see the .format() code (or any code) at all.

Aside: The gettext module (via gettext.install()) adds a _() function to
the builtins namespace, so instead of print('Hello world') you would
change the source code to print(_('Hello world')) and gettext has _()
return the appropriate string for the language setting. The only thing
the translator sees in their software tools is 'Hello world'.

I admit this is a low priority and doesn't produce immediate benefit,
but it's a task that I'm willing to take on myself. After that, I'll
continue the bug 17776 work. IDLE I18N is my personal objective, and I
don't want to burden or task others with it any more than necessary.

I agree that what should be translated are the user-facing UI messages
(menus, config, etc). Error messages are low priority. But it's easier
to get the strings set up correctly the first time so translators
don't have to repeat work.

From guido at python.org Thu Jan 22 01:52:14 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 21 Jan 2015 16:52:14 -0800
Subject: [Idle-dev] I18N prep work - convert strings to use format()
In-Reply-To:
References:
Message-ID:

I think the string.Template facility was designed with i18n in mind:
https://docs.python.org/3/library/string.html?highlight=string.template#string.Template
(or the PY2 equivalent).

On Wed, Jan 21, 2015 at 4:46 PM, Al Sweigart wrote:

> Just for some clarifications, the conversion to .format() would not do
> the I18N translation itself. It's simply to set up the underlying
> strings in a way that makes it possible for the translated strings to
> have the interpolated parts in a different order. The actual
> translation stuff will be handled by the gettext module in the
> standard way.

--
--Guido van Rossum (python.org/~guido)
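
For illustration only, here is a minimal sketch of the point debated in
this thread: %-formatting with named fields, str.format() with named or
numbered fields, and string.Template all let a translated string reorder
its fields without touching the calling code. The SPANISH dictionary and
the translate() helper are hypothetical stand-ins for a real gettext
catalog and its _() function; none of this is taken from IDLE's source.

```python
from string import Template

# Hypothetical stand-in for a gettext catalog: English template -> Spanish.
SPANISH = {
    "The %(color)s %(animal)s's toy.": "El juguete del %(animal)s %(color)s.",
    "The {color} {animal}'s toy.": "El juguete del {animal} {color}.",
    "The {0} {1}'s toy.": "El juguete del {1} {0}.",
    "The $color $animal's toy.": "El juguete del $animal $color.",
}


def translate(message, catalog):
    """Hypothetical _() replacement: fall back to the original message."""
    return catalog.get(message, message)


def render_all(catalog, color, animal):
    """Render the same sentence once per formatting style."""
    fields = {"color": color, "animal": animal}
    template = Template(translate("The $color $animal's toy.", catalog))
    return [
        # %-formatting with named fields (Terry Reedy's example)
        translate("The %(color)s %(animal)s's toy.", catalog) % fields,
        # str.format() with named fields (Al Sweigart's proposal)
        translate("The {color} {animal}'s toy.", catalog).format(**fields),
        # str.format() with numbered fields: '{0} {1}' vs '{1} {0}'
        translate("The {0} {1}'s toy.", catalog).format(color, animal),
        # string.Template (Guido van Rossum's suggestion)
        template.substitute(fields),
    ]


english = render_all({}, "black", "cat")
spanish = render_all(SPANISH, "negro", "gato")
print(english)
print(spanish)
```

In each case the call site stays the same between languages and only the
catalog entry changes. A plain '%s %s' string cannot do this, because the
order of the arguments is fixed in the code.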