From jim@zope.com Fri Feb 21 17:51:18 2003
From: jim@zope.com (Jim Fulton)
Date: Fri, 21 Feb 2003 12:51:18 -0500
Subject: [I18n-sig] I18n sprints
Message-ID: <3E566716.1000402@zope.com>

Hi,

I'd like to invite people from this list to participate in some sprints
scheduled to focus on I18n/L10n for Zope and other Python applications.

A sprint is a multi-day focused development session, in which developers
pair in a room and focus on building a particular subsystem. A sprint is
organized with a coach leading the session. The coach sets the agenda,
tracks activities, and keeps the development moving. The developers work
in pairs using XP's pair programming approach.

The sprint approach works best when the first few hours are spent
getting oriented -- presenting a tutorial for the development material,
laying out the stories to tackle for the day, getting everyone a CVS
checkout to work with.

See also the section "Sprinting Explained" at the bottom of this page:

  http://dev.zope.org/Wikis/DevSite/Projects/ComponentArchitecture/SprintSchedule

The two sprints are:

- PyCon, http://www.python.org/pycon/, sprint, March 24-25, 2003 in
  Washington, D.C. See http://dev.zope.org/Zope3/PyConSprint.

- Louvain-la-neuve Sprint April 8-11, 2003, in Louvain-la-neuve,
  Belgium. See http://dev.zope.org/Zope3/LouvainLaNeuveSprint.

Jim

--
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

From bruno@clisp.org Mon Feb 24 13:31:42 2003
From: bruno@clisp.org (Bruno Haible)
Date: Mon, 24 Feb 2003 14:31:42 +0100 (CET)
Subject: [I18n-sig] bugs in gettext.py plural handling
Message-ID: <15962.7870.534282.204499@honolulu.ilog.fr>

Hi,

Testing GNU gettext's integration test with Python 2.3a2, I see that
there are several bugs relating to plural forms and the ngettext
function.

1)

$ python
import gettext
germanic = gettext.c2py('!(n == 1)')
Traceback (most recent call last):
  File "", line 1, in ?
  File "/packages/gnu-inst-python/2.3a2/lib/python2.3/gettext.py", line 110, in c2py
    stack[-1] += '(%s)' % s
IndexError: list index out of range

The ! operator is treated incorrectly if not followed by a space.
Here is a fix.

*** gettext.py.bak	2003-02-22 02:28:17.000000000 +0100
--- gettext.py	2003-02-22 21:37:33.000000000 +0100
***************
*** 88,95 ****
      plural = plural.replace('&&', ' and ')
      plural = plural.replace('||', ' or ')
  
!     expr = re.compile(r'\![^=]')
!     plural = expr.sub(' not ', plural)
  
      # Regular expression and replacement function used to transform
      # "a?b:c" to "test(a,b,c)".
--- 88,95 ----
      plural = plural.replace('&&', ' and ')
      plural = plural.replace('||', ' or ')
  
!     expr = re.compile(r'\!([^=])')
!     plural = expr.sub(' not \\1', plural)
  
      # Regular expression and replacement function used to transform
      # "a?b:c" to "test(a,b,c)".

2)

Unbalanced parentheses in a plural expression don't give an error
'unbalanced parenthesis in plural form'. Example:

$ python
import gettext
germanic = gettext.c2py('n =)= 1')

Instead we get a weird error message

tokenize.TokenError: ('EOF in multi-line statement', (2, 0))

Furthermore even if this error were avoided, we would get

IndexError: list index out of range

Here is a fix for the second half of this bug. I don't know Python
enough to fix the first half as well.

*** gettext.py.bak	2003-02-22 02:28:17.000000000 +0100
--- gettext.py	2003-02-22 21:37:33.000000000 +0100
***************
*** 104,110 ****
          if c == '(':
              stack.append('')
          elif c == ')':
!             if len(stack) == 0:
                  raise ValueError, 'unbalanced parenthesis in plural form'
              s = expr.sub(repl, stack.pop())
              stack[-1] += '(%s)' % s
--- 104,110 ----
          if c == '(':
              stack.append('')
          elif c == ')':
!             if len(stack) == 1:
                  raise ValueError, 'unbalanced parenthesis in plural form'
              s = expr.sub(repl, stack.pop())
              stack[-1] += '(%s)' % s

3)

Here's my test code (in ISO-8859-1):

===================== prog.py ============================
import sys
import gettext

n = int(sys.argv[1])

gettext.textdomain('prog')
gettext.bindtextdomain('prog', '.')

print gettext.gettext("'Your command, please?', asked the waiter.")
print gettext.ngettext("a piece of cake","%(count)d pieces of cake",n) \
      % { 'count': n }
print gettext.gettext("%(oldCurrency)s is replaced by %(newCurrency)s.") \
      % { 'oldCurrency': "FF", 'newCurrency' : "EUR" }
======================= fr.po ============================
msgid ""
msgstr ""
"Content-Type: text/plain; charset=ISO-8859-1\n"
"Plural-Forms: nplurals=2; plural=(n > 1);\n"

msgid "'Your command, please?', asked the waiter."
msgstr "«Votre commande, s'il vous plait», dit le garçon."

# Les gateaux allemands sont les meilleurs du monde.
#, python-format
msgid "a piece of cake"
msgid_plural "%(count)d pieces of cake"
msgstr[0] "un morceau de gateau"
msgstr[1] "%(count)d morceaux de gateau"

# Reverse the arguments.
#, python-format
msgid "%(oldCurrency)s is replaced by %(newCurrency)s."
msgstr "%(newCurrency)s remplace %(oldCurrency)s."
==========================================================

$ mkdir -p fr/LC_MESSAGES
$ msgfmt -o fr/LC_MESSAGES/prog.mo fr.po
$ LANGUAGE= LC_ALL=fr_FR python prog.py 2
«Votre commande, s'il vous plait», dit le garçon.
Traceback (most recent call last):
  File "prog.py", line 10, in ?
    print gettext.ngettext("a piece of cake","%(count)d pieces of cake",n) \
  File "/packages/gnu-inst-python/2.3a2/lib/python2.3/gettext.py", line 445, in ngettext
    return dngettext(_current_domain, msgid1, msgid2, n)
  File "/packages/gnu-inst-python/2.3a2/lib/python2.3/gettext.py", line 437, in dngettext
    return t.ngettext(msgid1, msgid2, n)
  File "/packages/gnu-inst-python/2.3a2/lib/python2.3/gettext.py", line 294, in ngettext
    return self._catalog[(msgid1, self.plural(n))]
AttributeError: GNUTranslations instance has no attribute 'plural'

Why does it have no 'plural' attribute?

* Testing that the header entry starts with 'Project-Id-Version:' is
  not appropriate because it excludes valid header entries. The gettext
  tools may remove or move this line in future versions.

* libintl and msgfmt assume a fallback of "n != 1" if no Plural-Forms:
  entry is provided. In the same way, self.plural should use "n != 1" as
  a fallback.

Here is the fix for both.

*** gettext.py.bak	2003-02-22 02:28:17.000000000 +0100
--- gettext.py	2003-02-22 21:37:33.000000000 +0100
***************
*** 114,119 ****
--- 114,121 ----
  
      return eval('lambda n: int(%s)' % plural)
  
+ _germanic_plural = lambda n: int(n != 1)
+ 
  
  
  def _expand_lang(locale):
***************
*** 225,230 ****
--- 227,233 ----
          # Parse the .mo file header, which consists of 5 little endian 32
          # bit words.
          self._catalog = catalog = {}
+         self.plural = _germanic_plural
          buf = fp.read()
          buflen = len(buf)
          # Are we big endian or little endian?
***************
*** 258,264 ****
              else:
                  raise IOError(0, 'File is corrupt', filename)
              # See if we're looking at GNU .mo conventions for metadata
!             if mlen == 0 and tmsg.lower().startswith('project-id-version:'):
                  # Catalog description
                  for item in tmsg.split('\n'):
                      item = item.strip()
--- 261,267 ----
              else:
                  raise IOError(0, 'File is corrupt', filename)
              # See if we're looking at GNU .mo conventions for metadata
!             if mlen == 0:
                  # Catalog description
                  for item in tmsg.split('\n'):
                      item = item.strip()

4)

Btw, I have to correct a misimpression. It was claimed in
http://mail.python.org/pipermail/i18n-sig/2002-November/001514.html
that GNU xgettext 0.11.5 doesn't support ngettext in Python. But it does
if you add the command line options "-kgettext -kngettext:1,2". The
reason is that when xgettext 0.11.5 was released, Python didn't have the
ngettext function, and no one told me that it would.

So for example,

$ xgettext -kgettext -kngettext:1,2 -o - prog.py

produces the .pot file for prog.py above.

Bruno

From Tex Texin Fri Feb 28 05:44:59 2003
From: Tex Texin (Tex Texin)
Date: Fri, 28 Feb 2003 00:44:59 -0500
Subject: [I18n-sig] Unicode Conference Early Bird rates about to expire
Message-ID: <3E5EF75B.D511E039@i18nguy.com>

March 1 is the deadline to get a discount on the conference and hotel
rates, so register now!

Register now! Don't miss out on early bird conference and hotel rates!

*************************************************************************
Twenty-third Internationalization and Unicode Conference (IUC23)

> Early bird registration rate valid to March 1.
> Hotel guest room group rate valid to March 1.
  Thereafter reservations accepted on space and rate availability.
**************************************************************************
     Twenty-third Internationalization and Unicode Conference (IUC23)
      Unicode, Internationalization, the Web: The Global Connection
                    http://www.unicode.org/iuc/iuc23
                           March 24-26, 2003
                         Prague, Czech Republic
*************************************************************************

NEWS

> Check out the updated Conference program and register now via the
  Conference Web site ( http://www.unicode.org/iuc/iuc23 ). The web site
  includes abstracts of talks and speakers' biographies so you can see
  the industry leaders that will be there and the hot topics for
  internationalization and Unicode in 2003!

> Sign up for the Workshop on Managing Localization Projects, organized
  by XenCraft, and taking place in the same venue on 27 March -- See:
  http://www.unicode.org/iuc/iuc23

> Attend the new Showcase to find out more about products supporting
  the Unicode Standard, and products and services that can help you
  globalize/localize your software, documentation and Internet content.

> Be an Exhibitor! Show off your product at the premier technical
  conference worldwide for both software and Web internationalization.
  See: http://www.unicode.org/iuc/iuc23/showcase.html

CONFERENCE PROGRAM

The conference features tutorials, lectures, and panel discussions that
provide coverage of standards, best practices, and recent advances in
the globalization of software and the Internet.

See the program: http://www.unicode.org/iuc/iuc23/program.html

GLOBAL COMPUTING SHOWCASE

For the first time, we will have an Exhibitors' track as part of the
Conference, in addition to the updated Showcase Exhibition.

For more information, please visit the Web site at:
http://www.unicode.org/iuc/iuc23/showcase.html

Showcase participants include:

  Agfa Monotype Corporation
  Alchemy Software Development Ltd.
  Basis Technology Corporation
  Moravia IT
  Multilingual Computing, Inc.
  Sun Microsystems, Inc.

Don't be left out!! Sign up now!

CONFERENCE VENUE

The Conference will take place in lovely, historic Prague:

  Marriott Prague Hotel
  V Celnici 8
  Prague, 110 00
  Czech Republic
  Tel: (+420 2) 2288 8888
  Fax: (+420 2) 2288 8889

CONFERENCE SPONSORS

  Agfa Monotype Corporation
  Basis Technology Corporation
  Microsoft Corporation
  Moravia IT
  Sun Microsystems, Inc.
  World Wide Web Consortium (W3C)

CONFERENCE MANAGEMENT

  Global Meeting Services Inc.
  8949 Lombard Place, #416
  San Diego, CA 92122, USA
  Tel: +1 858 638 0206 (voice)
       +1 858 638 0504 (fax)
  Email: info@global-conference.com
     or: conference@unicode.org

THE UNICODE CONSORTIUM

The Unicode Consortium was founded as a non-profit organization in 1991.
It is dedicated to the development, maintenance and promotion of The
Unicode Standard, a worldwide character encoding. The Unicode Standard
encodes the characters of the world's principal scripts and languages,
and is code-for-code identical to the international standard ISO/IEC
10646. In addition to cooperating with ISO on the future development of
ISO/IEC 10646, the Consortium is responsible for providing character
properties and algorithms for use in implementations. Today the
membership base of the Unicode Consortium includes major computer
corporations, software producers, database vendors, research
institutions, international agencies and various user groups.

For further information on the Unicode Standard, visit the Unicode Web
site at http://www.unicode.org or e-mail

* * * * *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc. Used with permission.