From diedrich@xmission.com  Tue Jan  1 19:28:39 2002
From: diedrich@xmission.com (Karl T. Diedrich)
Date: Tue, 1 Jan 2002 13:28:39 -0600
Subject: [I18n-sig] python gettext example?
Message-ID: <200201011928.g01JSdm02291@hanguk.homenet.org>

Hello,
Are there any examples of Python programs using gettext? I am trying to work 
up a simple example so I can internationalize an open source Python project I 
have. 

I think I know how to prepare the source code, but I don't understand 
how it all works together.

I put this at the top of files:

from gettext import gettext as _
from gettext import bindtextdomain, textdomain
from os import sep
from locale import setlocale, LC_ALL

LOCALE_PREFIX = "%susr" % (sep)
LOCALE_DIR = "%s%sshare%slocale" % ( LOCALE_PREFIX, sep, sep )
PACKAGE = "deodas"

setlocale( LC_ALL )
bindtextdomain( PACKAGE, LOCALE_DIR )
textdomain( PACKAGE )

# in the code
string = _( u'A sentence to translate.' )

The source code looks like 
module/__init__.py 
module/etc...
po/POTFILES.in
po/es.po
po/es.mo

I used 
	pygettext filename  &
	msgfmt.py es.po 
to pull out the strings and make the translation catalog.  

How do I run the program using a translation file?

-- 
Karl Diedrich
http://deodas.sourceforge.net/


From martin@v.loewis.de  Tue Jan  1 21:29:30 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 1 Jan 2002 22:29:30 +0100
Subject: [I18n-sig] python gettext example?
In-Reply-To: <200201011928.g01JSdm02291@hanguk.homenet.org>
 (diedrich@xmission.com)
References: <200201011928.g01JSdm02291@hanguk.homenet.org>
Message-ID: <200201012129.g01LTUJ13439@mira.informatik.hu-berlin.de>

> Are there an examples of Python programs using gettext. 

mailman does.

> I put this at the top of files:
> 
> from gettext import gettext as _
> from gettext import bindtextdomain, textdomain
> from os import sep
> from locale import setlocale, LC_ALL
> 
> LOCALE_PREFIX = "%susr" % (sep)
> LOCALE_DIR = "%s%sshare%slocale" % ( LOCALE_PREFIX, sep, sep )

If you install into the system locale dir (i.e. the one of the python
prefix), you don't need to bind the text domain; the default search
path should be fine.

> PACKAGE = "deodas"
> 
> setlocale( LC_ALL )

This has no effect: called with a single argument, setlocale only
queries the current setting. To set the locale from the environment, use

setlocale(LC_ALL, "")

OTOH, gettext.py will work even without a setlocale call.
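
A minimal sketch of both points, assuming a current Python 3 and a
throwaway locale directory created only for illustration (the "es"
language and the empty placeholder file are assumptions, not part of
the deodas project). The one-argument setlocale call is a pure query,
and gettext picks the language from the environment variables
LANGUAGE, LC_ALL, LC_MESSAGES and LANG without any setlocale call:

```python
import gettext
import locale
import os
import tempfile

# One-argument form: returns the current setting without changing it.
current = locale.setlocale(locale.LC_ALL)
unchanged = locale.setlocale(locale.LC_ALL)

# Build <locale_dir>/es/LC_MESSAGES/deodas.mo in a temporary directory.
locale_dir = tempfile.mkdtemp()
mo_dir = os.path.join(locale_dir, "es", "LC_MESSAGES")
os.makedirs(mo_dir)
mo_path = os.path.join(mo_dir, "deodas.mo")
# Empty placeholder: in CPython, find() only checks that the file exists.
open(mo_path, "wb").close()

# No setlocale() call is needed; gettext reads the environment itself.
os.environ["LANGUAGE"] = "es"
found = gettext.find("deodas", locale_dir)
print(found)
```

Loading the catalog (rather than just locating it) needs a real .mo
file, of course.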

> The source code looks like 
> module/__init__.py 
> module/etc...
> po/POTFILES.in
> po/es.po
> po/es.mo
[...]
> How do I run the program using a translation file?

You should install the files into
/usr/share/locale/es/LC_MESSAGES/deodas.mo. Alternatively, you could 
install them anywhere else, e.g.

/tmp/es/LC_MESSAGES/deodas.mo

then you should set LOCALE_DIR to /tmp. If you want to use the locale
files right in their source location, you should do

import gettext
trans = gettext.GNUTranslations(open("po/es.mo", "rb"))
_ = trans.gettext
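
Both ways of loading a catalog can be tried end to end with the sketch
below. It assumes a current Python 3. The helper write_mo is
hypothetical: it stands in for msgfmt.py by writing a minimal
little-endian .mo file, so that the example does not depend on any
installed tools. The Spanish string is only an illustrative
translation.

```python
import gettext
import os
import struct
import tempfile


def write_mo(path, messages):
    """Write a minimal GNU .mo file (hypothetical stand-in for msgfmt.py)."""
    # The empty msgid carries the catalog metadata, including the charset.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    entries.update(messages)
    keys = sorted(entries)  # msgids are stored in sorted order
    count = len(keys)

    # Layout: 7-word header, msgid table, msgstr table, then string data.
    id_table_offset = 7 * 4
    str_table_offset = id_table_offset + 8 * count
    data_offset = str_table_offset + 8 * count

    data = b""
    id_table = b""
    for key in keys:
        raw = key.encode("utf-8")
        id_table += struct.pack("<II", len(raw), data_offset + len(data))
        data += raw + b"\0"
    str_table = b""
    for key in keys:
        raw = entries[key].encode("utf-8")
        str_table += struct.pack("<II", len(raw), data_offset + len(data))
        data += raw + b"\0"

    # magic, version, count, table offsets, unused hash table size/offset
    header = struct.pack("<7I", 0x950412DE, 0, count,
                         id_table_offset, str_table_offset, 0, 0)
    with open(path, "wb") as fp:
        fp.write(header + id_table + str_table + data)


# Install the catalog as <locale_dir>/es/LC_MESSAGES/deodas.mo.
locale_dir = tempfile.mkdtemp()
mo_dir = os.path.join(locale_dir, "es", "LC_MESSAGES")
os.makedirs(mo_dir)
mo_path = os.path.join(mo_dir, "deodas.mo")
write_mo(mo_path, {"A sentence to translate.": "Una frase para traducir."})

# Option 1: let gettext search the locale directory for the domain.
trans = gettext.translation("deodas", localedir=locale_dir, languages=["es"])
_ = trans.gettext
translated = _("A sentence to translate.")

# Option 2: open one specific .mo file directly, in binary mode.
with open(mo_path, "rb") as fp:
    direct = gettext.GNUTranslations(fp)
translated_direct = direct.gettext("A sentence to translate.")

print(translated)
print(translated_direct)
```

Strings that are missing from the catalog come back unchanged.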

HTH,
Martin


From jdavid@nuxeo.com  Thu Jan  3 17:41:23 2002
From: jdavid@nuxeo.com (Juan David =?ISO-8859-1?Q?Ib=E1=F1ez?= Palomar)
Date: Thu, 03 Jan 2002 18:41:23 +0100
Subject: [I18n-sig] Normal and unicode strings
Message-ID: <3C3497C3.2040905@nuxeo.com>

Hi all,

I've started to look at Unicode..

There're two types of strings in Python, 'str' and 'unicode'.
I guess there're technical reasons to have two different
classes. Please, could somebody explain me these reasons?
(or tell me where this is documented). Please, keep in mind
that I've never looked at the Python sources and I'm still
quite ignorant about Unicode.

I think that for the user (the Python programmer) it would
be better to have only one class of strings, if possible of
course. Is there any chance that this will be addressed in
future versions of Python? Something similar to the unification
of integers and long integers. I haven't found anything in
the index of PEPs.


Many thanks for your time,
jdavid


From martin@v.loewis.de  Thu Jan  3 22:20:12 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Thu, 3 Jan 2002 23:20:12 +0100
Subject: [I18n-sig] Normal and unicode strings
In-Reply-To: <3C3497C3.2040905@nuxeo.com> (message from Juan David
 =?ISO-8859-1?Q?Ib=E1=F1ez?= Palomar on Thu, 03 Jan 2002 18:41:23	+0100)
References: <3C3497C3.2040905@nuxeo.com>
Message-ID: <200201032220.g03MKCr01551@mira.informatik.hu-berlin.de>

> I've started to look at Unicode..
> 
> There're two types of strings in Python, 'str' and 'unicode'.
> I guess there're technical reasons to have two different
> classes. Please, could somebody explain me these reasons?

Strings, traditionally, have been used for two things:

- byte strings, as you get them when reading from a file or a network
  connection, or interacting with the operating system in a variety of
  other ways, and

- character strings, to represent text - typically intended for the
  eventual display to the user using glyphs in some font.

Notice that both uses of strings are equally important. If you
disagree, just consider how you would do things like bitmaps (GIF
files, JPEG files, video streams) or networking protocols (like HTTP
or NFS) without byte strings.

It turns out that there is no meaningful way to support both
simultaneously. To support bytes properly (including the C API), you
really need the property that each element has 256 values which form a
contiguous block in your computer's memory. To support character
strings properly, you need much more than 256 values. Unicode is an
international standard that associates well-defined meanings with more
than 100,000 of these values, so that all languages can represent all
characters in a single character set.
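
A short sketch of the difference, assuming Python 3, where the two
kinds of string are the types bytes and str (in the Python discussed
here they are str and unicode). The sample name is only an
illustration:

```python
# A character string: six characters, two of them outside ASCII.
text = "Ibáñez"

# The same text as byte strings in two different encodings.
as_utf8 = text.encode("utf-8")
as_latin1 = text.encode("latin-1")

print(len(text), len(as_utf8), len(as_latin1))

# Every element of a byte string is one of 256 values.
largest_byte = max(as_utf8)

# Bytes only become text again once the right encoding is named.
round_trip = as_utf8.decode("utf-8")
misread = as_utf8.decode("latin-1")
```

The byte counts differ per encoding while the character count stays
the same, which is why the two kinds of string cannot share one
representation.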

> Please, keep in mind that I've never looked at the Python sources
> and I'm still quite ignorant about Unicode.

If you really want to get familiar with Unicode, the Python
documentation alone is the wrong place. Please refer to
www.unicode.org; they recommend buying their book, but also have a lot
of introductory material.

> I think that for the user (the Python programmer) it would
> be better to have only one class of strings, if possible of
> course. 

No. The user should always be aware whether what he has is a byte
string or a character string. For byte strings, the type name 'str'
should be used; for character strings, the type named 'unicode' is
good.

> Is there any chance that this will be addressed in future versions
> of Python?

Perhaps, but it is unclear how this could work. Most likely, string
literals would mean "character string", but then people that want to
have byte string literals will complain - even the standard library
uses both byte string literals and character string literals, without
distinguishing between them.

There is a patch on SF proposing a migration strategy: First introduce
the notion of byte string literals (b'HTTP/1.0'), then, years later,
consider changing the meaning of plain strings to mean Unicode.
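
For reference, a sketch of what such literals look like in Python 3,
which did adopt the b'...' prefix and made plain string literals
Unicode. The variable names are only illustrative:

```python
status_line = b"HTTP/1.0 200 OK"   # byte string literal
label = "HTTP/1.0"                 # plain literal: a character string

# Mixing the two kinds is rejected instead of being silently converted.
try:
    status_line.startswith(label)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# An explicit encoding step turns the character string into bytes.
matches = status_line.startswith(label.encode("ascii"))
print(mixed_ok, matches)
```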

Regards,
Martin


From mal@lemburg.com  Fri Jan  4 09:25:28 2002
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 04 Jan 2002 10:25:28 +0100
Subject: [I18n-sig] Normal and unicode strings
References: <3C3497C3.2040905@nuxeo.com> <200201032220.g03MKCr01551@mira.informatik.hu-berlin.de>
Message-ID: <3C357508.6073EB94@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Please, keep in mind that I've never looked at the Python sources
> > and I'm still quite ignorant about Unicode.
> 
> If you really want to get familiar with Unicode, the Python
> documentation alone is the wrong place. Please refer to
> www.unicode.org; they recommend to by their book, but have a lot of
> introductory material also.

You might also want to take a look at the slides I have on
the Python Software pages (see link in sig): I gave a talk
about Unicode and Python at the Bordeaux conference last year.
 
-- 
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                   http://www.egenix.com/files/python/


From jdavid@nuxeo.com  Fri Jan  4 18:58:21 2002
From: jdavid@nuxeo.com (Juan David =?ISO-8859-1?Q?Ib=E1=F1ez?= Palomar)
Date: Fri, 04 Jan 2002 19:58:21 +0100
Subject: [I18n-sig] Normal and unicode strings
References: <3C3497C3.2040905@nuxeo.com> <200201032220.g03MKCr01551@mira.informatik.hu-berlin.de> <3C357508.6073EB94@lemburg.com>
Message-ID: <3C35FB4D.2060108@nuxeo.com>

Many thanks Martin and Marc-Andre for helping me with
this, now I understand it better.

That's all for now. Cheers.


M.-A. Lemburg wrote:

>"Martin v. Loewis" wrote:
>
>>>Please, keep in mind that I've never looked at the Python sources
>>>and I'm still quite ignorant about Unicode.
>>>
>>If you really want to get familiar with Unicode, the Python
>>documentation alone is the wrong place. Please refer to
>>www.unicode.org; they recommend to by their book, but have a lot of
>>introductory material also.
>>
>
>You might also want to take a look at the slides I have on
>the Python Software pages (see link in sig): I gave a talk
>about Unicode and Python at the Bordeaux conference last year.
> 
>

-- 
J. David Ibáñez, Nuxeo.com
Python programmer (http://www.python.org)


From Misha.Wolf@reuters.com  Fri Jan  4 22:06:52 2002
From: Misha.Wolf@reuters.com (Misha.Wolf@reuters.com)
Date: Fri, 04 Jan 2002 22:06:52 +0000
Subject: [I18n-sig] Last Call for Papers - 21st Unicode Conference - May 2002 - Dublin,
 Ireland
Message-ID: <T583fea1582c407b7074a8@reuters.com>

>>>>>>>>>>>>>>>>>>>>>>>  Last Call for Papers!  <<<<<<<<<<<<<<<<<<<<<<<

         Twenty-First International Unicode Conference (IUC21)
        Unicode, Localization and the Web: The Global Connection
                    http://www.unicode.org/iuc/iuc21
                            May 14-17, 2002
                            Dublin, Ireland

>>>>>>>>>>>>>>>>>>>>>>>>>  Just 1 week to go!  <<<<<<<<<<<<<<<<<<<<<<<<

                  Submissions due: January 11, 2002
                  Notification date: February 1, 2002
                 Completed papers due: February 22, 2002
            (in electronic form and camera-ready paper form)

>>>>>>>>>>>>>>>>>>>  Send in your submission now!  <<<<<<<<<<<<<<<<<<<<

The Unicode Standard has become the foundation for all modern text
processing.  It is used on large machines, tiny portable devices, and
for distributed processing across the Internet.  The standard brings
cost-reducing efficiency to international applications and enables the
exchange of text in an ever increasing list of natural languages.

New technologies and innovative Internet applications, as well as the
evolving Unicode Standard, bring new challenges along with their new
capabilities.  This technical conference will explore the opportunities
created by the latest advances and how to leverage them, as well as
potential pitfalls to be aware of, and problem areas that need further
research.

We invite you to submit papers which either define the software of
tomorrow, demonstrate best practice with today's software, or articulate
problems that must be solved before further advances can occur.  Papers
should discuss subjects in the context of Unicode, internationalization
or localization. You can view the programs of previous conferences at:
http://www.unicode.org/unicode/conference/about-conf.html

Conference attendees are generally involved in either the development,
deployment or use of Unicode software or content, or the globalization
of software and the Internet.  They include managers, software
engineers, systems analysts, font designers, graphic designers, content
developers, technical writers, and product marketing personnel.

THEME & TOPICS

Computing with Unicode is the overall theme of the Conference.
Presentations should be geared towards a technical audience.  Topics of
interest include, but are not limited to, the following (within the
context of Unicode, internationalization or localization):

- UTFs: Not enough or too many?
- Security concerns e.g. Avoiding the spoofing of UTF-8 data
- Impact of new encoding standards
- Implementing Unicode: Practical and political hurdles
- Portable devices
- Implementing new features of recent versions of Unicode
- Algorithms (e.g. normalization, collation, bidirectional)
- Programming languages and libraries (Java, Perl, et al)
- The World Wide Web (WWW)
- Search engines
- Library and archival concerns
- Operating systems
- Databases
- Large scale networks
- Government applications
- Evaluations (case studies, usability studies)
- Natural language processing
- Migrating legacy applications
- Cross platform issues
- Printing and imaging
- Optimizing performance of systems and applications
- Testing applications
- XML and Web protocols
- Business models for software development (e.g. Open source)

SESSIONS

The Conference Program will provide a wide range of sessions including:
- Keynote presentations
- Workshops/Tutorials
- Technical presentations
- Panel sessions

All sessions except the Workshops/Tutorials will be of 40 minute
duration.  In some cases, two consecutive 40 minute program slots may be
devoted to a single session.

The Workshops/Tutorials will each last approximately three hours.  They
should be designed to stimulate discussion and participation, using
slides and demonstrations.

PUBLICITY

If your paper is accepted, your details will be included in the
Conference brochure and Web pages and the paper itself will appear on a
Conference CD, with an optional printed book of Conference Proceedings.

CONFERENCE LANGUAGE

The Conference language is English.  All submissions, papers and
presentations should be provided in English.

SUBMISSIONS

Submissions MUST contain:

1. An abstract of 150-250 words, consisting of statement of purpose,
   paper description, and your conclusions or final summary.

2. A brief biography.

3. The details listed below:

   SESSION TITLE:             _________________________________________

                              _________________________________________

   TITLE (eg Dr/Mr/Mrs/Ms):   _________________________________________

   NAME:                      _________________________________________

   JOB TITLE:                 _________________________________________

   ORGANIZATION/AFFILIATION:  _________________________________________

   ORGANIZATION'S WWW URL:    _________________________________________

   OWN WWW URL:               _________________________________________

   ADDRESS FOR PAPER MAIL:    _________________________________________

                              _________________________________________

                              _________________________________________

   TELEPHONE:                 _________________________________________

   FAX:                       _________________________________________

   E-MAIL ADDRESS:            _________________________________________

   TYPE OF SESSION:           [ ] Keynote presentation

                              [ ] Workshop/Tutorial

                              [ ] Technical presentation

                              [ ] Panel

   PANELISTS (if Panel):      _________________________________________

                              _________________________________________

                              _________________________________________

                              _________________________________________

                              _________________________________________

                              _________________________________________

                              _________________________________________

                              _________________________________________

   TARGET AUDIENCE (you may select more than one category):

                              [ ] Content Developers

                              [ ] Font Designers

                              [ ] Graphic Designers

                              [ ] Managers

                              [ ] Marketers

                              [ ] Software Engineers

                              [ ] Systems Analysts

                              [ ] Technical Writers

                              [ ] Others (please specify):

                              _________________________________________

                              _________________________________________

   LEVEL OF SESSION (you may select more than one category):

                              [ ] Beginner

                              [ ] Intermediate

                              [ ] Advanced

Submissions should be sent by e-mail to either of the following
addresses:

   papers@unicode.org

   info@global-conference.com

They should use ASCII, non-compressed text and the following subject
line:

   Proposal for IUC 21

If desired, a copy of the submission may also be sent by post to:

   21st International Unicode Conference
   c/o Global Meeting Services, Inc.
   8949 Lombard Place #416
   San Diego, CA  92122  USA
   Tel: +1 858 638 0206
   Fax: +1 858 638 0504

CONFERENCE PROCEEDINGS

All Conference papers will be published on CD.  Printed proceedings will
be offered as an option.

EXHIBIT OPPORTUNITIES

The Conference will have an Exhibition area for corporations or
individuals who wish to display and promote their products, technology
and/or services.

Every effort will be made to provide maximum exposure and advertising.

Exhibit space is limited.  For further information or to reserve a
place, please contact Global Meeting Services at the above location.

CONFERENCE VENUE

   The Burlington Hotel
   Upper Leeson Street
   Dublin 4
   Ireland

   Tel:  +353 1 660 5222
   Fax:  +353 1 660 8496

THE UNICODE CONSORTIUM

The Unicode Consortium was founded as a non-profit organization in 1991.
It is dedicated to the development, maintenance and promotion of The
Unicode Standard, a worldwide character encoding.  The Unicode Standard
encodes the characters of the world's principal scripts and languages,
and is code-for-code identical to the international standard ISO/IEC
10646.  In addition to cooperating with ISO on the future development of
ISO/IEC 10646, the Consortium is responsible for providing character
properties and algorithms for use in implementations.  Today the
membership base of the Unicode Consortium includes major computer
corporations, software producers, database vendors, research
institutions, international agencies and various user groups.

For further information on the Unicode Standard, visit the Unicode Web
site at http://www.unicode.org or e-mail <info@unicode.org>

                           *  *  *  *  *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc.  Used with permission.


------------------------------------------------------------- ---
        Visit our Internet site at http://www.reuters.com

Any views expressed in this message are those of  the  individual
sender,  except  where  the sender specifically states them to be
the views of Reuters Ltd.


From Misha.Wolf@reuters.com  Mon Jan  7 22:34:44 2002
From: Misha.Wolf@reuters.com (Misha.Wolf@reuters.com)
Date: Mon, 07 Jan 2002 22:34:44 +0000
Subject: [I18n-sig] 20th Unicode Conference, Jan 2002, Washington DC -- Three weeks to go!
Message-ID: <T584f76ca63c407b706534@reuters.com>

>>>>>>>>>>>>>>>>>>>>>>>>  Just 3 weeks to go!  <<<<<<<<<<<<<<<<<<<<<<<<

           Twentieth International Unicode Conference (IUC20)
               Unicode and the Web: The Global Connection
                    http://www.unicode.org/iuc/iuc20
                          January 28-31, 2002
                          Washington, DC, USA

>>>>>>>>>>>>>>>>>>>>>>>>>>>  Register now!  <<<<<<<<<<<<<<<<<<<<<<<<<<<

NEWS

 * Hotel guest room group rate extended to January 10!

 * Early bird registration rate extended to January 18!

 * Visit the Conference Web site ( http://www.unicode.org/iuc/iuc20 )
   to check the updated Conference program and register.  To help you
   choose Conference sessions, we've included abstracts of talks and
   speakers' biographies.

 * The World Wide Web Consortium (W3C) Internationalization Workshop is
   taking place in the same venue, on February 1 -- See the Call for
   Participation ( http://www.w3.org/2002/02/01-i18n-workshop/cfp )

CONFERENCE SPONSORS

   Agfa Monotype Corporation
   Basis Technology Corporation
   Microsoft Corporation
   Netscape Communications
   Oracle Corporation
   Progress Software Corporation
   Reuters Ltd.
   Sun Microsystems, Inc.
   World Bank
   World Wide Web Consortium (W3C)

CONFERENCE VENUE

   Omni Shoreham Hotel
   2500 Calvert Street, NW
   Washington, DC  20008
   USA

   Tel: +1 202 234 0700
   Fax: +1 202 265 7972

GLOBAL COMPUTING SHOWCASE

   Visit the Showcase to find out more about products supporting the
   Unicode Standard, and products and services that can help you
   globalize/localize your software, documentation and Internet
   content.  For details, visit the Conference Web site:
     http://www.unicode.org/iuc/iuc20

   Exhibitors to date include:

   * Agfa/Monotype Corporation
   * Basis Technology Corporation
   * InfoTech
   * Language Technology Research Center
   * Multilingual Computing, Inc.
   * Rasmussen Software, Inc.
   * SymbioSys, Inc.

CONFERENCE MANAGEMENT

   Global Meeting Services Inc.
   8949 Lombard Place #416
   San Diego, CA 92122, USA

   Tel: +1 858 638 0206 (voice)
        +1 858 638 0504 (fax)

   Email: info@global-conference.com
      or: conference@unicode.org

                             *  *  *  *  *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc.  Used with permission.




From martin@v.loewis.de  Tue Jan  8 21:32:18 2002
From: martin@v.loewis.de (Martin v. Loewis)
Date: Tue, 8 Jan 2002 22:32:18 +0100
Subject: [I18n-sig] Bug #500595
Message-ID: <200201082132.g08LWIH02077@mira.informatik.hu-berlin.de>

In 

http://sourceforge.net/tracker/index.php?func=detail&aid=500595&group_id=5470&atid=105470

the submitter reports that gettext.install fails if no catalog is
found. This is undesirable. Instead, it should fall back to installing
a _ function that is the identity mapping.

I propose the following strategy to fix that, both in 2.2.1, and 2.3:

- add a parameter fallback= to gettext.translation which, if set to
  true, returns a NullTranslations instance if no translation can be
  located, instead of raising an exception.

- in gettext.install, call translation() with fallback=1.

What do you think? Would it also be acceptable to reverse the
behaviour, making fallback=1 the default (so that you'll have to
explicitly request the exception, instead of requesting the null
translation)?

I'd also like to implement a per-message fallback mechanism, but that
is certainly for 2.3 (and out of scope of this report).
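
The proposed behaviour can be sketched as follows, assuming a current
Python 3, where gettext.translation accepts a fallback argument. The
empty temporary directory stands in for a system without any
installed catalog:

```python
import gettext
import tempfile

# A locale directory that contains no catalogs at all.
empty_dir = tempfile.mkdtemp()

# Without fallback, a missing catalog raises an exception.
try:
    gettext.translation("deodas", localedir=empty_dir, languages=["es"])
    raised = False
except OSError:
    raised = True

# With fallback, a NullTranslations object is returned instead, and
# its gettext() is the identity mapping.
null = gettext.translation("deodas", localedir=empty_dir,
                           languages=["es"], fallback=True)
message = null.gettext("A sentence to translate.")
print(raised, message)
```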

Regards,
Martin


From Misha.Wolf@reuters.com  Fri Jan 18 19:51:29 2002
From: Misha.Wolf@reuters.com (Misha.Wolf@reuters.com)
Date: Fri, 18 Jan 2002 19:51:29 +0000
Subject: [I18n-sig] 20th Unicode Conference, Jan 2002, Washington DC -- Just 1 week to go!
Message-ID: <T5887908331c407b70717c@reuters.com>

>>>>>>>>>>>>>>>>>>>>>>>>>  Just 1 week to go!  <<<<<<<<<<<<<<<<<<<<<<<<

           Twentieth International Unicode Conference (IUC20)
               Unicode and the Web: The Global Connection
                    http://www.unicode.org/iuc/iuc20
                          January 28-31, 2002
                          Washington, DC, USA

>>>>>>>>>>>>>>>>>>>>>>>>>>>  Register now!  <<<<<<<<<<<<<<<<<<<<<<<<<<<

NEWS

 * Hotel guest rooms still available at the group rate.

 * Visit the Conference Web site ( http://www.unicode.org/iuc/iuc20 )
   to check the updated Conference program and register.  To help you
   choose Conference sessions, we've included abstracts of talks and
   speakers' biographies.

 * The World Wide Web Consortium (W3C) Internationalization Workshop is
   taking place in the same venue, on February 1 -- See the Call for
   Participation ( http://www.w3.org/2002/02/01-i18n-workshop/cfp )

CONFERENCE SPONSORS

   Agfa Monotype Corporation
   Basis Technology Corporation
   Microsoft Corporation
   Netscape Communications
   Oracle Corporation
   Progress Software Corporation
   Reuters Ltd.
   Sun Microsystems, Inc.
   World Bank
   World Wide Web Consortium (W3C)

CONFERENCE VENUE

   Omni Shoreham Hotel
   2500 Calvert Street, NW
   Washington, DC  20008
   USA

   Tel: +1 202 234 0700
   Fax: +1 202 265 7972

GLOBAL COMPUTING SHOWCASE

   Visit the Showcase to find out more about products supporting the
   Unicode Standard, and products and services that can help you
   globalize/localize your software, documentation and Internet
   content.  For details, visit the Conference Web site:
     http://www.unicode.org/iuc/iuc20

   Exhibitors to date include:

   * Agfa/Monotype Corporation
   * Basis Technology Corporation
   * Everlasting Systems Ltd.
   * InfoTech
   * Language Technology Research Center
   * Multilingual Computing, Inc.
   * Rasmussen Software, Inc.
   * SymbioSys, Inc.
   * TRADOS Corporation

CONFERENCE MANAGEMENT

   Global Meeting Services Inc.
   8949 Lombard Place #416
   San Diego, CA 92122, USA

   Tel: +1 858 638 0206 (voice)
        +1 858 638 0504 (fax)

   Email: info@global-conference.com
      or: conference@unicode.org

                             *  *  *  *  *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc.  Used with permission.

