[Mailman-i18n] HTML entities (&eacute;) in es, it, no translations

Ben Gertzfield che@debian.org
Thu, 31 Jan 2002 18:02:38 +0900


Working on sending MIME emails through Mailman, I noticed that some
of the translations are inconsistent in how they use HTML entity
escapes.

This becomes a problem when sending email.  An example from the
Spanish translation:

#: Mailman/Cgi/create.py:221 bin/newlist:204
msgid "Your new mailing list: %(listname)s"
msgstr "Su nueva lista de distribuci&oacute;n: %(listname)s"

This is a real problem, because this string is sent literally --
with the string "&oacute;" -- as the subject of the new email
message.
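
As a rough illustration (a sketch, not Mailman's actual code), this is
what happens when a catalog entry containing an entity escape is used
as a subject line.  The list name "test-list" is made up, and
html.unescape comes from the current Python standard library, not from
anything Mailman itself calls:

```python
import html

# A msgstr as it appears in the Spanish catalog, entity escape included.
msgstr = "Su nueva lista de distribuci&oacute;n: %(listname)s"

# Hypothetical list name, for illustration only.
subject = msgstr % {"listname": "test-list"}

# Plain-text mail does not interpret entities, so the reader sees them raw.
print(subject)

# What the reader should have seen instead.
print(html.unescape(subject))
```

The first line printed still contains the escape; only the second has
the accented character.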

I looked in the HTML 4.01 standard and found that HTML entities are
actually only intended to be used when the document's character set
does not support that particular character.

http://www.w3.org/TR/html401/charset.html has more information on
this.

Since Mailman's CGI interface (in almost all cases) sends the correct
charset in the Content-Type header, I think it's not necessary to use
HTML entity escapes in the gettext catalog files.  In fact, when we do
use escapes, it makes text emails generated by Mailman illegible.

Does anyone have any comments?  I would like to go through the
catalogs and change the HTML escapes back into the original
characters, so that emails Mailman generates are correct again. The
CGI interface will still work as before.
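
The conversion itself could look something like the following sketch.
It assumes that the markup-significant entities should stay escaped;
the set below adds &amp; and &quot; to the three named in this message,
which is my assumption.  The function name and the sample line are
invented for the example:

```python
import re
from html.entities import name2codepoint

# Entities that must stay escaped because they carry meaning in HTML.
# (Assumption: &amp; and &quot; are kept in addition to &lt; &gt; &nbsp;.)
KEEP = {"lt", "gt", "amp", "quot", "nbsp"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def unescape_text_entities(text, keep=KEEP):
    """Replace named character entities with the characters they stand for."""

    def replace(match):
        name = match.group(1)
        if name in keep or name not in name2codepoint:
            # Leave markup-significant or unknown names untouched.
            return match.group(0)
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(replace, text)


# Invented sample line mixing a letter entity with markup entities.
line = 'msgstr "Su nueva lista de distribuci&oacute;n: &lt;%(listname)s&gt;"'
print(unescape_text_entities(line))
```

The converted text would then have to be written back in whatever
charset each catalog's own header declares, so that the CGI pages and
the emails agree on the encoding.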

Here is a first guess at which translations include HTML escapes
besides &lt; &gt; and &nbsp; :

[ben@nausicaa:~/src/mailman/mailman/messages]% egrep '&[^;]+;' **/*.po | egrep -v '&nbsp;|&lt;|&gt;' | cut -d : -f 1 | uniq

es/LC_MESSAGES/mailman.po
it/LC_MESSAGES/mailman.po
no/LC_MESSAGES/mailman.po

So, the changes would only actually apply to the Spanish, Italian, and
Norwegian translations.  The rest of the translations already use the
real characters in their own character sets.
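
For anyone without a shell that expands **, roughly the same search
can be sketched in Python.  The function names are invented here, and
reading every catalog as latin-1 is only an assumption made so that no
file fails to decode:

```python
import re
from pathlib import Path

ENTITY_RE = re.compile(r"&[^;]+;")
IGNORED_RE = re.compile(r"&nbsp;|&lt;|&gt;")


def line_matches(line):
    """Mirror the pipeline: an entity-like match and no ignored entity."""
    return bool(ENTITY_RE.search(line)) and not IGNORED_RE.search(line)


def catalogs_with_entities(root):
    """Return the .po files under root that contain at least one matching line."""
    hits = []
    for path in sorted(Path(root).rglob("*.po")):
        # latin-1 maps every byte to a character, so decoding never raises.
        text = path.read_text(encoding="latin-1")
        if any(line_matches(line) for line in text.splitlines()):
            hits.append(path)
    return hits
```

Calling catalogs_with_entities(".") from the messages directory would
list the matching catalogs.  Like the shell pipeline, this skips any
line that contains one of the three ignored entities, even if another
escape appears on the same line, so it remains only a first guess.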

Ben

-- 
Brought to you by the letters H and G and the number 18.
"To Perl, or not to Perl, that is the kvetching."
Debian GNU/Linux maintainer of Gimp and Nethack -- http://www.debian.org/