[Mailman-Users] digests don't show unicode well

Mark Sapiro mark at msapiro.net
Tue Mar 9 21:17:39 CET 2010


Con Wieland wrote:
>
>I'm a little out of my realm here. I have a greek list that is  
>experiencing the following issue:
>
>some text in UTF-8 when it is in digest - or maybe any format.  Also,  
>some web addresses are now in non-latin characters.  Any advice for  
>how to deal with this problem?  I know when I sent the latest message  
>in question it was readable from and on a wireless-connected iPod  
>Touch, but on other occasions I've just seen question marks rather  
>than the Greek letters that were sent. An example follows at the end.


The example at the end is from a 'plain' format digest. Messages in a
plain format digest still show their original Content-Type: header,
but their text is coerced to the character set of the list's preferred
language. For English that is us-ascii, and English is probably the
list's language, since Mailman currently has no Greek language
support.
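
To see what that coercion does to Greek text, here is a minimal sketch
in plain Python. It is not Mailman's own code; the helper name coerce
and the sample text are made up for illustration, and it only assumes
that characters the target character set cannot represent are replaced
rather than dropped.

```python
# Illustration only: mimic re-encoding a message body into the
# character set of the list's preferred language.
greek = "Καλημέρα κόσμε"  # sample text, not taken from the list


def coerce(text, charset):
    """Re-encode text into charset, replacing unencodable characters."""
    return text.encode(charset, "replace").decode(charset)


ascii_view = coerce(greek, "us-ascii")  # every Greek letter becomes '?'
utf8_view = coerce(greek, "utf-8")      # survives unchanged

print(ascii_view)
print(utf8_view)
```

The us-ascii result has the same shape as the Subject: line and URL in
the example at the end: the spacing survives, the letters do not.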


>> cannot read any
>> unicode (UTF-8) text that appears in messages. This is
>> almost certainly
>> a problem in the way the listserver handles messages. In
>> particular, it
>> does not seem to respect the "Content-Type" tag in the
>> message header,
>> which is correctly set to utf-8, as you can see from the
>> example below.


In the MIME format digest, each message is a separate MIME part, and
the Content-Type: header from the original message is replicated in
the message part's headers, so this should not be an issue with MIME
digests.
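
As a rough illustration of why the MIME format avoids the problem,
the sketch below builds a one-message multipart/digest with Python's
standard email package and reads it back. It is a simplification and
not Mailman's digest code (a real Mailman MIME digest has more
structure around the messages); the subject lines and body text are
made up.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

body = "Καλημέρα"  # sample Greek text, not taken from the list

# A stand-in for an original post, declared as UTF-8.
original = MIMEText(body, "plain", "utf-8")
original["Subject"] = "sample post"

# Wrap the post, headers and all, as one part of a digest.
digest = MIMEMultipart("digest")
digest["Subject"] = "sample digest"
digest.attach(MIMEMessage(original))

# Parse the digest the way a mail client would and pull the post out.
parsed = message_from_string(digest.as_string())
inner = parsed.get_payload(0).get_payload(0)

inner_charset = inner.get_content_charset()
inner_text = inner.get_payload(decode=True).decode(inner_charset)

print(inner_charset)
print(inner_text)
```

Because the enclosed message carries its own Content-Type: header, the
reader's mail client can decode it with the right character set no
matter what the list's language is.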


>> So far the problem was "just" with being unable to read
>> unicode text
>> written by list contributors. Now, with greek sites
>> beginning to use
>> greek letters as part of their URLs (such as your skai.gr
>> link below),
>> the problem is expanding.
>>
>> Is there a way to make the listserver respect the
>> charset="utf-8"
>> setting?


It does. It knows the message is utf-8 encoded, and it knows the plain
format digest is us-ascii, so it does the best it can, which isn't very
good in this case: Greek characters that have no us-ascii equivalent
come out as question marks.

The users should try subscribing to the MIME digest and see if that
helps. If it does, set the list's default digest format to MIME for
new subscribers.

Other options include changing the character set for English in Mailman
to utf-8, or creating a Greek i18n for Mailman with an appropriate
character set and setting the list's language to Greek.

You can do the former by putting

add_language('en', 'English (USA)', 'utf-8', 'ltr')

in mm_cfg.py, but this may have undesirable side effects, such as
base64 encoding of the plain digest, parts of the MIME digest, and
most Mailman-generated notices from the list. 'iso-8859-1' might be a
better choice in general, but it won't help much with Greek, and you
probably don't want 'iso-8859-7' because this is a global setting that
affects every English-language list on the installation.

If you (or maybe some students) want to do a Greek i18n, I'll help with
the mechanical details.


>> ----- Forwarded message from mgsa-l-request at uci.edu
>> -----
>>
>> Message: 4
>> Date: Sun, 7 Mar 2010 08:41:25 -0800 (PST)
>> From: Roland Moore <rolandmo at pacbell.net>
>> Subject: [MGSA-L] Fwd: [??? ?? ???????? ?????? ??? ?????
>> ????...
>> To: "mgsa-l at uci.edu"
>> <mgsa-l at uci.edu>
>> Message-ID: <862206.6317.qm at web180311.mail.gq1.yahoo.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>> http://www.skai.gr/articles/news/ 
>> views/???????????????????????????????/
>>
>> I was asked to forward the foregoing to the list, to
>> represent another point of view. -Roland
>> -------------- next part --------------
>> An HTML attachment was scrubbed...
>> URL: http://maillists.uci.edu/mailman/public/mgsa-l/attachments/ 
>> 20100307/fb970285/attachment-0001.html



-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


