[Mailman-Users] Umlaut in message bodies are omitted

Ben Gertzfield che at debian.org
Fri Jun 13 19:44:24 CEST 2003


Bausch, Jean wrote:

>I have now found who swallows the Umlauts in message bodies:
>
>I had inserted a 'fmt -s -w 80' in the mail aliases. Unfortunately on Solaris 7 this command erases the Umlauts.
>  
>

Interesting.  You can try changing that to

LANG=de_DE fmt -s -w 80

so that Solaris handles the 8-bit characters.  But this assumes that 
every email you put through is in iso-8859-1, which is increasingly 
not the case in today's international world.

If you don't specify a LANG, or specify the wrong one, fmt doesn't know 
what character set the input is in, and if it's a multibyte character 
set (UTF-8, UTF-16, or any Japanese/Chinese/Korean character set) fmt 
might split a character in the middle of its byte sequence when it 
inserts a newline.  So fmt does the "safe" thing and strips all 8-bit 
characters when LANG is not set.
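
To illustrate the splitting problem, here is a small Python sketch. It 
is not what fmt does internally; the sample text and the helper name 
decodes_cleanly are made up for the example. It only shows that a cut 
at an arbitrary byte offset is harmless in iso-8859-1 but can break a 
character in UTF-8.

```python
# Assumed sample text; any string with a non-ASCII character would do.
text = "Grüße aus München"


def decodes_cleanly(chunk: bytes, charset: str) -> bool:
    """Return True if chunk is a complete, valid byte sequence in charset."""
    try:
        chunk.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


utf8 = text.encode("utf-8")

# "ü" is two bytes in UTF-8 (0xC3 0xBC); cut right between them,
# as a wrapper that counts bytes instead of characters might.
cut = utf8.index("ü".encode("utf-8")) + 1
head, tail = utf8[:cut], utf8[cut:]

# In iso-8859-1 every character is one byte, so any cut point is safe.
latin1 = text.encode("iso-8859-1")
latin1_safe = all(
    decodes_cleanly(latin1[:i], "iso-8859-1")
    and decodes_cleanly(latin1[i:], "iso-8859-1")
    for i in range(len(latin1) + 1)
)

print("whole UTF-8 text valid:", decodes_cleanly(utf8, "utf-8"))
print("UTF-8 halves valid:", decodes_cleanly(head, "utf-8"),
      decodes_cleanly(tail, "utf-8"))
print("every iso-8859-1 cut valid:", latin1_safe)
```

Neither half of the UTF-8 text decodes on its own, which is the kind of 
damage a newline inserted at the wrong byte would cause.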

>Hence my next question:
>Is there an option in mailman 2.1.2 to autowrap the messages after 80 characters or so?
>  
>

I don't think there's really a safe way to do this.  It would certainly 
screw up signatures on emails.

MIME requires quoted-printable (and base64) encoded bodies to be 76 
characters or less per line; you could write a Handler (look at 
Mailman/Handlers/Decorate.py for an example in the Mailman source code) 
to take any non-MIME emails and automatically make them a single 
text/plain part encoded with quoted-printable, but you'll need to be 
careful about character set issues for the UTF-8 and CJK users.

The email.Message and email.Charset modules make this really easy; all 
the work is already done for you.
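
A rough sketch of the re-encoding step, under some loud assumptions: it 
uses the Python 3 spellings email.message and email.charset rather than 
the email.Message and email.Charset names above, the function name 
recode_as_quoted_printable is made up, and it simply guesses that a 
non-MIME body is iso-8859-1, which is exactly the character set issue 
mentioned above. Hooking it into a Mailman Handler is left out.

```python
from email import message_from_bytes
from email.charset import QP, Charset
from email.message import Message


def recode_as_quoted_printable(msg: Message,
                               assumed_charset: str = "iso-8859-1") -> Message:
    """Turn a non-MIME message body into quoted-printable text/plain.

    assumed_charset is a guess: a non-MIME message does not say which
    character set its 8-bit bytes are in.
    """
    if msg.is_multipart() or "MIME-Version" in msg:
        return msg  # already MIME; leave it alone

    raw_body = msg.get_payload(decode=True)   # body as bytes
    text = raw_body.decode(assumed_charset)

    charset = Charset(assumed_charset)
    charset.body_encoding = QP                # force quoted-printable

    # Drop any stray headers so set_payload() can add consistent ones.
    del msg["Content-Type"]
    del msg["Content-Transfer-Encoding"]

    # Adds MIME-Version, Content-Type and Content-Transfer-Encoding,
    # and stores the body quoted-printable encoded.
    msg.set_payload(text, charset)
    return msg


# Hypothetical non-MIME message with 8-bit characters and a long line.
original_text = "Grüße! " + "x" * 120 + "\n"
raw = (
    b"From: someone@example.invalid\n"
    b"Subject: test\n"
    b"\n"
    + original_text.encode("iso-8859-1")
)

out = recode_as_quoted_printable(message_from_bytes(raw))
print(out.as_string())
```

The encoded body uses soft line breaks (a trailing "="), so a MIME-aware 
mail reader shows the original long line again; this shortens the lines 
on the wire, it does not rewrap the text the reader sees.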

Ben





