[Mailman-Users] Dealing with multiple charsets (list messages and web archive)

Stefan Förster cite at incertum.net
Mon May 12 13:16:25 CEST 2008


Hello Mark,

if you feel this is getting too technical for mailman-users, please
let me know.

I'd also like to discuss "chunkify" in SMTPDirect - is the developers
list the right place for that?

* Mark Sapiro <mark at msapiro.net> wrote:
> Stefan Förster wrote:
>> What kind of signatures do you mean?
> PGP and other signed mail. Domain keys and DKIM.

While the messages I send to _this_ list do contain valid DomainKeys
and DKIM signatures, the mail delivered back to me doesn't verify, for
obvious reasons. My servers remove DK(IM) headers at the MTA level.
Sorry, I don't know how to describe this properly in English: the
resulting queue file which gets passed to Mailman never contains DK(IM)
headers. When Mailman injects a message back into the MTA, the MTA adds
those headers after the whole message has been received. I don't know
if this makes any sense to you, but I don't see how flatten.py might
break that.

As for PGP or S/MIME signatures, you are right, though.

> The handler could come anywhere between MimeDel and ToDigest, but
> between MimeDel and Scrubber may make the most sense.

I placed it there (the description in the FAQ is wonderfully accurate
and easy to follow) and ran some preliminary tests which looked not
too bad so far.
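
For anyone following along, the placement can be sketched roughly as
below. This is only an illustration: the list is an abbreviated stand-in
for Mailman's handler pipeline (the real one is longer), and
insert_before is a hypothetical helper, not part of Mailman.

```python
# Abbreviated stand-in for Mailman's list of handler module names.
# Assumption: only the relative order of MimeDel, Scrubber and ToDigest
# matters for this illustration.
GLOBAL_PIPELINE = ['MimeDel', 'Scrubber', 'CookHeaders', 'ToDigest', 'ToOutgoing']


def insert_before(pipeline, new_handler, before):
    """Return a copy of pipeline with new_handler placed just before `before`."""
    result = list(pipeline)
    result.insert(result.index(before), new_handler)
    return result


# Put the custom handler (module name 'flatten') between MimeDel and Scrubber.
pipeline = insert_before(GLOBAL_PIPELINE, 'flatten', 'Scrubber')
print(pipeline)
```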

>> If a .txt file without encoding is attached, it is always a matter of
>> luck whether the receiver will be able to read the file. I'd say
>> "gzip it". Really.
> 
> So if I understand you correctly, you could assume per standards that
> any text/plain part without a charset is us-ascii (or any other
> particular charset). This could be accomplished by changing
> 
>         if part.get_content_type() == 'text/plain' and part.get_content_charset():
> 
> to
> 
>         if part.get_content_type() == 'text/plain':
> 
> and
> 
>             cset = part.get_content_charset()
> 
> to
> 
>             cset = part.get_content_charset('us-ascii')
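
The effect of that default argument can be seen in a minimal sketch
using the email package from the Python 3 standard library. The sample
message is invented for illustration, and this is not the flatten.py
code itself.

```python
from email import message_from_string

# A text/plain part that declares no charset parameter.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
)
part = message_from_string(raw)

# Without a fallback the method returns None, so the original
# "and part.get_content_charset()" test would skip this part.
without_default = part.get_content_charset()

# With a fallback, the part is treated as us-ascii.
with_default = part.get_content_charset('us-ascii')

print(without_default, with_default)
```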

Actually, I was serious about gzip'ing those files. I implemented it
that way. After all, if a MUA is broken enough to not declare a
content type, I won't trust it with displaying its own attachments,
either.
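
A rough sketch of the gzip idea, written against the Python 3 email
package and not the code actually running here: gzip_if_no_charset,
the fallback file name and the sample message are all hypothetical.

```python
import gzip
from email import message_from_bytes
from email.mime.application import MIMEApplication


def gzip_if_no_charset(part):
    """Return a gzip attachment for a text/plain part lacking a charset."""
    if part.get_content_type() != 'text/plain' or part.get_content_charset():
        return part  # charset declared, or not text/plain: leave untouched
    raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
    name = (part.get_filename() or 'attachment.txt') + '.gz'
    packed = MIMEApplication(gzip.compress(raw), 'gzip')
    packed.add_header('Content-Disposition', 'attachment', filename=name)
    return packed


# Invented sample: an attached .txt file with 8-bit bytes and no charset.
original = message_from_bytes(
    b"Content-Type: text/plain\n"
    b"Content-Disposition: attachment; filename=\"notes.txt\"\n"
    b"\n"
    b"Gr\xfc\xdfe\n"
)
packed = gzip_if_no_charset(original)
print(packed.get_content_type(), packed.get_filename())
```

The bytes are compressed as they arrived, so no guess about the charset
is ever made; the recipient unpacks the file and decides for himself.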

>> After rebuilding the text parts, could we call "decorate" on the
>> message before we attach any other parts?
> 
> 
> That's a bit tricky. If you were to do this, then after calling
> Decorate.process, you would need to set
> 
>     msgdata['nodecorate'] = True

Yes, I've done that - the code in SMTPDirect makes it perfectly clear
how one has to use Decorate ;)
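
The pattern boils down to something like the following sketch. Note
that decorate here is a simplified stand-in written for this example,
not Mailman's Decorate.process, and flatten_and_decorate is a
hypothetical name; only the msgdata['nodecorate'] flag comes from the
discussion above.

```python
def decorate(msg_text, msgdata, footer):
    """Stand-in for a decorating handler: append a footer unless told not to."""
    if msgdata.get('nodecorate'):
        return msg_text  # already decorated earlier: do nothing
    return msg_text + footer


def flatten_and_decorate(msg_text, msgdata, footer):
    """Hypothetical handler step: decorate once, then mark the message."""
    decorated = decorate(msg_text, msgdata, footer)
    msgdata['nodecorate'] = True  # any later decorate call becomes a no-op
    return decorated


msgdata = {}
footer = "-- \nlist footer\n"
once = flatten_and_decorate("body\n", msgdata, footer)
twice = decorate(once, msgdata, footer)  # simulates the later call at delivery
print(twice)
```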

> so that when Decorate is called again by SMTPDirect, it will just
> return. Also, if you are going to call Decorate from this handler, you
> have a dilemma regarding digests. If you call this handler before
> ToDigest, then every message in the digest is decorated with
> msg_header and msg_footer in addition to the digest itself being
> decorated with digest_header and digest_footer. Of course, plain
> digest messages are scrubbed anyway, so if you do defer this handler
> until after ToDigest, you only have to be concerned about the MIME
> digest.
> 
> You also won't be able to have any personalized substitutions in
> msg_header or msg_footer because at this point, you aren't decorating
> individual recipients' messages.
> 
> The biggest problem may be that as flatten.py is written, there is no
> point at which msg is the plain text message without attachments. You
> would have to create a text/plain message without the attached parts,
> pass that message to decorate and then add the other parts to the
> decorated message. Or possibly easier, you could call Decorate at the
> beginning before doing anything else, and then flatten the decorated
> message.

I will test calling flatten.py and Decorate.process just before
ToOutgoing.

Anyway, I tested three mainstream MUAs (Apple Mail, Thunderbird
and the web interface of a popular freemailer), and interestingly
enough, those programs DID display a message with an added attachment
correctly. That is, they showed the text/plain part of the message
and, without a visible boundary, the footer, while making any, say,
application/pdf parts accessible via the "Get attachment"
functionality - IF the text/plain part and the footer had the same
encoding.
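
A message of that general shape can be built with the Python 3 email
package as sketched below. The part order, the file name and the
placeholder PDF bytes are assumptions made for illustration; the exact
messages used in the tests are not reproduced here.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

outer = MIMEMultipart('mixed')

# Body and footer are separate text/plain parts sharing one charset.
body = MIMEText('Message text.\n', 'plain', 'utf-8')
footer = MIMEText('-- \nlist footer\n', 'plain', 'utf-8')
footer.add_header('Content-Disposition', 'inline')

# Placeholder bytes only; this is not a valid PDF document.
pdf = MIMEApplication(b'%PDF-1.4 placeholder', 'pdf')
pdf.add_header('Content-Disposition', 'attachment', filename='doc.pdf')

for part in (body, pdf, footer):
    outer.attach(part)

types = [p.get_content_type() for p in outer.get_payload()]
print(types)
```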

Let me add - and this is in no way directed at you, this list or
Mailman (developers) and not meant to be an insult of any kind - that
the current state of electronic mail is really a mess as far as
charsets or binary contents are concerned.

I do greatly appreciate the time and effort you put into helping me,
thank you very much.


Cheers
Stefan
-- 
Stefan Förster     http://www.incertum.net/     Public Key: 0xBBE2A9E9
Only those who fire blindly in the dark always hit the black.

