[Mailman-Developers] Debugging mailman (resolving inline signature attachments)

Thu Sep 28 10:55:14 CEST 2006

Hello there. How would I go about debugging mailman, by --
for example -- polluting the .py libraries with print or
assert statements? If I do that (which seems like a pretty
thoughtless idea), the output would end up .. in /dev/null?
It would probably be better to write to a file instead?
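
A minimal sketch of the write-to-file idea, using only the
standard logging module. The helper name make_debug_logger
and the log path are made up for illustration; nothing here
is part of Mailman itself, and it is written for a current
Python rather than the one sarge ships.

```python
import logging

def make_debug_logger(path, name="mm_debug"):
    """Return a logger that appends timestamped lines to `path`.

    Hypothetical helper, not a Mailman API.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Avoid stacking duplicate handlers if the module is imported twice.
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    # Keep the output out of whatever the root logger does.
    logger.propagate = False
    return logger

# Example use inside a handler (the path is an assumption):
#   debug = make_debug_logger("/tmp/mm_debug.log")
#   debug.debug("lcset=%r mcset=%r", lcset, mcset)
```

The file has to be writable by whichever user the qrunner
processes run as, otherwise creating the handler will fail.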

Right now I'm having a specific problem, but it would be
nice to have some general debugging methods to use in the
future for other problems.

I'm using version 2.1.5-8sarge2 (yes, Debian sarge).

The current problem is that the list msg_footer (that is,
the list signature) is attached inline as a base64-encoded
UTF-8 MIME part, but only in some cases:

Original message: iso8859-15 text/plain.
Result: Mailman distributes the message to the list members
as MIME multipart. The first part is the original iso8859-15
message, and the list footer is appended like this:

 --===============0410419818==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
Content-Disposition: inline

WWVzLCB0aGlzIGlzIG5vdCB0aGUgcmVhbCBsaXN0IHNpZ25hdHVyZSwgeW91IGhheG9yIG
NyYXhvcg==

 --===============0410419818==--

The same happens for charset 'us-ascii' and 'iso8859-1',
which were some examples I could find now after a quick
look. When the original message is UTF-8 text/plain,
however, the body and signature are merged together as they
should be. The footer contains *only* standard 7-bit ascii
values.. no 8-bit stuff or strangeness.
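
The 7-bit claim can be checked against the base64 payload
quoted above. This is a small sketch with the standard
base64 module; the helper name is_seven_bit is made up for
illustration.

```python
import base64

# Payload copied from the footer part quoted above.
FOOTER_B64 = (
    "WWVzLCB0aGlzIGlzIG5vdCB0aGUgcmVhbCBsaXN0IHNpZ25hdHVyZSwgeW91IGhheG9yIG"
    "NyYXhvcg=="
)

def is_seven_bit(data):
    """True if every byte is below 128, i.e. plain 7-bit ASCII."""
    return all(byte < 128 for byte in bytearray(data))

footer = base64.b64decode(FOOTER_B64)
print(is_seven_bit(footer))
```

So the base64 encoding is not forced by the footer content
itself; it seems to come from the part being labelled utf-8.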

Now, mailman determines this footer business in
Handlers/Decorate.py on lines 80 to 110 (roughly).  It calls
the following to determine the list charset:

  lcset = Utils.GetCharSet(mlist.preferred_language)

GetCharSet consults mm_cfg.LC_DESCRIPTIONS (mm_cfg imports
Defaults.py), where -- since my lists have lang set to
'en' -- 'us-ascii' is returned. I also tried mapping 'en' to
'utf-8', without success.

This is the snippet that determines if the message should be
multipart or not (linenums included):

86 if not msg.is_multipart() and msgtype == 'text/plain' and \
87   msg.get('content-transfer-encoding', '').lower() <> 'base64' and \
88   (lcset == 'us-ascii' or mcset == lcset):

The incoming messages from users are:
- not multipart
- content type is text/plain
- content-transfer-encoding is mostly '7bit' (not base64)
- lcset *should* be 'us-ascii', but it appears it might not be
- mcset (message charset) is what the user used, normally
  iso8859-1, iso8859-15 or utf-8.
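
To see which clause fails, the quoted condition can be split
up and evaluated outside Mailman. The sketch below uses the
standard email package on a current Python; the function
name explain_footer_decision and the sample message are made
up, and how mcset is derived here (the charset parameter,
defaulting to us-ascii) is an assumption, not a copy of
Decorate.py.

```python
from email import message_from_string

# Made-up sample resembling what list members send.
SAMPLE = (
    'Content-Type: text/plain; charset="iso-8859-15"\n'
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "hello list\n"
)

def explain_footer_decision(raw_message, lcset):
    """Evaluate each clause of the quoted condition separately."""
    msg = message_from_string(raw_message)
    # Assumption: mcset is the message's charset parameter,
    # falling back to us-ascii when none is given.
    mcset = msg.get_content_charset("us-ascii")
    cte = msg.get("content-transfer-encoding", "").lower()
    checks = {
        "not_multipart": not msg.is_multipart(),
        "is_text_plain": msg.get_content_type() == "text/plain",
        "not_base64": cte != "base64",
        "charset_ok": lcset == "us-ascii" or mcset == lcset,
    }
    # The footer is merged into the body only if every clause holds.
    checks["merge_inline"] = all(checks.values())
    return mcset, checks

for lcset in ("us-ascii", "utf-8"):
    print(lcset, explain_footer_decision(SAMPLE, lcset))
```

With lcset set to 'us-ascii' every clause holds for such a
message, so if the footer still ends up as a separate part,
either lcset is not 'us-ascii' at that point or one of the
other values differs from what I expect.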

I'd really like to debug that code passage to verify what
variables are screwed up, and which are not.

Any tips?

best regards,
sven