[Mailman-Users] Re: [Tutor] A question about Mailman soft. [hacking Mailman for fun and profit]

Fri Jul 26 11:22:16 CEST 2002

[Note: I'm CC'ing mailman-users as this might be useful for them.
Hopefully, they'll correct my hack by telling me the right way to do this.
*grin*]

On Fri, 26 Jul 2002, Ares Liu wrote:

> I checked the archived mail on the Mailman list. Someone had discussed
> this question before.

Do you have a link to that archived message?  I'm interested in looking at
this, just for curiosity's sake.

> The reason is that if I use no English words in the Subject line, the
> language code marker will be added in front of "Re:" and the Subject is
> encoded as something like "=?gb2312?B2xxxxxxxx?=".

Yes, it looks like it wraps it in some kind of encoding... utf-8?  I wish
I knew more about Unicode.

> So Mailman surely cannot find any reply keyword, and it adds the
> prefix again.

I think I understand better now.  The problem is that the encoding leaves
many of the characters alone, but transforms the square brackets in:

    '[Tutor]'

to something like:

    '=5BTutor=5D'

I'm guessing this because 0x5B and 0x5D are the ASCII codes for the square
brackets:

###
>>> chr(0x5b)
'['
>>> chr(0x5d)
']'
###

Hmmmm.  Wait.  I've seen these characters before.  Is this MIME's
'quoted-printable' encoding?  It's often used to represent non-ASCII
language strings in email because almost all systems can handle it.

###
>>> import StringIO, mimetools
>>> def mydecode(s):
...     outputfile = StringIO.StringIO()
...     mimetools.decode(StringIO.StringIO(s), outputfile,
...                      'quoted-printable')
...     return outputfile.getvalue()
...
>>> mydecode('=5BTutor=5D')
'[Tutor]'
###

Ah ha!  It looks like it!  Good!
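
For anyone who wants to repeat the experiment without mimetools, here is a
minimal sketch using the standard quopri module instead.  It assumes a
recent Python where quopri.decodestring() takes and returns bytes, and the
helper name qp_decode is just something I made up for illustration:

```python
import quopri


def qp_decode(data):
    """Decode quoted-printable bytes (hypothetical helper name)."""
    # quopri.decodestring() turns '=XX' hex escapes back into raw bytes.
    return quopri.decodestring(data)


print(qp_decode(b"=5BTutor=5D"))  # prints b'[Tutor]'
```

Same result as above, just as bytes rather than a plain string.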

In this case, maybe we can extend that check in
Handlers.CookHeaders.process() to take this particular encoding into
consideration: if we decode the header back to normal, then the prefix
check will work.

If you're feeling adventurous, and if you're comfortable editing Python,
you can add this file, 'quoted_printable_decoder.py' in the
'Mailman/Handlers/' directory of Mailman:

######
## quoted_printable_decoder.py

import StringIO, mimetools


def decode_quoted_printable(s):
    """Given a mime 'quoted-printable' string s, returns its decoding.
    If anything bad happens, returns s."""
    try:
        outputfile = StringIO.StringIO()
        mimetools.decode(StringIO.StringIO(s), outputfile,
                         'quoted-printable')
        return outputfile.getvalue()
    except Exception:
        # Deliberately broad: on any decoding problem, leave s untouched.
        return s
###

This new module will convert the header and change all the '=5B' and '=5D'
sequences back into square brackets if it can do so safely.  We'll be
using it in a moment.

Once you've added this module, within the same directory, let's modify
CookHeaders.py to use this function.

And make backups, because I have not tested this yet!  *grin*

Add at the top of the CookHeaders module:

###
from quoted_printable_decoder import decode_quoted_printable
###

so that CookHeaders knows about our new function.  Finally, modify the
check in the CookHeaders.process() function:

###
        elif prefix and not re.search(re.escape(prefix), subject, re.I):
###

into:

###
        elif prefix\
             and not re.search(re.escape(prefix), subject, re.I)\
             and not re.search(re.escape(prefix),
                               decode_quoted_printable(subject), re.I):
###

I've modified the logic to include the prefix check on the decoded subject
header.  Ares, if this works, I'll send the patch over to the Mailman
folks.  Who knows; it might be useful for someone else out there.  *grin*
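
Since the patch itself is untested, the combined check can at least be
tried outside Mailman first.  Below is a standalone sketch of the same
logic, written against the standard quopri module on a recent Python
rather than against the module above; decode_qp and subject_has_prefix
are names I invented for this example, and the real code in
CookHeaders.process() may differ in its details:

```python
import quopri
import re


def decode_qp(text):
    """Best-effort quoted-printable decoding; returns text on failure."""
    try:
        raw = quopri.decodestring(text.encode("ascii"))
        return raw.decode("latin-1")
    except (UnicodeError, ValueError):
        # Non-ASCII input or a decoding problem: leave the text alone.
        return text


def subject_has_prefix(prefix, subject):
    """True if prefix occurs in the raw or the decoded subject."""
    pattern = re.escape(prefix)
    if re.search(pattern, subject, re.I):
        return True
    return re.search(pattern, decode_qp(subject), re.I) is not None


for subject in ("Re: [Tutor] question",
                "Re: =5BTutor=5D question",
                "Re: question"):
    # The prefix should only be added when this prints False.
    print(subject, subject_has_prefix("[Tutor]", subject))
```

The first two subjects should report True (so no second prefix gets
added), and the last one False.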

Best of wishes to you!