[Mailman-Developers] Mailman CVS sends out Japanese template mails in EUC-JP

Sat, 8 Sep 2001 00:28:35 -0400

>>>>> "BG" == Ben Gertzfield <che@debian.org> writes:

    BG> Looking at this problem some more, it seems that Mailman has
    BG> changed how system notification mails are sent quite a lot
    BG> between 2.0 and CVS.

Indeed! :)

    BG> Now they're inserted into a "virgin queue", which seems to not
    BG> go through the normal pipeline.

This is true, but VirginRunner will do a little processing on the
message and then drop it in the outgoing queue.  From there the
OutgoingRunner will attempt to deliver it via the DELIVERY_MODULE.  At
least, that's how it's supposed to work!

    BG> But I'd think at least the Content-Type: header needs to be
    BG> modified, as well as converting the contents of the European
    BG> internationalized mails to quoted-printable, before sending
    BG> out these kinds of messages.

    BG> How should we approach modifying the virgin queue?  I can hack
    BG> in conversion to ISO-2022-JP and adding the headers, but that
    BG> seems wrong somehow.  Maybe have each language supply its own
    BG> special "incoming mail charset conversion", "outgoing mail
    BG> charset conversion", and "header additions" modules?  I know
    BG> Japanese needs to convert incoming mails to EUC before they're
    BG> archived, and back to ISO-2022-JP when they go back out to the
    BG> list.

We only have this problem for messages that Mailman generates, right?
IOW, for messages sent to the list by members, we're adhering to
least-munging principles, so if someone sends a message to the list
all bolluxed up, tough luck.

Under that assumption, here's a strawman design for doing this in an
extensible way:

We extend the VirginRunner pipeline so that just before ToOutgoing,
Mailman will send the message through a language-specific handler
module.  We use the list's default language code to calculate the name
of the handler module.  E.g. for Japanese, we'd use something like
Mailman/Handlers/LanguagePrep_ja.py.

If "LanguagePrep_<code>.py" doesn't exist, we don't do anything.
Otherwise, the module has the same signature as other handler modules
and, of course, can do any necessary message munging.  For Japanese,
this should subsume the ja_to_EUC_JP.py and ja_SMTPDirect.py modules,
right?
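
To make the lookup concrete, here is a rough sketch, not code from
the Mailman tree.  It assumes language handlers expose a
process(mlist, msg, msgdata) function like the existing handler
modules.  The helper names find_language_handler and
run_language_prep, and the package argument, are made up for
illustration, and it uses importlib from current Python.

```python
import importlib


def find_language_handler(lang_code, package="Mailman.Handlers"):
    """Return the process() function of LanguagePrep_<code>, or None.

    Hypothetical helper: the module naming scheme follows the strawman,
    nothing here exists in Mailman itself.
    """
    modname = "%s.LanguagePrep_%s" % (package, lang_code)
    try:
        module = importlib.import_module(modname)
    except ImportError:
        # No handler for this language: the strawman says do nothing.
        # Caveat: this also hides import errors raised inside a handler.
        return None
    return getattr(module, "process", None)


def run_language_prep(mlist, msg, msgdata, lang_code,
                      package="Mailman.Handlers"):
    """Run the language handler if one exists; report whether it ran."""
    handler = find_language_handler(lang_code, package)
    if handler is None:
        return False
    handler(mlist, msg, msgdata)
    return True
```

VirginRunner would call something like run_language_prep() with the
list's default language code just before handing the message to
ToOutgoing.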

Now, for archiving, we'd do something similar.  I could see a couple
of options.  Either we re-use LanguagePrep_<code>.py and call a
different function in the module, or we call the same function and
provide a flag (or more likely a msgdata entry) that tells the module
which direction to do the conversion.
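
As a sketch of the single-function variant, the handler below picks
the target charset from a msgdata entry.  The key name
"langprep_direction" and its two values are assumptions made for this
example, as is the choice of 7bit/8bit transfer encodings.  It only
handles single-part text messages and relies on the euc-jp and
iso-2022-jp codecs that ship with current Python.

```python
from email.message import Message

# Target charset per direction; the direction names are hypothetical.
TARGET_CHARSET = {
    "outgoing": "iso-2022-jp",   # charset for mail leaving Mailman
    "archiving": "euc-jp",       # charset wanted before archiving
}


def process(mlist, msg, msgdata):
    """Transcode a single-part text message in place."""
    direction = msgdata.get("langprep_direction", "outgoing")
    target = TARGET_CHARSET[direction]
    if msg.is_multipart():
        return  # out of scope for this sketch
    source = msg.get_content_charset() or "us-ascii"
    if source == target:
        return
    # decode=True undoes any base64/quoted-printable transfer encoding.
    text = msg.get_payload(decode=True).decode(source)
    del msg["Content-Transfer-Encoding"]
    # ISO-2022-JP is pure 7-bit; EUC-JP uses bytes above 0x7f.
    cte = "7bit" if target == "iso-2022-jp" else "8bit"
    msg["Content-Transfer-Encoding"] = cte
    msg.set_payload(text.encode(target))
    msg.set_param("charset", target)


if __name__ == "__main__":
    m = Message()
    m["Content-Type"] = 'text/plain; charset="euc-jp"'
    m.set_payload("こんにちは".encode("euc-jp"))
    process(None, m, {"langprep_direction": "outgoing"})
    print(m["Content-Type"])
```

In this sketch both directions decode to Unicode and re-encode, so
the only per-direction piece is the target charset.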

Or we use two separate modules, something like
LangPrepOutgoing_<code>.py and LangPrepArchiving_<code>.py.  I'm not
sure what the best solution is because I don't know how much can be
shared between EUC->ISO-2022-JP and ISO-2022-JP->EUC.

Please let me know if you think this will help solve the problem.

-Barry