[Mailman-Developers] Mailman CVS sends out Japanese template mails in EUC-JP

Ben Gertzfield che@debian.org
Tue, 11 Sep 2001 17:40:06 +0900


I've come to the same understanding as Barry as far as this language
stuff goes.  We can actually do this without changing the message's
encoding right when a message comes in.

A message is displayed via the web in only two cases: when it is held
for approval for some reason or another, or when it is shown by HyperArch.

So, we need to change Cgi/admindb.py to properly Web-ify the headers
and body.  Currently it displays the message verbatim, which is fine
unless the headers use a charset that must be quoted-printable or
base64 encoded, or the message charset doesn't match the web page's
charset.
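
As a rough illustration of the Web-ifying step for a header, here is a
minimal sketch using the present-day Python standard library
(email.header and html) rather than the 2001 Mailman code.  The name
webify_header and the sample subject are made up for the example; a
real change to Cgi/admindb.py would also have to deal with the body
and with the charset of the page itself.

```python
import base64
from email.header import decode_header
from html import escape


def webify_header(raw_value, fallback="us-ascii"):
    """Decode RFC 2047 encoded words, then escape the text for HTML.

    Hypothetical helper, not part of Mailman or mimelib.
    """
    pieces = []
    for chunk, charset in decode_header(raw_value):
        if isinstance(chunk, bytes):
            # When the header contains encoded words, every chunk comes
            # back as bytes; unencoded runs have charset None.
            chunk = chunk.decode(charset or fallback, errors="replace")
        pieces.append(chunk)
    return escape("".join(pieces))


# Build a sample iso-2022-jp subject (three kanji) as a base64 encoded word.
encoded = base64.b64encode(
    "\u65e5\u672c\u8a9e".encode("iso-2022-jp")).decode("ascii")
subject = "=?iso-2022-jp?B?%s?= <list>" % encoded

print(webify_header(subject))
```

The result is a Unicode string, so the caller still has to encode it
in whatever charset the web page declares.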

Next, we need to change pipermail and/or HyperArch to check if the
charset in the decoded subject/author headers is iso-2022-jp; if so,
we need to use kconv to convert it to EUC-JP.  Similarly, we need to
check the body, even if it's multi-part, and convert to EUC-JP the
parts that are iso-2022-jp.
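
The conversion step might look something like this today.  It is only
a sketch: it uses Python's built-in iso-2022-jp and euc_jp codecs in
place of kconv, and the standard email package in place of mimelib.
parts_as_euc_jp and the sample message are hypothetical.

```python
from email import message_from_bytes


def parts_as_euc_jp(msg):
    """Yield EUC-JP bytes for each iso-2022-jp text part; skip the rest."""
    for part in msg.walk():  # walk() also descends into nested multiparts
        if part.get_content_maintype() != "text":
            continue
        if (part.get_content_charset() or "").lower() != "iso-2022-jp":
            continue
        # decode=True undoes any base64 / quoted-printable transfer encoding
        raw = part.get_payload(decode=True)
        yield raw.decode("iso-2022-jp").encode("euc_jp")


# A tiny single-part sample message; iso-2022-jp is a 7-bit encoding.
raw_message = (
    b"Content-Type: text/plain; charset=iso-2022-jp\r\n"
    b"Content-Transfer-Encoding: 7bit\r\n"
    b"\r\n"
    + "\u65e5\u672c\u8a9e".encode("iso-2022-jp")
    + b"\r\n"
)

msg = message_from_bytes(raw_message)
converted = list(parts_as_euc_jp(msg))
print(converted)
```

Going through Unicode in the middle keeps the sketch short; a direct
iso-2022-jp to EUC-JP converter such as kconv would avoid that detour.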

Looking around, I found the very useful EncWord.py module, which for
some strange reason was in Mailman, not in mimelib.  It was probably
written before mimelib was, but the functionality should certainly be
moved there, as it's part of the MIME standard and will be needed for
anyone doing mail outside the English world.

Barry, will you think about moving EncWord into mimelib?  It'd be much
more at home there. :) It could be moved whole-hog, and the only
file that would need an s/Mailman.EncWord/mimelib.EncWord/ would be
Archivers/HyperArch.py, as far as I can see.

Here's a patch to add a new function to mimelib.Message.  It returns
the charsets of each of the text/* parts in a message, and None for
each part that is not text or does not have a charset.  This will make
converting the iso-2022-jp parts much easier.

This patch is against mimelib CVS.

Index: Message.py
===================================================================
RCS file: /cvsroot/mimelib/mimelib/mimelib/Message.py,v
retrieving revision 1.13
diff -u -r1.13 Message.py
--- Message.py  2001/05/04 18:47:22     1.13
+++ Message.py  2001/09/11 08:37:28
@@ -272,3 +272,31 @@
             if name.lower() == param:
                 return address.unquote(val)
         return failobj
+
+    def getcharsets(self, default=None):
+        """Return a list containing the charset[s] used in a message.
+
+        Returns a list containing one element for each part of the
+        message; will return a list of one element if the message is not
+        a multipart message.
+
+        Each element will either be a string (the charset in the
+        Content-Type of that part) or the value of the 'default'
+        parameter (defaults to None), if the part is not a text part
+        or the charset is not defined.
+        """
+        result = []
+        
+        if self.ismultipart():
+            for p in self.get_payload():
+                if p.getmaintype() == "text":
+                    result.append(p.getparam("charset", default))
+                else:
+                    result.append(default)
+        else:
+            if self.getmaintype() == "text":
+                result.append(self.getparam("charset", default))
+            else:
+                result.append(default)
+
+        return result
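
For comparison, here is a sketch of the same idea written against
today's standard email package, not mimelib.  text_charsets is a
hypothetical helper.  Unlike the patch above, which looks only at the
direct children of a multipart message, it uses walk() so that nested
multiparts are covered as well.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def text_charsets(msg, default=None):
    """Return one entry per leaf part: its charset if text/*, else default."""
    result = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body text of their own
        if part.get_content_maintype() == "text":
            # A text part with no charset parameter also yields `default`.
            result.append(part.get_content_charset(default))
        else:
            result.append(default)
    return result


# Sample message: one iso-2022-jp text part and one binary attachment.
outer = MIMEMultipart()
outer.attach(MIMEText("\u65e5\u672c\u8a9e", "plain", "iso-2022-jp"))
outer.attach(MIMEApplication(b"\x00\x01"))

print(text_charsets(outer))
```

The standard library's Message.get_charsets(failobj=None) does
something similar, but it also emits an entry for each multipart
container it walks through.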


-- 
Brought to you by the letters S and T and the number 17.
"A baloo is a bear."
Debian GNU/Linux maintainer of Gimp and GTK+ -- http://www.debian.org/