[Mailman-Users] mails send from a webmail system dont get a Subject_prefix if there are umlauts

Mark Sapiro msapiro at value.net
Thu Dec 7 21:30:05 CET 2006


Mark Sapiro wrote:
> 
> I am able to duplicate the problem and will look at it further, but the 
> real solution is to use MUAs that create standards conformant messages
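
For comparison, here is a rough sketch of what a conformant subject looks like once the non-ASCII characters are RFC 2047 encoded. It uses Python 3's standard email.header module (Mailman 2.1 itself runs on Python 2), and the subject text is a made-up example, not taken from the original report.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject containing umlauts, as a webmail user might type it.
subject = "Grüße aus Köln"

# A conformant MUA sends the header as RFC 2047 encoded words,
# so only ASCII characters appear on the wire.
encoded = Header(subject, "iso-8859-1").encode()

# The receiving side can recover the charset and the original text.
roundtrip = str(make_header(decode_header(encoded)))

print(encoded)
print(roundtrip)
```

A header in this form carries its own charset, so the list software does not have to guess how to decode it.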


If you are unable to convince people to use compliant MUAs, you can 
patch Mailman in a couple of different ways to address this problem.

If you are willing to have the non-ASCII characters replaced by '?' in 
the subject, you can patch Mailman/Handlers/CookHeaders.py as follows 
(line numbers are for Mailman 2.1.9):

--- Copy of CookHeaders.py      2006-09-25 11:56:24.265625000 -0700
+++ CookHeaders.py      2006-12-07 12:13:10.011544900 -0800
@@ -257,7 +257,7 @@
      # range.  It is safe to use unicode string when manupilating header
      # contents with re module.  It would be best to return unicode in
      # ch_oneline() but here is temporary solution.
-    subject = unicode(subject, cset)
+    subject = unicode(subject, cset, 'replace')
      # If the subject_prefix contains '%d', it is replaced with the
      # mailing list sequential number.  Sequential number format allows
      # '%d' or '%05d' like pattern.
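
The effect of the 'replace' argument can be sketched outside Mailman. The example below uses Python 3's bytes.decode(), which I am assuming behaves like the Python 2 unicode(subject, cset, 'replace') call in the patch; the subject bytes are a made-up example.

```python
# Hypothetical raw subject: iso-8859-1 bytes with no RFC 2047 encoding,
# which Mailman would treat as us-ascii.
raw_subject = "Grüße aus Köln".encode("iso-8859-1")

# Strict decoding (the unpatched behaviour) fails on the first non-ASCII byte.
try:
    raw_subject.decode("us-ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With 'replace', each undecodable byte becomes U+FFFD instead of raising.
replaced = raw_subject.decode("us-ascii", "replace")

# Encoding back to us-ascii with 'replace' turns U+FFFD into '?'.
cooked = replaced.encode("us-ascii", "replace")

print(strict_failed)
print(cooked)
```

The decode no longer raises, at the cost of losing the original characters.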

In your case, since the unencoded characters in the subject are probably 
iso-8859-1 characters, you could instead (or in addition) apply the 
following patch (watch for the wrapped comment line below):

--- Copy of CookHeaders.py      2006-09-25 11:56:24.265625000 -0700
+++ CookHeaders.py      2006-12-07 12:13:10.011544900 -0800
@@ -337,7 +337,7 @@
          # MUA deliberately add trailing spaces when composing return
          # message.
          d = [(s.rstrip(), c) for (s,c) in d]
-        cset = 'us-ascii'
+        cset = 'iso-8859-1'
          for x in d:
              # search for no-None charset
              if x[1]:
@@ -349,4 +349,4 @@
          return oneline.encode(cset, 'replace'), cset
      except (LookupError, UnicodeError, ValueError, HeaderParseError):
          # possibly charset problem. return with undecoded string in 
one line.
-        return ''.join(headerstr.splitlines()), 'us-ascii'
+        return ''.join(headerstr.splitlines()), 'iso-8859-1'
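
As a rough illustration of why defaulting to iso-8859-1 avoids the decode error, and what it costs: every byte value is a valid iso-8859-1 character, so decoding cannot fail, but bytes that are really in another charset are silently misread. This is a Python 3 sketch with made-up subject text and a hypothetical list prefix, not code from Mailman.

```python
# Every one of the 256 byte values decodes under iso-8859-1.
all_bytes = bytes(range(256))
decoded_all = all_bytes.decode("iso-8859-1")

# Hypothetical unencoded subject that really is iso-8859-1.
raw_subject = "Grüße aus Köln".encode("iso-8859-1")
subject = raw_subject.decode("iso-8859-1")
prefixed = "[Test-List] " + subject  # hypothetical subject_prefix

# Caveat: unencoded UTF-8 bytes also decode without error, but wrongly.
utf8_raw = "Köln".encode("utf-8")
misread = utf8_raw.decode("iso-8859-1")

print(prefixed)
print(misread)
```

So this default only suits lists whose unencoded subjects really are iso-8859-1.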



-- 
Mark Sapiro <msapiro at value.net>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

