[Mailman-Users] UnicodeDecodeError during Archive Obscuring

Tokio Kikuchi tkikuchi at is.kochi-u.ac.jp
Sun Nov 25 05:07:27 CET 2007


Hi,

> I've seen an increase in the use of Unicode email addresses, which fail
> HyperArch email obscuring and thus cause the message to be shunted and not
> archived.
> 
> Specifically the problem lies in the encoding (strangely the error says
> "Decode") of the text ' at ' (as a substitute for "@")  for Russian
> users of GMail.
> 
> Here is the error in full:

>   File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 579, in
> as_text
>     atmark = unicode(_(' at '), Utils.GetCharSet(self._lang))
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 1:
> ordinal not in range(128)
> 
> Nov 22 04:42:31 2007 (755) SHUNTING: 1195717174.0985031
> +6fdaf61658ca76ec5281e614be7b4b59d1e01bf6
> ----------------------------------------------------------------------------
> 
> This is a Mailman 2.1.9 system.
> 
> I did search the archives quite extensively, but I didn't find any cases
> where Mailman was having trouble encoding the hard coded ' at ' into
> unicode.  

It is not ' at ' itself but its translation that caused this error.
It is strange, though: the language set immediately beforehand should make
the Unicode conversion work.
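
The failure can be reproduced outside Mailman. The following is only a minimal sketch, written for Python 3, where `bytes.decode(charset)` stands in for Python 2's `unicode(s, charset)`. The translated string is a hypothetical UTF-8 encoded Cyrillic rendering of ' at '; the actual catalog entry may differ.

```python
# Hypothetical stand-in for what _(' at ') might return: the UTF-8 bytes of
# a Cyrillic translation. The real Mailman catalog string may differ.
translated = ' на '.encode('utf-8')   # b' \xd0\xbd\xd0\xb0 '

def to_unicode(raw, charset):
    """Python 3 analogue of Python 2's unicode(raw, charset)."""
    return raw.decode(charset)

# Decoding with a charset that cannot represent the bytes raises
# UnicodeDecodeError, which is what leads to the message being shunted.
try:
    to_unicode(translated, 'ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The first offending byte is 0xd0 at position 1, immediately after the leading space, which matches the traceback quoted above.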
> 
> Any ideas on what to try/test/do?

How about this patch as a workaround?

=== modified file 'Mailman/Archiver/HyperArch.py'
--- Mailman/Archiver/HyperArch.py       2007-11-21 05:21:24 +0000
+++ Mailman/Archiver/HyperArch.py       2007-11-25 03:59:07 +0000
@@ -412,7 +412,8 @@
                 otrans = i18n.get_translation()
                 try:
                     i18n.set_language(self._lang)
-                    atmark = unicode(_(' at '), Utils.GetCharSet(self._lang))
+                    atmark = unicode(_(' at '), Utils.GetCharSet(self._lang),
+                                     'replace')
                     subject = re.sub(r'([-+,.\w]+)@([-+.\w]+)',
                               '\g<1>' + atmark + '\g<2>', subject)
                 finally:
@@ -574,7 +575,7 @@
         if mm_cfg.ARCHIVER_OBSCURES_EMAILADDRS:
             otrans = i18n.get_translation()
             try:
-                atmark = unicode(_(' at '), cset)
+                atmark = unicode(_(' at '), cset, 'replace')
                 i18n.set_language(self._lang)
                 body = re.sub(r'([-+,.\w]+)@([-+.\w]+)',
                               '\g<1>' + atmark + '\g<2>', body)
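
The effect of the 'replace' argument can be sketched as follows, again in Python 3 and again with a hypothetical translated string rather than the real catalog entry. Bytes that cannot be decoded become U+FFFD instead of raising an exception, so the message would be archived with a garbled separator rather than shunted.

```python
import re

# Hypothetical translation bytes; the real catalog entry may differ.
translated = ' на '.encode('utf-8')

# Python 3 analogue of unicode(translated, 'ascii', 'replace') in Python 2:
# each undecodable byte is replaced with U+FFFD instead of raising.
atmark = translated.decode('ascii', 'replace')

# Same substitution pattern as in the patch, using raw strings for the
# group references.
body = 'Contact user@example.com for details'
obscured = re.sub(r'([-+,.\w]+)@([-+.\w]+)',
                  r'\g<1>' + atmark + r'\g<2>', body)

print(repr(atmark))
print(obscured)
```

This only avoids the exception; the separator text itself is lost whenever the charset and the translation do not match.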


-- 
Tokio Kikuchi, tkikuchi at is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/

