[Mailman-Developers] New Pipermail hacks (was Re: Ok, it works! ...)

Fri, 26 Oct 2001 12:01:20 +0900

Barry, thanks lots for the great work!  I love how the email module
has turned out, and the function names you chose really ended up
making sense.

Anyway, I installed the latest mailman CVS and the email module from
the misc/ directory, and successfully created a list with the new
install.

Here's what happened when I posted a message with a GIF, an HTML
part, and a JPEG part (in that order) to the list:

[root@nausicaa:/usr/local/mailman/archives/private/test]# ls -l attachments
total 16
-rw-rw-rw-    1 qmaild   qmail       10404 Oct 26 11:27 attachment-0001.gif
-rw-rw-rw-    1 qmaild   qmail          29 Oct 26 11:27 attachments.pck

[^^^^^^^^ NOTE NOTE NOTE: These arguably should *not* be mode 666. :) ]

[root@nausicaa:/usr/local/mailman/archives/private/test]# cat index.html

[SNIP]
     <P>Currently, there are no archives. </P>
[SNIP]

The offending message is at:

http://nausicaa.interq.or.jp/pipermail/test.mbox/test.mbox

I found the bug; you put in code to ignore all HTML attachments
(whether this is a good idea or not is up to you ;) but the code
in Scrubber.process() suggests that you started coding in an
'outer' attachment detection, but didn't fully implement it.

Here's a patch that actually throws out all-HTML emails, but just
removes HTML parts.  

Actually, why don't we just decode HTML attachments like any other,
and let the user beware if they want to click on it?  There are lots
of legitimate reasons to allow HTML attachments.  I can't think of any
to allow all-HTML messages. *grin* We could treat all-HTML messages in
the same way, just provide a link and let the user beware if they
click on it.

The patch also adds a filename to the replacement payload, so that
users can have an idea of what they're going to see if a description
was not provided (VERY common).  

Finally (and this part of the patch, I'm not quite sure if it's the
right solution), we add http://mlist.host_name to the beginning of the
URL returned by Scrubber.save_attachment.  Why?  Because pipermail
sees the string "/pipermail/listname/attachments/attachment-0001.gif"
and doesn't (of course) realize it's a URL!

The patch does not address the mode 666 issue.  I don't know where it
should be set, but you need to make the umask set to make these files
not be readable/writable by others..

Anyway, the output should be cleaned up a bit, but Barry, this is
a great leap forward for Mailman's MIME handling!  I'll be working
on this more later.  Thanks a lot!

Index: Scrubber.py
===================================================================
RCS file: /cvsroot/mailman/mailman/Mailman/Handlers/Scrubber.py,v
retrieving revision 2.1
diff -u -r2.1 Scrubber.py

--- Scrubber.py	2001/10/25 04:10:23	2.1
+++ Scrubber.py	2001/10/26 03:00:41
@@ -60,7 +60,6 @@
 
 
 def process(mlist, msg, msgdata=None):
-    outer = 1
     for part in msg.walk():
         # If the part is text/plain, we leave it alone
         if part.get_type('text/plain') == 'text/plain':
@@ -70,7 +69,7 @@
         # whole message is HTML, just discard the entire thing.  Otherwise,
         # just add an indication that the HTML part was removed.
         if part.get_type() == 'text/html':
-            if outer:
+            if not msg.is_multipart():
                 raise DiscardMessage
             part.set_payload(_("An HTML attachment was scrubbed and removed"))
         # If the message isn't a multipart, then we'll strip it out as an
@@ -82,9 +81,11 @@
             size = len(payload)
             url = save_attachment(mlist, part)
             desc = part.get('content-description', _('not available'))
+            filename = part.get_filename(_('not available'))
             part.set_payload(_("""\
 A non-text attachment was scrubbed...
 Type: %(ctype)s
+Name: %(filename)s
 Size: %(size)d bytes
 Desc: %(desc)s
 Url : %(url)s
@@ -155,5 +156,5 @@
     fp.write(decodedpayload)
     fp.close()
     # Now calculate the url
-    url = mlist.GetBaseArchiveURL() + '/attachments/' + file + ext
+    url = 'http://' + mlist.host_name + mlist.GetBaseArchiveURL() + '/attachments/' + file + ext
     return url


-- 
Brought to you by the letters W and B and the number 14.
"It should be illegal to yell 'Y2K' in a crowded economy."
Debian GNU/Linux maintainer of Gimp and GTK+ -- http://www.debian.org/