[Mailman-Users] Using spam filters on subject line

Thu Nov 30 07:39:56 CET 2006

Bob McLeran writes:

 > We're trying to stop emails with the string "test" - and that works with 
 > the filter subject: test currently. What I'd like to do is to be able to 
 > create a rule that would allow a legitimate subject line, like "testing 
 > NMEA circuit" which includes the string "test" to go through without 
 > being blocked

 > It doesn't matter that every email with the string "test" is initially 
 > stopped; we want to then take the legitimate subject lines to create 
 > something (a rule, a pass-through filter, whatever it is called) to 
 > allow subsequent emails with that subject to go to the list without 
 > administrator intervention.

You need a Mailman version recent enough that the "subject: test"
style filters are labelled "legacy anti-spam filters" in the
Privacy > Spam Filters section of the admin pages.

First, get rid of those legacy filters.  I'm not sure how they
interact with the new header filters.  Then create Rule 1 with regexp
"^Subject: Testing NMEA circuit$" and action "Accept".  Next an
optional Rule 2 with regexp "^Subject: test$" and action "Reject".
Finally, Rule 3 with regexp "^Subject:.*test" and action "Hold".

These regexps do the following.  First, the subject "Testing NMEA
circuit" exactly (ignoring case, but with no whitespace changes, "Re:"
or anything like that) is automatically passed.  If that didn't match,
then anything with the exact subject "test" is rejected (i.e., returned
to sender).  If there is still no match, any other subject containing
"test" is held for admin action.  (Only the last match is "fuzzy".)
Finally, the default action for your list (presumably "Accept") is
taken if none of the above matched.
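
As a rough illustration of that first-match-wins ordering, here is a
minimal Python sketch.  It is not Mailman's code: the RULES list and
classify() are hypothetical names, and I'm assuming each regexp is
searched case-insensitively against the header text, line by line.

```python
import re

# Hypothetical (regexp, action) pairs mirroring Rules 1-3 above.
RULES = [
    (r"^Subject: Testing NMEA circuit$", "accept"),
    (r"^Subject: test$", "reject"),
    (r"^Subject:.*test", "hold"),
]

# Assumption: matching ignores case and "^"/"$" anchor at line boundaries.
FLAGS = re.IGNORECASE | re.MULTILINE


def classify(headers, default="accept"):
    """Return the action of the first rule that matches, else the default."""
    for pattern, action in RULES:
        if re.search(pattern, headers, FLAGS):
            return action
    return default


for subject in ("testing NMEA circuit", "Test",
                "Re: Testing NMEA circuit", "lunch on Friday"):
    print(subject, "->", classify("Subject: " + subject))
```

Note that "Re: Testing NMEA circuit" falls through to the hold rule,
because the strict Rule 1 does not tolerate the "Re:".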

More likely, rule 1 should be "^Subject:.*Testing NMEA circuit", which
would allow "Re:" and your mailing list "[Short circuits]" prefix,
plus suffixes like "-- another failure".
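
If you want to check the difference between the strict and relaxed
forms of rule 1 before putting them in the admin page, something like
the following Python sketch should do.  The sample subjects and the
compare() helper are made up for illustration, and the flags are my
assumption about how the match is done.

```python
import re

# Assumption: case-insensitive, line-anchored matching.
FLAGS = re.IGNORECASE | re.MULTILINE
STRICT = re.compile(r"^Subject: Testing NMEA circuit$", FLAGS)
RELAXED = re.compile(r"^Subject:.*Testing NMEA circuit", FLAGS)

# Illustrative headers only.
SAMPLES = [
    "Subject: Testing NMEA circuit",
    "Subject: Re: [Short circuits] Testing NMEA circuit",
    "Subject: Testing NMEA circuit -- another failure",
    "Subject: Testing the bilge pump",
]


def compare(header):
    """Return (strict_hit, relaxed_hit) for one header line."""
    return bool(STRICT.search(header)), bool(RELAXED.search(header))


for header in SAMPLES:
    strict_hit, relaxed_hit = compare(header)
    print(f"{header!r}: strict={strict_hit} relaxed={relaxed_hit}")
```

Only the first sample satisfies the strict form; the relaxed form also
accepts the "Re:"/prefix and suffix variants, and neither accepts the
unrelated "Testing ..." subject.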

To add new approved subjects, just add them one per line to rule 1:

^Subject:.*Testing NMEA circuit
^Subject:.*Testing YaddaYadda circuit

You can get fancy with the regexp if you like, but that makes it more
tedious and error-prone to remove approvals (either because you want
the thread to die or for neatness reasons long after it's relevant).
So I recommend one regexp per thread.
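
To show why one regexp per line keeps removal simple, here is a small
Python sketch.  RULE_1, rule_matches() and drop_pattern() are
hypothetical names of mine, not anything from Mailman; the idea is only
that a rule is a block of text with one pattern per line, and that
retiring a thread means deleting exactly one line.

```python
import re

# The text you would paste into rule 1, one approved thread per line.
RULE_1 = """\
^Subject:.*Testing NMEA circuit
^Subject:.*Testing YaddaYadda circuit
"""


def rule_matches(rule_text, headers):
    """Return True if any non-blank line of rule_text matches the headers."""
    flags = re.IGNORECASE | re.MULTILINE  # assumed matching behaviour
    return any(
        re.search(line, headers, flags)
        for line in rule_text.splitlines()
        if line.strip()
    )


def drop_pattern(rule_text, pattern):
    """Remove one approval by deleting its line; other lines are untouched."""
    kept = [line for line in rule_text.splitlines() if line != pattern]
    return "\n".join(kept) + "\n"


header = "Subject: Re: Testing YaddaYadda circuit"
print(rule_matches(RULE_1, header))
trimmed = drop_pattern(RULE_1, "^Subject:.*Testing YaddaYadda circuit")
print(rule_matches(trimmed, header))
```

With a single combined regexp you would have to edit the expression
itself to retire a thread, which is where the mistakes creep in.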

Note that the optional reject rule 2 will return the message to sender.
I don't think there's any easy way to configure the rejection message
text, but I would guess that your users will learn to go "oops, bad
subject" when they get one.

I can't go into the details of fuzzy matching (regexps) here, but
there are some details of how this works I should mention.  A mail
message containing attachments or multimedia will have headers
embedded in the body, hidden by your MUA.  Those headers are also
checked (but they rarely include "Subject", except in the case of a
forwarded message).  Also, due to the way mail headers work, you
cannot depend on whitespace being "what you see is what you get".  I
believe that the match is against the full header as decoded (e.g.
non-ASCII subjects in foreign languages), after continuation lines
have been joined.  As far as
efficiency of testing multiple regexps goes, it's almost surely not a
worry.
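
To make the whitespace and decoding point concrete, here is a Python
sketch using only the standard library.  It shows what header folding
and RFC 2047 encoding do to a Subject on the wire, not what Mailman
does internally; unfold() and decode_words() are my own helper names.

```python
import re
from email.header import decode_header, make_header


def unfold(header_text):
    """Join continuation lines: drop a line break followed by whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", header_text)


def decode_words(value):
    """Decode RFC 2047 encoded words such as =?utf-8?q?...?= into text."""
    return str(make_header(decode_header(value)))


PATTERN = re.compile(r"^Subject: Testing NMEA circuit$",
                     re.IGNORECASE | re.MULTILINE)

# The same subject as it might appear on the wire, folded over two lines.
folded = "Subject: Testing NMEA\n circuit"
print(bool(PATTERN.search(folded)))          # raw, still folded
print(bool(PATTERN.search(unfold(folded))))  # after joining the lines

# The same subject sent as an encoded word.
encoded_value = "=?utf-8?q?Testing_NMEA_circuit?="
print(bool(PATTERN.search("Subject: " + decode_words(encoded_value))))
```

The strict pattern misses the folded form and only matches once the
continuation line has been joined, which is why you shouldn't count on
the whitespace you see in your MUA.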