[Mailman-Users] Newbie - many questions and ideas but mostly how to kill the SPAM

Sat May 18 00:15:10 CEST 2002

>>>>> "SW" == Simon Waters <Simon at wretched.demon.co.uk> writes:

    SW> Unlike many Mailman users these are both open lists
    SW> (i.e. anyone can report bugs!), and have various NNTP gateway
    SW> facilities to "gnu.chess". We have a spam double whammy -
    SW> newsgroups and well-known, widely spammed email addresses
    SW> for both lists.

I feel your pain.  python-list at python.org is gatewayed with
comp.lang.python (similarly python-announce-list and c.l.py.announce,
but that's a moderated newsgroup).

    SW> 1. Can a regular expression be fatal rather than just
    SW> requiring moderation? I have a number of rules with 100% hit
    SW> rate - such as trapping all messages flagged as coming from
    SW> open relays. But the mail admins refuse to drop such mail at
    SW> the MTA level, alas.

Not currently (in MM2.1).  I've been thinking about splitting
bounce_matching_headers into three related options:
reject_matching_headers, discard_matching_headers, and
hold_matching_headers.

    SW> 2. How would I write a regular expression in Mailman Privacy
    SW> options to trap Korean characters in the subject line, or use
    SW> of 8-bit characters? Even trapping all subjects containing Ä
    SW> (Capital A umlaut) would be sufficient.

Hmm, that's a good one.  Perhaps (untested) something like [\177-\377]?
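
A minimal sketch of how that character class could be tried out in plain
Python before pasting it into the list configuration. It assumes the
Subject: value is available as raw bytes; the function name has_8bit and
the sample subjects are made up for illustration, and none of this is
Mailman's own code. Note that octal 177 is DEL (0x7F), so a range starting
at 0x80 is the stricter "high bit set" test.

```python
import re

# The class suggested above, written as a bytes pattern:
# octal 177 (0x7F, DEL) through octal 377 (0xFF).
SUGGESTED = re.compile(b"[\177-\377]")

# Stricter variant: only bytes with the high bit set (0x80-0xFF).
HIGH_BIT = re.compile(rb"[\x80-\xff]")


def has_8bit(header_value: bytes) -> bool:
    """Return True if the raw header value contains any byte >= 0x80."""
    return HIGH_BIT.search(header_value) is not None


# Hypothetical sample subjects, for illustration only.
plain = b"Subject: bug in move generator"
umlaut = "Subject: \u00c4rger".encode("latin-1")  # capital A umlaut is 0xC4

print(has_8bit(plain))
print(has_8bit(umlaut))
```

A subject that arrives as an RFC 2047 encoded-word (for example one
starting with =?euc-kr?) is pure 7-bit ASCII on the wire, so it would need
a separate pattern matching the charset name instead.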

    SW> 3. Does better documentation on the regular expression
    SW> handling exist? Currently only one example is given, for a
    SW> "from:" header.  Maybe I'm being thick, but regex regular
    SW> expressions are pretty involved and depend on things like NLS,
    SW> and I'm fairly sure that Mailman is doing a simplified
    SW> version. Or must I read the source?

The definitive guide is in the Python library reference manual:

http://www.python.org/doc/current/lib/module-re.html

Almost all regular expression matches done with listadmin-supplied
regexps are done with the IGNORECASE flag.
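
As a rough illustration of what the IGNORECASE flag means in practice,
here is a small sketch using only the standard re module. The helper
header_matches and the sample header text are hypothetical; this is not
how Mailman itself is structured, only a way to see the flag's effect.

```python
import re


def header_matches(pattern: str, value: str) -> bool:
    """Search `value` for `pattern`, ignoring case as described above."""
    return re.search(pattern, value, re.IGNORECASE) is not None


# Hypothetical header text, for illustration only.
flagged = "X-Warning: message relayed via OPEN-RELAY host"

print(header_matches(r"open[- ]relay", flagged))          # case differs, still matches
print(re.search(r"open[- ]relay", flagged) is not None)   # same search without the flag
```

The practical upshot is that a listadmin pattern does not need to spell
out both cases, e.g. [Ss]pam, when the match is case-insensitive anyway.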

    SW> 6. Including the REGEX in the rejection reason would aid
    SW> writing good regex.

Good idea, but probably post-2.1.

    SW> 7. A regex test would be handy - although I guess you'd have
    SW> to apply it to archive messages or some such to get test data.

Another good idea, but for post-2.1.

    SW> 8. A date sanity check - moderate or bounce misdated messages
    SW> (I guess I can use REGEX if I'm clever), a lot of the spam has
    SW> dates in the future or distant past. We already have archives
    SW> to 2034 and not much of it is useful. Anything more than two
    SW> days ahead of current system time is clearly junk IMHO ;)

:)  MM2.1 will have a setting to "fix" outrageously incorrect dates,
currently defaulting to 15 days before or after the received time.
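
The idea behind such a check can be sketched in a few lines of standard
library Python. This is only an illustration under stated assumptions,
not the MM2.1 code: the name sane_date, the MAX_SKEW constant, and the
choice to fall back to the received time are all made up here, and
`received` is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MAX_SKEW = timedelta(days=15)  # the window mentioned above


def sane_date(date_header: str, received: datetime) -> datetime:
    """Return the parsed Date: value, or `received` if the header is
    unparsable or more than MAX_SKEW away from `received`.

    `received` must be timezone-aware.
    """
    try:
        claimed = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparsable header: fall back to the time we received the message.
        return received
    if claimed.tzinfo is None:
        # A "-0000" zone parses as naive; treat it as UTC for comparison.
        claimed = claimed.replace(tzinfo=timezone.utc)
    if abs(claimed - received) > MAX_SKEW:
        return received
    return claimed


received = datetime(2002, 5, 18, 0, 15, 10, tzinfo=timezone.utc)
print(sane_date("Sun, 01 Jan 2034 00:00:00 +0000", received))  # far future
print(sane_date("Fri, 17 May 2002 22:00:00 +0000", received))  # within window
```

Tightening MAX_SKEW to two days would give the stricter rule proposed in
the question, at the cost of rewriting dates on mail that was merely
delayed in a queue.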

-Barry