[Mailman-Users] Expressions to reduce spam

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Mon Nov 28 21:46:41 EST 2016


Cyndi Norwitz writes:

 > As for "everyone should learn regular expressions"...

Not everyone.  Just *some* list admins whose security professionals
are unresponsive, or who are their own security admins.

 > Sure, maybe. But I think it’s overkill.  I mean I don’t require
 > all my soap customers to learn the chemistry of saponification.
 > I mean, when I started making websites back in the mid-1990s, I
 > hand-coded in HTML.  I could switch to my browser to see where I
 > messed up ('cause you will always mess up), go back, fix, try again.
 > Now I use Dreamweaver.

Your metaphors are not valid.  Selling soap and creating HTML are
cooperative activities.  Both of you *want* simple, and the
combination of soap and customer's skin, or you with Dreamweaver and a
reader with a reasonably modern browser, gives a win-win outcome.  But
spamming is a zero-sum game, and spammers are intelligent opponents,
not reducible to counting the number of electrons in the outermost
orbit.  There *may* be ways to simplify, but they will be rare, and
spammer- and list-specific.

The most important point is that, while human beings can reliably
recognize spam before they have time to think about why they think
it's spam, computers can't do that at all.  They have to specifically
apply heuristic rules, all of which frequently fail very badly.  Each
rule frequently allows whole streams of spam through, and sometimes
identifies whole streams of authentic mail as spam.  That's why the
effective applications like SpamAssassin and SpamBayes use scoring of
many rules, and site-specific tweaks to scores, rather than using
litmus tests that trigger a discard on one feature as Mailman does.
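
To make the contrast concrete, here is a minimal sketch in Python.  It
is not SpamAssassin's, SpamBayes's, or Mailman's actual code: the
rules, weights, threshold, and sample messages are all hypothetical,
invented only to show how a single-feature test and a weighted score
can disagree about the same message.

```python
import re

# Hypothetical rules and weights, invented for illustration only.
RULES = [
    (re.compile(r"\bfree money\b", re.I), 2.5),
    (re.compile(r"\bclick here\b", re.I), 1.5),
    (re.compile(r"https?://\S+"), 0.5),
    (re.compile(r"!!!+"), 1.0),
]
THRESHOLD = 3.0  # assumed cutoff; a real site would tune this


def litmus_discard(text):
    """Discard as soon as one chosen feature ("click here") is present."""
    return RULES[1][0].search(text) is not None


def score(text):
    """Add up the weights of every rule that matches."""
    return sum(weight for pattern, weight in RULES if pattern.search(text))


def scored_discard(text):
    """Discard only when the combined evidence reaches the threshold."""
    return score(text) >= THRESHOLD


authentic = "The agenda is posted; click here: https://example.org/agenda"
spam = "FREE MONEY!!! Click here now: http://example.com/win"

for label, text in (("authentic", authentic), ("spam", spam)):
    print(label, litmus_discard(text), score(text), scored_discard(text))
```

In this sketch the litmus test discards both messages, while the
scored version keeps the authentic one because a single weak signal
does not reach the threshold.  The weights still have to come from
somewhere, which is where the site-specific tweaking comes in.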

There is a clear and present danger that each simple litmus test will
throw away authentic posts without stopping enough spam to make that
risk worthwhile.  That danger is why we prefer to advise each admin
who needs an apparently simple rule individually, and to set a high
bar (regexps) for DIY spam-fighting.  Providing a way to configure
simple rules would most likely be an attractive nuisance.  But people
who have the knowledge and experience to use regexps not only can
write more sophisticated (though still risky) rules, but also have
experienced the limits of automated filters.
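
As an illustration of how an apparently simple rule goes wrong, here
is a small Python sketch.  The patterns and sample texts are
hypothetical, chosen only to show the failure mode; they are not
recommended filter rules.

```python
import re

# A "simple" substring rule, and a slightly more careful version that
# uses word boundaries.  Both are hypothetical examples.
naive = re.compile(r"cialis", re.I)
bounded = re.compile(r"\bcialis\b", re.I)

authentic = "Ask a specialist about the socialist reading group."
spam = "Cheap Cialis, no prescription"
evasive = "Cheap C1alis, no prescription"  # trivial spammer workaround


def matches(pattern, text):
    """Return True if the pattern occurs anywhere in the text."""
    return pattern.search(text) is not None


for label, text in (("authentic", authentic), ("spam", spam), ("evasive", evasive)):
    print(label, matches(naive, text), matches(bounded, text))
```

The naive rule throws away the authentic post, because "specialist"
and "socialist" both contain the target string.  The bounded rule
avoids that, but neither catches the trivially altered spelling, so
the risk of false positives buys very little.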

We could make it easier, but it's not at all clear to me that we
should.

Steve


