[Spambayes] Spam at hackers conference

Tim Peters tim.one@comcast.net
Sun Nov 3 07:44:43 2002


[Tim@mail.powweb.com]
> I've *always* suspected that spambayes in combination with other
> technology would present a very powerful anti-spam arsenal.  But
> spambayes by itself is so good, that it may not really require
> supplemental technology.  I say *always* because I've only been in
> this game for a couple weeks... ;)  so what do I REALLY know?

I don't know what to do about opt-in advertising, apart from the obvious:
keep an eye out for it in your Spam folder, and train on it as Ham whenever
it shows up there.  This is effective.
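
A toy sketch of why that retraining helps.  This is not the spambayes API:
ToyClassifier, its methods, and the sample messages are all made up for
illustration, and the probability is a bare frequency ratio rather than
whatever the real classifier computes.  It only shows the direction of the
effect: once a token has been seen in trained ham, its spam probability drops.

```python
from collections import Counter


class ToyClassifier:
    """Hypothetical per-token spam probability tracker (illustration only)."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> number of spam msgs containing it
        self.ham_counts = Counter()   # token -> number of ham msgs containing it
        self.nspam = 0
        self.nham = 0

    def train(self, text, is_spam):
        # Count each token at most once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_counts.update(tokens)
            self.nspam += 1
        else:
            self.ham_counts.update(tokens)
            self.nham += 1

    def spamprob(self, token):
        # Fraction of spam msgs with the token, relative to the same
        # fraction for ham.  Unknown tokens get a neutral 0.5.
        spam_ratio = self.spam_counts[token] / self.nspam if self.nspam else 0.0
        ham_ratio = self.ham_counts[token] / self.nham if self.nham else 0.0
        if spam_ratio + ham_ratio == 0:
            return 0.5
        return spam_ratio / (spam_ratio + ham_ratio)


clf = ToyClassifier()
clf.train("special offer newsletter buy now", is_spam=True)
clf.train("meeting notes for the python sprint", is_spam=False)
before = clf.spamprob("newsletter")

# An opt-in newsletter landed in the Spam folder: train on it as ham.
clf.train("your weekly newsletter from the bookshop", is_spam=False)
after = clf.spamprob("newsletter")

print(before, after)
```

In this toy setup "newsletter" starts out looking like pure spam and moves
toward neutral after a single ham training.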

Very brief msgs from rare correspondents seem also to be a problem, because
lots of spam is also very brief (believe it or not <wink>).

python.org has a very specific problem:  the various mailing lists have
*-request addresses, for administrivia.  Greg currently whitelists the snot
out of those recipients in SpamAssassin, else a significant percentage of
that traffic would be considered spam.  *This* code appears to be less
willing to call it spam than unfiddled SpamAssassin, but it's still the
major source of FPs in my python.org mail tests.  The kind of FP here has
the single word "unsubscribe" or "help" or "confirm 1534232" buried under
10KB of employer-generated HTML disclaimers, or is sent as a reply to a spam
or conference announcement the poster found objectionable, quoted in full.
Making things worse, "subscribe" and "unsubscribe" are themselves
high-spamprob words.
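
For concreteness, a minimal sketch of the recipient check that kind of
whitelisting implies.  This is not Greg's SpamAssassin setup and not spambayes
code: the function name is_request_address and the sample message are
assumptions made up here, and it uses only the standard library's email
package.

```python
from email import message_from_string
from email.utils import getaddresses


def is_request_address(msg):
    """Return True if any To/Cc recipient's local part ends in '-request'."""
    recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, addr in getaddresses(recipients):
        local_part = addr.partition("@")[0]
        if local_part.lower().endswith("-request"):
            return True
    return False


# Hypothetical message, for illustration only.
raw = (
    "From: someone@example.com\n"
    "To: python-list-request@python.org\n"
    "Subject: unsubscribe\n"
    "\n"
    "unsubscribe\n"
)
msg = message_from_string(raw)

# A filter could skip (or soften) scoring when this returns True.
print(is_request_address(msg))
```

A check like this sidesteps the classifier entirely for that traffic, which
is the point of whitelisting: the one useful word in such a msg is itself a
high-spamprob word, so scoring the body has little to go on.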

The FP rate is still very low even with that, but every non-trivial scheme
has a non-zero error rate, and that has to be accepted.