Whitelist/verification spam filters

David Mertz, Ph.D. mertz at gnosis.cx
Wed Aug 28 15:55:53 EDT 2002


|I agree.  What if the original email never got to you due to some mail
|server failure along the way?  Would that count as a false positive?
|(Of course not.)

There is a set of messages that arrives at my computer that I consider
legitimate, call it G. There is another set of messages that arrives at
my computer that I consider illegitimate, call it S.

If I use no filtering software, my INBOX consists of G+S. That's the
baseline.  That's what happens if I make no special effort to install
filtering software.

What I'd like to happen when I install filtering software is that
everything in G goes to INBOX and everything in S goes elsewhere (either
/dev/null or somewhere else).

Inasmuch as filtering software directs some members of G to SPAM-FOLDER,
those are false positives.  Inasmuch as filtering software directs some
members of S to INBOX, those are false negatives.
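
As a minimal sketch, the definitions above can be written with Python
sets. The function name and the sample messages are invented for
illustration, and the sketch assumes that anything not delivered to
INBOX went to SPAM-FOLDER.

```python
def filter_errors(legit, spam, inbox):
    """Return (false_positives, false_negatives) for one filtering run.

    legit -- the set G of messages the recipient considers legitimate
    spam  -- the set S of messages the recipient considers illegitimate
    inbox -- the set of messages the filter delivered to INBOX
    Anything not in inbox is assumed to have gone to SPAM-FOLDER.
    """
    false_positives = legit - inbox   # members of G sent to SPAM-FOLDER
    false_negatives = spam & inbox    # members of S sent to INBOX
    return false_positives, false_negatives


# Hypothetical sample data, for illustration only.
G = {"mom", "boss", "python-list"}
S = {"pills", "lottery"}
inbox = {"mom", "boss", "lottery"}

fp, fn = filter_errors(G, S, inbox)
```

With no filter at all, inbox equals G+S, so there are no false positives
and every member of S is a false negative.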

In general, the filtering software cannot be considered in complete
isolation.  For example, in my testing of Pyzor, I consistently find
that about 0.2% of queries time out because of network problems (that
number might be different in your network region than in mine).  If I am
contemplating the use of Pyzor, it doesn't do me much good to consider
how well it would perform in an idealized hypothetical world.  I want to
know how well it will sort MY incoming mail, in real life.  In real
life, I must decide whether to direct timeouts to INBOX or to SPAM--one
decision will give me some false negatives, the other will give me some
false positives.
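
The timeout trade-off can be sketched as simple arithmetic. This is only
an illustration: the function name is hypothetical, the message counts
are made up, and it assumes timeouts strike legitimate and illegitimate
messages at the same rate, which the figures above do not establish.

```python
def timeout_errors(n_legit, n_spam, timeout_rate, send_timeouts_to):
    """Expected (false_positives, false_negatives) caused by timeouts alone.

    Assumes (hypothetically) that a query times out with the same
    probability whether the message is legitimate or not.
    """
    if send_timeouts_to == "INBOX":
        # Timed-out spam lands in INBOX: false negatives only.
        return 0.0, n_spam * timeout_rate
    if send_timeouts_to == "SPAM":
        # Timed-out legitimate mail lands in SPAM-FOLDER: false positives only.
        return n_legit * timeout_rate, 0.0
    raise ValueError("send_timeouts_to must be 'INBOX' or 'SPAM'")


# Made-up volumes; 0.002 is the roughly 0.2% timeout rate mentioned above.
to_inbox = timeout_errors(1000, 1000, 0.002, "INBOX")
to_spam = timeout_errors(1000, 1000, 0.002, "SPAM")
```

Under those assumptions, either policy costs about two messages per
thousand; the choice only decides which kind of error they become.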

Likewise, if I install TMDA, I want G in INBOX and S in SPAM-FOLDER.  In
the real world, there are many things that can cause members of G to go
to SPAM-FOLDER.  Delivery failures with the challenge or confirmation
are one class of reasons.  People who change email accounts, or maintain
multiple ones, are another class.  Business firewalls that block (some)
incoming mail are yet another class.  Forgetful or annoyed
correspondents are another class.  Correspondents who accidentally press
wrong buttons are another class.  Robot mailers from legitimate contacts
are another class.

I don't know exactly how big all those classes I mention are (nor those
that I neglected to mention).  But for every time one of those things
happens, a member of G gets sent to SPAM-FOLDER rather than to INBOX.
In deciding which filtering software to install (if any), it doesn't do
me any good to cast blame somewhere other than on TMDA itself.  Either
way, I don't get a message that I think is legitimate (at least not to
my convenient INBOX... as with other filtering software, I might be able
to manually double check SPAM-FOLDER).

Simple, huh?

Yours, David...

--
---[ to our friends at TLAs (spread the word) ]--------------------------
Echelon North Korea Nazi cracking spy smuggle Columbia fissionable Stego
White Water strategic Clinton Delta Force militia TEMPEST Libya Mossad
---[ Postmodern Enterprises <mertz at gnosis.cx> ]--------------------------