Whitelist/verification spam filters

David LeBlanc whisper at oz.net
Tue Aug 27 21:02:00 EDT 2002


I think a whitelist has its uses, but I don't think that rejecting mail is
one of them. I think its most appropriate use is to quickly identify wanted
emails and pass on other tasks to other filters in the processing chain.

I see this chain as looking like this:

  * Whitelist:          mail I'm sure I do want.
  * Blacklist:          mail I'm sure I do not want.
  * Content Filter:     sort mail into appropriate folders based on content.
  * Content Processor:  custom processing of mail, such as for a mailing list.

Whitelist and Blacklist are for speed in winnowing the known endpoints;
Content Filter is where Graham-Bayes and/or naive Bayes (as ifile uses)
filtering comes in. (Content Processor is really just a placeholder for a
general idea at this point.)
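
As a rough illustration of the chain described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the addresses, the pattern, the folder names, and the function names are hypothetical, and the content filter is only a keyword stand-in for a real Bayesian classifier. Each stage either names a folder or returns None to pass the message on to the next stage.

```python
import re

# Hypothetical example lists; the address and pattern are placeholders.
WHITELIST = {"friend@example.com"}
BLACKLIST_PATTERNS = [re.compile(r"@spam\.example$")]


def whitelist(sender, body):
    """Accept mail from known-good senders; otherwise pass it on."""
    return "inbox" if sender in WHITELIST else None


def blacklist(sender, body):
    """File mail from known-bad senders as trash; otherwise pass it on."""
    if any(p.search(sender) for p in BLACKLIST_PATTERNS):
        return "trash"
    return None


def content_filter(sender, body):
    """Stand-in for a Bayesian classifier: a crude keyword check."""
    return "python-list" if "python" in body.lower() else "unsorted"


# Order matters: the cheap list lookups run before the content filter.
CHAIN = [whitelist, blacklist, content_filter]


def route(sender, body):
    """Return the folder chosen by the first stage that makes a decision."""
    for stage in CHAIN:
        folder = stage(sender, body)
        if folder is not None:
            return folder
    return "unsorted"
```

Note that the whitelist stage never rejects anything in this sketch: a sender who is not on the list simply falls through to the later stages.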

What I have not mentioned, and what is the topic of this thread, is how the
lists and filter are populated. I think that either a manual system (i.e.
inserting the appropriate email address or regular expression into a list, or
identifying the correct folder for the content filter) or an automated system
such as Edward Reem uses/espouses may work in different situations and for
different people, according to preference. The content filter automatically
learns to improve its filtering based on the number of messages it has seen
and how the user has distributed them to the various folders (from which it
should learn).
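
To make the learning idea concrete, here is a small sketch of a content filter that learns from the folders a user files messages into. It is a plain multinomial naive Bayes classifier with add-one smoothing, written for illustration only; it is not Graham's scheme and not how ifile is actually implemented, and the class and method names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class FolderClassifier:
    """Toy multinomial naive Bayes over folders, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # folder -> word -> count
        self.message_counts = Counter()          # folder -> messages seen
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def learn(self, text, folder):
        """Record that the user filed this message in the given folder."""
        words = self.tokenize(text)
        self.word_counts[folder].update(words)
        self.message_counts[folder] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the most probable folder, or None before any training."""
        if not self.message_counts:
            return None
        total_messages = sum(self.message_counts.values())
        # One extra slot is reserved for words never seen in training.
        vocab_size = len(self.vocabulary) + 1
        words = self.tokenize(text)
        best_folder, best_score = None, -math.inf
        for folder, seen in self.message_counts.items():
            counts = self.word_counts[folder]
            total_words = sum(counts.values())
            # Log prior for the folder plus smoothed log likelihoods.
            score = math.log(seen / total_messages)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_folder, best_score = folder, score
        return best_folder
```

Each time the user moves a message to a folder, a mail client could call learn() with the message text and that folder, so that later calls to classify() reflect the user's own filing habits.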

David LeBlanc
Seattle, WA USA

More information about the Python-list mailing list