Spam collection

Aidan Finn aidan.finn at ucd.ie
Tue May 1 06:24:05 EDT 2001


In article <pkvH6.89$Qj7.5830 at news.get2net.dk>, "Mikkel Rasmussen"
<footech at get2net.dk> wrote:


> My spam filter "idea" is to use keywords, because I use Outlook and
> Outlook does not give any other possibilities (as far as I know). The
> problem is in choosing the best keywords ...

You might try the rainbow text classifier
(http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html)
to discover the most informative words for junk e-mail.
There are some papers on using Bayesian classification to do this kind of
filtering. The paper "A Bayesian Approach to Filtering Junk E-Mail" and
Kushmerick's AdEater system spring to mind. If you're interested, these can
probably be found on CiteSeer (http://citeseer.nj.nec.com/cs).
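
If you want a quick feel for what "most informative words" means before
trying any of the above, here is a rough sketch in Python. It is not how
rainbow or the cited paper selects words; it just ranks each word by a
smoothed log-odds of showing up in junk versus legitimate mail. The
function names and the tiny sample messages are made up for illustration,
and you would feed it your own sorted mail instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Return the set of distinct lowercase words in one message."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def informative_words(spam_msgs, ham_msgs, top_n=10):
    """Rank words by smoothed log-odds of appearing in spam versus ham.

    A positive score means the word is more typical of spam, a negative
    score means it is more typical of legitimate mail.
    """
    # Document frequency: in how many messages of each class a word occurs.
    spam_df = Counter(w for msg in spam_msgs for w in tokenize(msg))
    ham_df = Counter(w for msg in ham_msgs for w in tokenize(msg))

    scores = {}
    for word in set(spam_df) | set(ham_df):
        # Add-one smoothing keeps unseen words from giving a zero probability.
        p_spam = (spam_df[word] + 1) / (len(spam_msgs) + 2)
        p_ham = (ham_df[word] + 1) / (len(ham_msgs) + 2)
        scores[word] = math.log(p_spam / p_ham)

    # Highest score first; break ties alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sample data, only to show the call.
SAMPLE_SPAM = ["Free money now", "Win free prizes now", "Free offer click here"]
SAMPLE_HAM = ["Meeting at noon tomorrow", "Python list digest",
              "Lunch tomorrow at noon"]

if __name__ == "__main__":
    for word, score in informative_words(SAMPLE_SPAM, SAMPLE_HAM, top_n=5):
        print(f"{word:10s} {score:+.2f}")
```

The top-scoring words are candidates for Outlook keyword rules. With a
real mailbox you would want far more messages per class than this, since
scores from a handful of messages are mostly noise.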


Let me know if this is useful.

AF



More information about the Python-list mailing list