Spam collection
Aidan Finn
aidan.finn at ucd.ie
Tue May 1 06:24:05 EDT 2001
In article <pkvH6.89$Qj7.5830 at news.get2net.dk>, "Mikkel Rasmussen"
<footech at get2net.dk> wrote:
> My spam filter "idea" is to use keywords, because I use Outlook and
> Outlook does not give any other possibilities (as far as I know). The
> problem is in choosing the best keywords ...
You might try the rainbow text classifier
(http://www.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html)
to discover the most informative words for junk e-mail.
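To give a rough idea of what "most informative words" means, here is a
minimal sketch in Python. It is not rainbow itself and not taken from any
of the papers; it just ranks words by information gain about the spam /
not-spam label, assuming you have two hand-sorted lists of message texts.
The function names (entropy, informative_words) and the whitespace
tokenising are my own assumptions for illustration.

```python
import math
from collections import Counter


def entropy(pos, neg):
    """Binary entropy (in bits) of a pile with pos/neg message counts."""
    total = pos + neg
    h = 0.0
    for n in (pos, neg):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h


def informative_words(spam_docs, ham_docs, top_n=10):
    """Rank words by information gain about the spam/ham label.

    Hypothetical helper: each doc is a plain string, split on whitespace.
    A word counts once per message (presence/absence, not frequency).
    """
    n_spam, n_ham = len(spam_docs), len(ham_docs)
    total = n_spam + n_ham
    if total == 0:
        return []

    # Number of messages in each class that contain the word.
    spam_df = Counter(w for d in spam_docs for w in set(d.lower().split()))
    ham_df = Counter(w for d in ham_docs for w in set(d.lower().split()))

    base = entropy(n_spam, n_ham)
    scores = {}
    for w in set(spam_df) | set(ham_df):
        s_with, h_with = spam_df[w], ham_df[w]
        s_without, h_without = n_spam - s_with, n_ham - h_with
        n_with = s_with + h_with
        n_without = total - n_with
        # Expected entropy of the label once we know whether w is present.
        cond = (n_with / total) * entropy(s_with, h_with) \
            + (n_without / total) * entropy(s_without, h_without)
        scores[w] = base - cond

    # Highest gain first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

The top few words from a ranking like this would be candidates for the
Outlook keyword rules. Note that a high score only says the word separates
the two piles well; check which pile it points to before using it.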
There are some papers on using Bayesian classification to do this kind of
filtering. The paper "A Bayesian Approach to Filtering Junk E-Mail" and
Kushmerick's AdEater system spring to mind. If you're interested, these can
probably be found on CiteSeer (http://citeseer.nj.nec.com/cs).
Let me know if this is useful.
AF