[Spambayes] Re: Practical applications

David Eppstein eppstein@ics.uci.edu
Sat, 21 Sep 2002 23:47:47 -0700


In article <LNBBLJKPBEHFEDALKOLCMEMPBFAB.tim.one@comcast.net>,
 Tim Peters <tim.one@comcast.net> wrote:

> > Or, I could be sued for patent infringement   :-(
> 
> Note that MS's patent is on Vector Support Machine anti-spam technology.
> We're not anywhere close to that, and Bayesian learning techniques have been
> in the open literature for more than 40 years.

That would be Support Vector Machines.  They are one of the standard 
techniques for classification problems such as this one.  I haven't 
looked at the patent, and I'm not a classification expert (we have 
other people in my dept. who are), but the only possible new idea 
(which I doubt originated with MS, and which is very obvious anyway) 
would be the same one you're using here: apply mildly sophisticated 
but standard classification algorithms, instead of ad-hoc pattern 
matching, to the specific area of spam detection.

Speaking of which, has anyone tried boosting?  You should be able to 
plug it in on top of other methods such as the one you're using here.
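
To make the boosting suggestion concrete, here is a minimal sketch, 
assuming AdaBoost as the boosting variant and a deliberately crude weak 
learner (a "stump" that votes on the presence of a single token).  None 
of this is Spambayes code: the function names, the toy messages, and 
the choice of stumps are all hypothetical, and in practice the weak 
learner could be replaced by whatever classifier is already in use.

```python
import math


def stump(tokens, token, polarity):
    """Weak learner (illustrative): vote `polarity` if token is present."""
    return polarity if token in tokens else -polarity


def train_adaboost(messages, labels, rounds=3):
    """AdaBoost over single-token stumps.

    messages: list of sets of tokens
    labels:   list of +1 (spam) or -1 (ham)
    Returns a list of (alpha, token, polarity) weighted stumps.
    """
    n = len(messages)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*messages))
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for token in vocab:
            for polarity in (1, -1):
                err = sum(w for w, msg, lab in zip(weights, messages, labels)
                          if stump(msg, token, polarity) != lab)
                if best is None or err < best[0]:
                    best = (err, token, polarity)
        err, token, polarity = best
        if err >= 0.5:
            break  # no stump beats chance on the current weighting
        err = max(err, 1e-10)  # avoid log of infinity on a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, token, polarity))
        # Raise the weight of misclassified messages, lower the rest.
        weights = [w * math.exp(-alpha * lab * stump(msg, token, polarity))
                   for w, msg, lab in zip(weights, messages, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, tokens):
    """Sign of the alpha-weighted vote: +1 means spam, -1 means ham."""
    score = sum(a * stump(tokens, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data, invented purely for illustration.
messages = [
    {"free", "money"}, {"free", "offer"}, {"money", "offer"},
    {"meeting", "free"}, {"python", "code"}, {"lunch", "meeting"},
]
labels = [1, 1, 1, -1, -1, -1]
model = train_adaboost(messages, labels, rounds=3)
predictions = [predict(model, m) for m in messages]
```

No single stump separates the toy data, but the weighted vote of three 
of them does, which is the point of layering boosting on top of a weak 
method.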

-- 
David Eppstein       UC Irvine Dept. of Information & Computer Science
eppstein@ics.uci.edu http://www.ics.uci.edu/~eppstein/