[Spambayes] Re: Spam Filter Based on Bayesian Techniques

dont bother dontbotherworld at yahoo.com
Tue Jan 27 16:23:23 EST 2004


Hey Tony and Matthew,
Thanks for responding!

--- Tony Meyer <tameyer at ihug.co.nz> wrote:
> > I was wondering if anyone could point me in an
> > interesting direction in spam filtering which may be
> > new and which I could implement as part of my course.
> > For example: any kind of performance evaluation, etc.
> 
> Is the focus on implementing, or testing?  There's

Well, nothing too stringent. I would like to do some
genuinely useful work. It can be implementing or testing,
though I guess testing would be easier for me, as I
have only just started playing with Python and have not
looked much at the SpamBayes code.

> an almost limitless
> amount of testing that you could do just with
> SpamBayes, along with lots of
> scripts to give you meaningful (hopefully <wink>)
> results.
> 
> One thing you could consider (which is topical
> around here at the moment) is
> implementing forms of self-training, where the
> classifier trains itself,
> based on its guesses.  There's a heap of stuff in
> the archives (start with
> spambayes-dev) that could get you started.

Yes, that's an interesting idea. I will go and look
through the archives right now and then write my
project proposal. Could you shed some more light on
the self-training idea?
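
A minimal sketch of the self-training idea, to make the question
concrete. It assumes only a classifier object with spamprob(tokens)
and learn(tokens, is_spam) methods, as in the steps quoted further
down. The self_train function, its parameters, and the 0.9 / 0.1
cutoffs are hypothetical names and values chosen for illustration,
not anything taken from SpamBayes. Training on the classifier's own
guesses can reinforce its mistakes, so only confident scores are
used and the rest are handed back for a human to label.

```python
def self_train(classifier, tokenize, messages,
               spam_cutoff=0.9, ham_cutoff=0.1):
    """Train on the classifier's own confident guesses.

    The cutoffs are illustrative assumptions. Messages that score
    between them are returned untouched so they can be labelled by hand.
    """
    unsure = []
    for text in messages:
        tokens = list(tokenize(text))
        score = classifier.spamprob(tokens)
        if score >= spam_cutoff:
            # Confident spam guess: feed it back as a spam example.
            classifier.learn(tokens, True)
        elif score <= ham_cutoff:
            # Confident ham guess: feed it back as a ham example.
            classifier.learn(tokens, False)
        else:
            # Not confident either way: do not train on it.
            unsure.append(text)
    return unsure
```

An obvious thing to measure would be how the error rate changes as
the cutoffs are moved closer together or further apart.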

Matt:
By the way, I was also at the MIT Spam Conference 2004,
where they discussed the issue of hyperlinks in
spam.

> 
> > In this case can someone point me to an
> > architecture of the program and components, since I could
> > not find any paper describing these.
> 
> Well, since there are so many different SpamBayes applications,
> this would be pretty complicated.  Basically, though, they all
> go along the lines of:
>  1. Get email as text, somehow.
>  2. Put this through Tokenizer.tokenize
>  3. Put the result of that through Classifier.spamprob
> That gives you your classification.  To train, it's basically
> the same except step 3 is replaced with putting the result
> through Classifier.learn or Classifier.unlearn.
> 
> =Tony Meyer


Thanks




