[Spambayes] full o' spaces

Tim Stone - Four Stones Expressions tim at fourstonesExpressions.com
Sat Mar 8 09:25:21 EST 2003


3/8/2003 2:45:00 AM, Anthony Baxter <anthony at interlink.com.au> wrote:

>We can sit here for days, weeks and months and think of ways to defeat
>the existing classifier. We have done that, in the past. But a change that
>is not tested and shown to improve existing results, does _not_ belong 
>in the code base. It goes against _everything_ that has made this project 
>successful. 

Ok, so let me summarize what I think our discussion has boiled down to.

1. We will not make changes that regress our results on existing spam. 

2. We will engage in ongoing analysis of spam, keeping our testing corpora up 
to date as best we can.  When a significant number of false negatives (FN) 
start getting through (we have yet to define "significant"), we will adjust 
the tokenizer accordingly.

Point 1 is a given.  There seems to be considerable inertia in the project 
toward using point 2 as an ongoing strategy.  I can live with it, because 
there's tremendous value in what we're doing, and it really does work.  I just 
have to say, though, that from a marketing viewpoint (believe it or not, I was 
a marketer in a former life), this strategy could shoot us in the foot.  We 
aren't the ones finding the problems; the spammers are, and I think that could 
cause our users to lose faith in our product:  "I trained this stuff as spam, 
and this thing STILL doesn't catch it."  If that happens to a user more than a 
few times, the conclusion will be that it doesn't work.  I'm telling you, it 
takes only one bad article in a ZD publication, and it's all over for us.

Ok, I'm off my soapbox. <smile>  This has been a great discussion.


c'est moi - TimS
http://www.fourstonesExpressions.com
http://wecanstopspam.org
