[Spambayes] Query on how SB trains using IMAP

Richard Busby richardb at buzzy.info
Wed May 16 16:33:09 CEST 2007


Hi all,

I've been setting up SpamBayes for my IMAP server, and I have a question
about the filtering mechanism that I can't find an answer to in the FAQ
or the list archives. If this has been asked before, my apologies in
advance - and please point me to where the answers are :)

Setup:
I have a Linux host running Exim and Courier-IMAP. I mainly use the
web-based SquirrelMail to read my mail. sb_imapfilter is scheduled to
run hourly with the "-c -t -v" flags (so: classify, train, verbose).

Exim delivers all mail to my Inbox.
The Junk folder contains some samples of known spam.
The Other folder contains known ham.

SpamBayes is set to train with the Other folder as ham, and the Junk
folder as spam. It's set to filter the Inbox, moving spam to the Junk
folder and unsure messages to "JunkUnsure".

Phew!

What I don't understand is how SB avoids being self-reinforcing. For
example, messageA arrives in the Inbox. SpamBayes classifies it as spam
and moves it to the Junk folder.

When SpamBayes next runs, messageA is now in the Junk folder. Because the
Junk folder is a training folder, does SpamBayes now assume that messageA
is an example of spam that I've put there? Or does it track the fact that
SpamBayes itself moved the message to Junk, and therefore it's not a
message that I gave it as an example?
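
To make the two possibilities concrete, here is a toy Python sketch. It
is not SpamBayes code: Message, placed_by and spam_training_set are names
I made up for illustration, and I don't know which of the two behaviours
sb_imapfilter actually implements.

```python
from collections import namedtuple

# Hypothetical model of a message sitting in the Junk folder.
# placed_by is "user" if I filed it there by hand, or "filter" if the
# classifier moved it there on an earlier run.
Message = namedtuple("Message", ["msg_id", "placed_by"])


def spam_training_set(junk, trust_own_moves):
    """Return the ids a training pass over Junk would learn as spam.

    trust_own_moves=True  -> everything in Junk is treated as an example
                             (the self-reinforcing case I'm worried about).
    trust_own_moves=False -> only messages the user filed are learned.
    """
    if trust_own_moves:
        return [m.msg_id for m in junk]
    return [m.msg_id for m in junk if m.placed_by == "user"]


junk = [Message("sample1", "user"), Message("messageA", "filter")]

naive = spam_training_set(junk, trust_own_moves=True)
tracking = spam_training_set(junk, trust_own_moves=False)

print(naive)     # messageA is learned as spam on the filter's own say-so
print(tracking)  # messageA is skipped because the filter put it there
```

In the first case a single misclassification would feed back into the
database on the next hourly run; in the second it would not.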

If messageA is actually ham, does moving it to the "Other" folder and
re-running with "-t" update the database to record that I've identified
the message as ham? I realise that, according to the documentation, the
message may still carry the x-spambayes-classification of spam.

Alternatively, should I avoid running with the "-t" flag on every
scheduled run? Should I instead check occasionally that Junk contains
only spam, and then run with "-t" once?


I *hope* I've managed to explain myself clearly here. If not, let me know
and I'll try to explain it differently :)

Cheers
Richard


