[Spambayes] Re: unsupervised training

Toby Dickenson tdickenson@devmail.geminidataloggers.co.uk
Fri Nov 8 10:52:32 2002


On Friday 08 November 2002 7:20 am, Tim Peters wrote:

> Provided the user has already done a decent amount of training, then as
> Paul Moore suggested it could even work to trust ham-vs-spam decisions
> immediately, and let user corrections undo those as needed.  A well-trained
> system should be pretty robust against a few misclassifications over the
> short term.

For the last two weeks I have been using a setup that uses this type of 
unsupervised training.

I have a procmail filter that sends a copy of all incoming ham and spam to two 
separate mailboxes. These mailboxes are used for overnight batch training, 
then deleted. Messages marked 'Unsure' do not take part in this automatic 
training.
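
As a rough illustration of the routing described above, here is a minimal
Python sketch. It is not the procmail recipe itself, which is not shown in
this message. The header name X-Spambayes-Classification, the label strings,
and both function names are assumptions made for the example. Messages
labelled ham or spam are collected for the overnight batch, and anything
else (including 'unsure') is left out of automatic training.

```python
import email

# Assumed header and labels; the actual filter configuration is not given.
CLASSIFICATION_HEADER = "X-Spambayes-Classification"


def training_target(msg):
    """Return 'ham' or 'spam', or None if the message should not be auto-trained."""
    raw = msg.get(CLASSIFICATION_HEADER) or ""
    # Tolerate a trailing score such as "ham; 0.01" by keeping only the label.
    label = raw.split(";")[0].strip().lower()
    return label if label in ("ham", "spam") else None


def sort_for_training(messages):
    """Group messages into ham and spam training batches, skipping the rest."""
    batches = {"ham": [], "spam": []}
    for msg in messages:
        target = training_target(msg)
        if target is not None:
            batches[target].append(msg)
    return batches


if __name__ == "__main__":
    sample = [
        email.message_from_string(f"{CLASSIFICATION_HEADER}: {label}\n\nbody")
        for label in ("ham", "spam", "unsure")
    ]
    batches = sort_for_training(sample)
    print(len(batches["ham"]), len(batches["spam"]))
```

Treating 'unsure' the same as 'ham' for delivery would not change this
function; only the delivery rule in the mail client would differ.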

I perform separate filtering for spam and 'unsure' in my MUA. So far I am 
manually inspecting the unsure folder, and manually adding its messages to the 
appropriate training mailboxes. Initially about 3% of mails were 'unsure', 
but this has dropped to less than 1% after 2 weeks.

Starting next week I plan to change the MUA filtering to treat 'unsure' the 
same as 'ham', and stop all manual training. It will be interesting to see if 
the training remains stable.
