[Spambayes] Re: unsupervised training
Toby Dickenson
tdickenson@devmail.geminidataloggers.co.uk
Fri Nov 8 10:52:32 2002
On Friday 08 November 2002 7:20 am, Tim Peters wrote:
> Provided the user has already done a decent amount of training, then as
> Paul Moore suggested it could even work to trust ham-vs-spam decisions
> immediately, and let user corrections undo those as needed. A well-trained
> system should be pretty robust against a few misclassifications over the
> short term.
For the last two weeks I have been using a setup that uses this type of
unsupervised training.
I have a procmail filter that sends a copy of all incoming ham and spam to two
separate mailboxes. These mailboxes are used for overnight batch training,
then deleted. Messages marked 'Unsure' do not take part in this automatic
training.
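A minimal sketch of the copy step, written in Python rather than as the procmail recipe itself. It assumes each message already carries a classification header; the header name `X-Spambayes-Classification`, its `ham`/`spam`/`unsure` values, and the helper names are assumptions for illustration, not details of the setup described here.

```python
import mailbox

# Assumed header name and values; adjust to whatever the classifier writes.
HEADER = "X-Spambayes-Classification"


def training_label(msg):
    """Return 'ham' or 'spam' for messages to train on, else None."""
    # Tolerate a trailing score such as "ham; 0.01" (assumed format).
    value = (msg.get(HEADER) or "").split(";")[0].strip().lower()
    return value if value in ("ham", "spam") else None


def copy_for_training(messages, ham_path, spam_path):
    """Append ham and spam to their training mboxes; skip everything else."""
    boxes = {"ham": mailbox.mbox(ham_path), "spam": mailbox.mbox(spam_path)}
    counts = {"ham": 0, "spam": 0, "skipped": 0}
    try:
        for msg in messages:
            label = training_label(msg)
            if label is None:
                # 'unsure' (or untagged) mail takes no part in auto-training.
                counts["skipped"] += 1
                continue
            boxes[label].add(msg)
            counts[label] += 1
    finally:
        for box in boxes.values():
            box.close()  # flushes pending changes to disk
    return counts
```

The overnight job would then train on the two mbox files and delete them afterwards.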
I perform separate filtering for spam and 'unsure' in my mua. So far I am
manually inspecting the unsure folder, and manually adding its messages to the
appropriate training mailboxes. Initially about 3% of mails were 'unsure',
but this has dropped to less than 1% after 2 weeks.
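One way to track that 'unsure' percentage is to count classification labels per period. This is only a sketch: the function name and the sample labels are made up for illustration and are not my actual figures.

```python
from collections import Counter


def unsure_rate(labels):
    """Fraction of labels equal to 'unsure'; 0.0 for an empty input."""
    counts = Counter(label.strip().lower() for label in labels)
    total = sum(counts.values())
    return counts["unsure"] / total if total else 0.0


# Hypothetical week of mail: 3 of 100 messages were marked 'unsure'.
sample = ["ham"] * 90 + ["spam"] * 7 + ["unsure"] * 3
print(f"{unsure_rate(sample):.1%} unsure")
```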
Starting next week I plan to change the mua filtering to treat 'unsure' the
same as 'ham', and stop all manual training. It will be interesting to see if
the training remains stable.