[Spambayes] Outlook plugin - training

Paul Moore lists@morpheus.demon.co.uk
Wed Nov 6 23:31:35 2002


"Mark Hammond" <mhammond@skippinet.com.au> writes:

> ie, it is indeed "mistake based training", but that may still prove useful
> in addition to ongoing training.

From a newcomer's point of view, I think a key point is that "mistake
based training" is easy to understand.

I also believe that "confirmation based training" (my "clever boy!" 
button for specifically affirming that the classifier's magic gave the
right answer) is easy to understand. More than that, a new user
*expects* to need to do something like this, as the initial impression
is one of amazement at the accuracy of the classifier. But such a
gadget will fall into disuse as the user starts to expect the
classifier to be right - so it probably doesn't have enough long-term
value to be worth providing.

Batch training (keeping ham and spam, and pumping it into the
classifier in a regular training run) feels highly unnatural. My
instinct is to *delete* spam - keeping it feels wrong.
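
To make the three approaches concrete, here is a rough sketch of how
they differ. It is only an illustration: ToyClassifier and the three
helper functions are made-up names, not the Spambayes or Outlook
plugin API, and the scoring is a deliberately crude smoothed token
count rather than the real combining scheme.

```python
import math
from collections import Counter


class ToyClassifier:
    """Hypothetical stand-in for a spam classifier (not the Spambayes API)."""

    def __init__(self):
        # Per-class token counts and number of messages trained per class.
        self.counts = {True: Counter(), False: Counter()}
        self.trained = {True: 0, False: 0}

    def train(self, text, is_spam):
        # Count each distinct token once per message.
        self.counts[is_spam].update(set(text.lower().split()))
        self.trained[is_spam] += 1

    def is_spam(self, text):
        score = 0.0
        for tok in set(text.lower().split()):
            # Add-one smoothed fraction of spam/ham messages containing tok.
            p_spam = (self.counts[True][tok] + 1) / (self.trained[True] + 2)
            p_ham = (self.counts[False][tok] + 1) / (self.trained[False] + 2)
            score += math.log(p_spam / p_ham)
        return score > 0


def mistake_based(clf, text, actual_is_spam):
    """Train only when the user corrects a wrong verdict."""
    if clf.is_spam(text) != actual_is_spam:
        clf.train(text, actual_is_spam)
        return True
    return False


def confirmation_based(clf, text, actual_is_spam):
    """Train only when the user affirms a right verdict ("clever boy!")."""
    if clf.is_spam(text) == actual_is_spam:
        clf.train(text, actual_is_spam)
        return True
    return False


def batch_train(clf, ham, spam):
    """Train on everything the user has kept, ham and spam alike."""
    for text in ham:
        clf.train(text, False)
    for text in spam:
        clf.train(text, True)
```

The point of the sketch is that the first two only ask the user to
react to a verdict they are already looking at, whereas batch_train
needs a kept pile of both ham and spam to exist in the first place.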

> I can't help thinking that we are somehow underestimating our own tool here.

Coming at it from cold, I can confirm that the effect feels like pure
magic. I trained on what I thought was a uselessly small corpus (I had
*no* historical spam, so I retrieved the day's batch from the wastebin
and used that). The results have been so good that I can already, 2
days later, feel myself tending to "trust" the classifier, and
forgetting about training issues.

But unlike Mark, my instinct is that this is not such a good thing
(solely from a training point of view). If people get such good
results on inadequate training, they won't work at it enough. So the
need is to make good training so easy and automatic that it offsets
the tendency to forget to bother.

It's too late to think this through right now. I'll ponder some more
in the morning...

Paul.

-- 
This signature intentionally left blank


