[Spambayes] Question about training via the web interface

Tony Meyer tameyer at ihug.co.nz
Mon Apr 12 23:20:28 EDT 2004


> I RTFMed and could not find an explanation for this.

You need to RTFW: <http://entrian.com/sbwiki>, <wink>.

> 3. SB was 100% correct in its analysis, and I do not
> need to click on any message to change its category.
>
> My question is: in this case, does it make any sense
> to click on the 'train' button? As I understand it,
> for those messages, SB does not need any further training.
>
> Am I right or am I right (:-) ?

Maybe.  There hasn't really been enough testing on different training
methods to be able to make a conclusive statement.  However, IMO (based on
testing and reading mail here):

  1.  Training on everything is not a good idea.
  2.  The three best training methods (so far) are "mistake-based training"
(train on false positives, false negatives, and unsures), "non-edge training"
(train on everything that scores inside certain edges, say between 0.05 and
0.95), and "train to exhaustion" (complex; there's a Gary Robinson blog entry
about it).  From what I've seen, I'd say that "train to exhaustion" gives the
best results but is very slow, and the other two are about even, with
non-edge tending to just win for me personally.
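
To make the difference between the first two methods concrete, here is a
rough sketch of how each one decides whether a message gets trained.  It is
not SpamBayes code: the function names are made up for illustration, the
ham/spam cutoffs (0.20 and 0.90) are assumed values, and the non-edge rule is
a literal reading of the description above (score strictly between the two
edges).

```python
# Illustrative sketch only; none of these names come from the SpamBayes API.

HAM_CUTOFF = 0.20   # assumed: scores below this are classified as ham
SPAM_CUTOFF = 0.90  # assumed: scores at or above this are classified as spam


def classify(score):
    """Map a score in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def train_mistake_based(score, actual):
    """Train only on false positives, false negatives, and unsures."""
    return classify(score) != actual


def train_non_edge(score, low=0.05, high=0.95):
    """Train on anything that scores strictly inside the two edges."""
    return low < score < high


# (score, what the message really was)
messages = [
    (0.01, "ham"),   # confidently correct ham
    (0.50, "ham"),   # unsure
    (0.99, "spam"),  # confidently correct spam
    (0.10, "spam"),  # false negative
    (0.93, "spam"),  # correct, but not extreme
]

mistake_based = [m for m in messages if train_mistake_based(*m)]
non_edge = [m for m in messages if train_non_edge(m[0])]
```

With these sample scores, mistake-based training picks the unsure message and
the false negative, while non-edge training additionally picks the spam that
scored 0.93, since it was classified correctly but not near an edge.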

With the web interface, you don't really (yet) have a convenient way to do
"train to exhaustion", which leaves the other alternatives.  The defaults
really point towards "train on everything" (which may change) - but in the
Advanced Configuration you can set the buttons to default to other
categories ('discard', for example), to lean towards "mistake based
training".  IIRC, there was something added in 1.0a9 to help with nonedge
training, but I can't recall what it was - I'm sure it's there somewhere.

Hope this helps!

=Tony Meyer



