[Spambayes-checkins] spambayes/Outlook2000 train.py,1.16,1.17

Mark Hammond mhammond@skippinet.com.au
Wed Nov 13 07:01:59 2002


> Log Message:
> train_message():  When rescoring was asked for, it had no visible
> effect, since the probabilities didn't get updated after training.
> So update the probs before rescoring.

I'm a little confused about these probabilities.

Isn't it true that whenever we do a "train operation", we should also update
the probabilities?  For a batch train, we only want to do it at the end, but
for an individual, incremental train, I would have thought we still want the
probabilities updated, even if we don't rescore the message.  Otherwise
future messages will not use the new probabilities.

I ask because revision 1.14 did exactly this, and we regressed it.  That
revision was:

diff -r1.13 -r1.14
21c21
< def train_message(msg, is_spam, mgr, update_probs = True):
---
> def train_message(msg, is_spam, mgr):
43,45d42
<     if update_probs:
<         mgr.bayes.update_probabilities()
<
56c53
<             if train_message(message, isspam, mgr, False):
---
>             if train_message(message, isspam, mgr):

And it seems to me that a new param, specifically for update_probs, is less
of a hack than tying it to the "rescore" param - we want the new probs used
for the *next* incoming message even if we don't need them for *this* message.
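
To make the distinction concrete, here is a minimal sketch of the shape being argued for. It is not the actual train.py code: FakeBayes, FakeManager, learn() and train_batch() are hypothetical stand-ins invented for illustration, and rescoring is left out entirely. Only train_message(), its update_probs parameter and mgr.bayes.update_probabilities() come from the diff above.

```python
class FakeBayes:
    """Hypothetical stand-in for mgr.bayes; it only counts calls."""

    def __init__(self):
        self.learned = 0
        self.prob_updates = 0

    def learn(self, tokens, is_spam):
        # Assumed training hook; the real classifier API may differ.
        self.learned += 1

    def update_probabilities(self):
        # Name taken from the diff; here it just records that it ran.
        self.prob_updates += 1


class FakeManager:
    """Hypothetical stand-in for the 'mgr' object passed around in train.py."""

    def __init__(self):
        self.bayes = FakeBayes()


def train_message(msg, is_spam, mgr, update_probs=True):
    """Train on one message; refresh the probabilities unless told not to."""
    mgr.bayes.learn(msg.split(), is_spam)
    if update_probs:
        # Incremental train: the *next* incoming message sees the new probs.
        mgr.bayes.update_probabilities()
    return True


def train_batch(messages, is_spam, mgr):
    """Train on many messages, refreshing the probabilities once at the end."""
    count = 0
    for message in messages:
        if train_message(message, is_spam, mgr, update_probs=False):
            count += 1
    mgr.bayes.update_probabilities()
    return count
```

With this shape, a single incremental train updates the probabilities by default, a batch train updates them exactly once, and neither decision depends on whether the message is rescored.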

Mark.