[Spambayes-checkins] spambayes/Outlook2000 train.py,1.16,1.17
Mark Hammond
mhammond@skippinet.com.au
Wed Nov 13 07:01:59 2002
> Log Message:
> train_message(): When rescoring was asked for, it had no visible
> effect, since the probabilities didn't get updated after training.
> So update the probs before rescoring.
I'm a little confused about these probabilities.
Isn't it true that whenever we do a "train operation", we should also update
the probabilities? For a batch train, we only want to do it at the end, but
for an individual, incremental train, I would have thought we still want the
probabilities updated, even if we don't rescore the message. Otherwise
future messages will not use the new probabilities.
I ask because revision 1.14 did exactly this, and we regressed it. That
revision was:
diff -r1.13 -r1.14
21c21
< def train_message(msg, is_spam, mgr, update_probs = True):
---
> def train_message(msg, is_spam, mgr):
43,45d42
<     if update_probs:
<         mgr.bayes.update_probabilities()
<
56c53
< if train_message(message, isspam, mgr, False):
---
> if train_message(message, isspam, mgr):
And it seems to me that a new param, specifically for update_probs, is less
of a hack than tying it to the "rescore" param - we want the new probs used
for the *next* incoming message even if we don't need them for *this* message.
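To make the suggestion concrete, here is a rough sketch of what I have in mind. Only the train_message() signature and the mgr.bayes.update_probabilities() call come from the diff above; ToyBayes, ToyManager, learn() and train_batch() are hypothetical stand-ins made up for illustration, and the probability formula is deliberately simplistic rather than what the real classifier computes.

```python
class ToyBayes:
    """Hypothetical stand-in for mgr.bayes; not the real classifier."""

    def __init__(self):
        self.spam_counts = {}
        self.ham_counts = {}
        self.probs = {}  # cached per-word probabilities, refreshed on demand

    def learn(self, words, is_spam):
        # Record the evidence only; the cached probabilities stay stale.
        counts = self.spam_counts if is_spam else self.ham_counts
        for word in set(words):
            counts[word] = counts.get(word, 0) + 1

    def update_probabilities(self):
        # Recompute the cache from the counts (simplified ratio, an assumption).
        for word in set(self.spam_counts) | set(self.ham_counts):
            spam = self.spam_counts.get(word, 0)
            ham = self.ham_counts.get(word, 0)
            self.probs[word] = spam / (spam + ham)


class ToyManager:
    """Hypothetical stand-in for the manager object passed around as mgr."""

    def __init__(self):
        self.bayes = ToyBayes()


def train_message(words, is_spam, mgr, update_probs=True):
    # Incremental train: refresh the probabilities by default so the
    # next incoming message is scored against the new training data.
    mgr.bayes.learn(words, is_spam)
    if update_probs:
        mgr.bayes.update_probabilities()
    return True


def train_batch(messages, mgr):
    # Batch train: skip the per-message refresh and do it once at the end.
    trained = 0
    for words, is_spam in messages:
        if train_message(words, is_spam, mgr, update_probs=False):
            trained += 1
    mgr.bayes.update_probabilities()
    return trained
```

With this shape, the decision to rescore the current message stays independent of whether the probabilities are refreshed: a single train always leaves them current, and only the batch path opts out.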
Mark.