[Spambayes] I seem to have stumbled upon a (persistent) spamcount > nspam bug

Mon Sep 11 15:46:52 CEST 2006

ls,

   I am getting the following messages:
-------------------------------8<---------------------
fetchmail: reading message <mailserver>:19 of 19 (4326 octets)  
....Traceback (most recent call last):
   File "/usr/bin/sb_filter.py", line 257, in ?
     main()
   File "/usr/bin/sb_filter.py", line 248, in main
     action(msg)
   File "/usr/bin/sb_filter.py", line 180, in filter
     return self.h.filter(msg)
   File "/usr/lib/.../spambayes/hammie.py", line 109, in filter
     prob, clues = self._scoremsg(msg, True)
   File "/usr/lib/.../spambayes/hammie.py", line 38, in _scoremsg
     return self.bayes.spamprob(tokenize(msg), evidence)
   File "/usr/lib/.../spambayes/classifier.py", line 190, in chi2_spamprob
     clues = self._getclues(wordstream)
   File "/usr/lib/.../spambayes/classifier.py", line 493, in _getclues
     tup = self._worddistanceget(word)
   File "/usr/lib/.../spambayes/classifier.py", line 508, in _worddistanceget
     prob = self.probability(record)
   File "/usr/lib/.../spambayes/classifier.py", line 311, in probability
     assert spamcount <= nspam
AssertionError
-------------------------------8<---------------------
NB ... = python2.4/site-packages

From browsing (and googling) the internet, the above error seems
to point to a corrupt DB file. However, even after recreating this DB
file from scratch *and* after updating to spambayes-1.1a1 and
recreating the DB file again, the problem persists. Spambayes is
running on Fedora Core 5 (Intel Pentium). I used the following
command to recreate the DB file:
-------------------------------8<---------------------
$ # Empty hammie DB
$ echo -n > /usr/local/share/spambayes/hammie.db
$ # Retrain
$ sb_mboxtrain.py -f -d /usr/local/share/spambayes/hammie.db -g \
  $HOME/EMail/Inbox/inbox -g $HOME/EMail/Inbox/NeoNixie -g \
  $HOME/EMail/Inbox/SpareTimeGizmo -g $HOME/EMail/Inbox/Evolution -s \
  $HOME/EMail/spam/spam
-------------------------------8<---------------------
NB I've specified the -f option to force a retrain of already trained
    messages.
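
As far as I can tell, the failing assertion checks that no single token
has been counted in more spam messages than the total number of spam
messages trained. Below is a minimal sketch of that invariant. It is
not SpamBayes code: the function name, the dict layout and the sample
tokens are hypothetical, and only the names spamcount and nspam come
from the traceback (hamcount and nham are assumed by analogy).

```python
# Hypothetical illustration only; this does not use the SpamBayes API.
# word_counts maps token -> (spamcount, hamcount); nspam and nham are the
# total numbers of spam and ham messages the database claims to have seen.

def find_inconsistent_tokens(word_counts, nspam, nham):
    """Return the tokens whose per-token counts exceed the message totals."""
    bad = []
    for token, (spamcount, hamcount) in sorted(word_counts.items()):
        # The traceback's assertion corresponds to the first condition;
        # the ham-side check is an assumed counterpart.
        if spamcount > nspam or hamcount > nham:
            bad.append(token)
    return bad


# Made-up counts: "offer" was seen in 3 spam messages, yet nspam is only 2.
example = {"offer": (3, 0), "meeting": (0, 2)}
print(find_inconsistent_tokens(example, nspam=2, nham=5))
```

If a check like this flagged tokens in a freshly trained database, that
would suggest the totals and the per-token counts went out of step
during training rather than through later corruption.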

Can any of you shed some light on this issue?

MTIA, cu l8r, Edgar.
-- 
        \|||/
        (o o)                                           Just curious...
----ooO-(_)-Ooo-------------------------------------------------------