[Spambayes] Problem with SpamBayes 1.0a9 & procmail

Mike Causer mikec at mikecauser.org
Mon Feb 16 14:08:09 EST 2004


I have a problem using SpamBayes 1.0a9 (Python 2.3, Mandrake Linux 9.2).

The installation is a fresh one, with a hammiedb created today from 407
spams and 494 hams.  My .spambayesrc looks like this:
   
   [Storage]
   persistent_use_database=True
   persistent_storage_file=~/.hammiedb
   [Headers]
   include_evidence=True
   include_score=True
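
As a quick sanity check on the file above, here is a minimal sketch that
parses the same text and prints what each option resolves to. It assumes
modern Python 3 and uses the standard-library configparser purely as a
stand-in; SpamBayes has its own option handling, which may treat the
file differently. The helper name summarize_rc is hypothetical.

```python
import configparser
import os

# The .spambayesrc contents quoted above, without the quoting indent.
RC_TEXT = """\
[Storage]
persistent_use_database=True
persistent_storage_file=~/.hammiedb
[Headers]
include_evidence=True
include_score=True
"""


def summarize_rc(text):
    """Parse INI-style text and return the four options as Python values.

    Hypothetical helper: configparser is an assumption here, not the
    parser SpamBayes itself uses.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "use_database": parser.getboolean("Storage", "persistent_use_database"),
        # Expand "~" so the real path of the database file is visible.
        "storage_file": os.path.expanduser(
            parser.get("Storage", "persistent_storage_file")
        ),
        "include_evidence": parser.getboolean("Headers", "include_evidence"),
        "include_score": parser.getboolean("Headers", "include_score"),
    }


summary = summarize_rc(RC_TEXT)
for key, value in summary.items():
    print(key, "=", value)
```

If the expanded storage path is not the file that was trained today, the
filter and the trainer may be looking at different databases.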

Although the original problem showed up when running through procmail,
filtering from the command line yields the same results, so that's what
I'll quote.

Running sb_filter.py on an mbox file gives:
   [mikec@lugh mikec]$ /usr/bin/sb_filter.py < /var/spool/mail/mikec
   Traceback (most recent call last):
   File "/usr/bin/sb_filter.py", line 239, in ?
      main()
   File "/usr/bin/sb_filter.py", line 231, in main
      action(msg)
   File "/usr/bin/sb_filter.py", line 163, in filter
      return h.filter(msg)
   File "/usr/lib/python2.3/site-packages/spambayes/hammie.py", line 109, in filter
      prob, clues = self._scoremsg(msg, True)
   File "/usr/lib/python2.3/site-packages/spambayes/hammie.py", line 38, in _scoremsg
      return self.bayes.spamprob(tokenize(msg), evidence)
   File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 190, in chi2_spamprob
      clues = self._getclues(wordstream)
   File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 493, in _getclues
      tup = self._worddistanceget(word)
   File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 508, in _worddistanceget
      prob = self.probability(record)
   File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 308, in probability
      assert hamcount <= nham
   AssertionError
   [mikec@lugh mikec]$

The AssertionError occurs on the last message in the input (found after 
inserting a few print statements in classifier.py).
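
For reference, the failing line asserts that a token's ham count does
not exceed the total number of hams trained. The sketch below (modern
Python 3, not SpamBayes code) spells out that invariant and a matching
one for spam. The function name count_violations and the per-token
counts are made up for illustration; only the totals of 494 hams and
407 spams come from the training run described above.

```python
def count_violations(nham, nspam, hamcount, spamcount):
    """Describe any way a token's counts exceed the trained totals.

    nham / nspam: number of ham / spam messages trained.
    hamcount / spamcount: number of trained ham / spam messages that
    contained the token. A token cannot appear in more messages than
    were trained, so each count should be <= its total.
    """
    problems = []
    if hamcount > nham:
        problems.append("hamcount %d > nham %d" % (hamcount, nham))
    if spamcount > nspam:
        problems.append("spamcount %d > nspam %d" % (spamcount, nspam))
    return problems


# Hypothetical token counts checked against the real training totals.
print(count_violations(494, 407, hamcount=120, spamcount=3))
print(count_violations(494, 407, hamcount=495, spamcount=3))
```

A non-empty result for any token would suggest that the stored message
totals and the per-token counts have gone out of step, for example if
they were read from different or partially written databases.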

The result is the same for any input file, whether a single ham, a
single spam or a mixture of both.  Only zero-length input passes OK:

   [mikec@lugh mikec]$ /usr/bin/sb_filter.py
   ^D
   X-Spambayes-Classification: unsure; 0.49
   X-Spambayes-Evidence: '*H*': 0.45; '*S*': 0.42; 'reply-to:none': 0.29;
        'content-type:text/plain': 0.37; 'sender:none': 0.79

   [mikec@lugh mikec]$


I do have a suspicion that the .spambayesrc might not be complete, but
nothing leaps off the screen while reading Options.py.


This would perhaps be a good opportunity to get myself back up to speed
on Python after a gap of a few years, but it would be nice to get rid of
the spam first  ;-)



Mike
-- 
Mike Causer                          Email - mailto:mikec at mikecauser.org
GPG KeyID 1C2DDA07                       WWW - http://www.mikecauser.org
Flood the fen again! - Wicken Fen enlargement - http://www.wicken.org.uk


