[Spambayes] Problem with SpamBayes 1.0a9 & procmail
Mike Causer
mikec at mikecauser.org
Mon Feb 16 14:08:09 EST 2004
I have a problem using SpamBayes 1.0a9 (Python 2.3 / Mandrake Linux 9.2).
The installation is a fresh one, with a hammiedb created today from 407
spams and 494 hams. .spambayesrc looks like this:
[Storage]
persistent_use_database=True
persistent_storage_file=~/.hammiedb
[Headers]
include_evidence=True
include_score=True
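As a quick sanity check that the file is at least well-formed INI, the same text can be run through Python's standard-library parser. This is only a sketch: it uses `configparser` from a current Python rather than whatever SpamBayes itself does in Options.py, and `RC_TEXT` is just the file contents pasted in as a string.

```python
import configparser

# The .spambayesrc contents quoted above, pasted in verbatim.
RC_TEXT = """\
[Storage]
persistent_use_database=True
persistent_storage_file=~/.hammiedb
[Headers]
include_evidence=True
include_score=True
"""

parser = configparser.ConfigParser()
parser.read_string(RC_TEXT)

# Pull each option back out with the type it is meant to have.
use_db = parser.getboolean("Storage", "persistent_use_database")
db_path = parser.get("Storage", "persistent_storage_file")
header_flags = {
    name: parser.getboolean("Headers", name)
    for name in parser.options("Headers")
}

print(use_db, db_path, header_flags)
```

A clean parse only shows that the syntax is fine; it says nothing about whether the option names are the ones SpamBayes expects.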
Although the original problem showed up when running through procmail,
filtering by command line yields the same results, so that's what I'll
quote.
Running sb_filter on an mbox file gets:
[mikec at lugh mikec]$ /usr/bin/sb_filter.py < /var/spool/mail/mikec
Traceback (most recent call last):
  File "/usr/bin/sb_filter.py", line 239, in ?
    main()
  File "/usr/bin/sb_filter.py", line 231, in main
    action(msg)
  File "/usr/bin/sb_filter.py", line 163, in filter
    return h.filter(msg)
  File "/usr/lib/python2.3/site-packages/spambayes/hammie.py", line 109, in filter
    prob, clues = self._scoremsg(msg, True)
  File "/usr/lib/python2.3/site-packages/spambayes/hammie.py", line 38, in _scoremsg
    return self.bayes.spamprob(tokenize(msg), evidence)
  File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 190, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 493, in _getclues
    tup = self._worddistanceget(word)
  File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 508, in _worddistanceget
    prob = self.probability(record)
  File "/usr/lib/python2.3/site-packages/spambayes/classifier.py", line 308, in probability
    assert hamcount <= nham
AssertionError
[mikec at lugh mikec]$
The AssertionError occurs on the last message in the input (found after
inserting a few print statements in classifier.py).
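As far as I can tell from the traceback, the failing line asserts that a token's ham count never exceeds the total number of hams trained. The sketch below shows the kind of consistency check that implies, run over toy data. The function name, the `wordinfo` dict layout and the token counts are all made up for illustration and are not the SpamBayes storage API; only the totals (494 hams, 407 spams) come from my training run.

```python
def find_inconsistent_tokens(nham, nspam, wordinfo):
    """Return tokens whose counts exceed the trained message totals.

    wordinfo maps token -> (spamcount, hamcount). This layout is an
    assumption for the sketch, not how SpamBayes stores its records.
    """
    bad = []
    for token, (spamcount, hamcount) in sorted(wordinfo.items()):
        # The same condition the assertion guards, applied to both classes.
        if hamcount > nham or spamcount > nspam:
            bad.append((token, spamcount, hamcount))
    return bad


# Toy data: 'viagra' is consistent with the totals, while 'meeting'
# claims to appear in more hams than were ever trained.
toy_wordinfo = {
    "viagra": (120, 1),
    "meeting": (3, 500),
}

violations = find_inconsistent_tokens(nham=494, nspam=407, wordinfo=toy_wordinfo)
for token, spamcount, hamcount in violations:
    print("%s: spamcount=%d hamcount=%d" % (token, spamcount, hamcount))
```

If a check like this turned up real tokens in the database, that would point at the stored totals being out of step with the per-token counts rather than at the message being filtered.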
The result is the same for any input file, whether a single ham, a single
spam or a mixture of both. Only zero-length input passes:
[mikec at lugh mikec]$ /usr/bin/sb_filter.py
^D
X-Spambayes-Classification: unsure; 0.49
X-Spambayes-Evidence: '*H*': 0.45; '*S*': 0.42; 'reply-to:none': 0.29;
'content-type:text/plain': 0.37; 'sender:none': 0.79
[mikec at lugh mikec]$
I suspect that the .spambayesrc might be incomplete, but nothing leaps
off the screen while reading Options.py.
This would be a good opportunity to get myself back up to speed on
Python after a gap of a few years perhaps, but it would be nice to get
rid of the spam first ;-)
Mike
--
Mike Causer Email - mailto:mikec at mikecauser.org
GPG KeyID 1C2DDA07 WWW - http://www.mikecauser.org
Flood the fen again! - Wicken Fen enlargement - http://www.wicken.org.uk