[spambayes-bugs] [ spambayes-Bugs-852137 ] sb_imapfilter.py AssertionError: hamcount <= nham

SourceForge.net noreply at sourceforge.net
Tue Dec 2 20:57:23 EST 2003


Bugs item #852137, was opened at 2003-12-02 04:39
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=852137&group_id=61702

Category: imapfilter
Group: Source code - CVS
>Status: Open
>Resolution: None
Priority: 5
Submitted By: Tony Lownds (tonylownds)
Assigned to: Tony Meyer (anadelonbrin)
Summary: sb_imapfilter.py AssertionError: hamcount <= nham

Initial Comment:
When I classify through sb_imapfilter.py, I am getting an 
AssertionError. Any ideas? I am using spambayes from CVS; 
courier IMAP; python 2.2.2; and a just-deleted database. See 
below for commands, and further below for database dumps.

[tony ~]$ rm hammie.db spambayes.messageinfo.db
[tony ~]$ /usr/bin/sb_imapfilter.py -t
SpamBayes IMAP Filter Beta1, version 0.1 (September 2003),
using SpamBayes IMAP Filter Web Interface Alpha2, version 0.02
and engine SpamBayes Beta2, version 0.2 (July 2003).

Loading state from hammie.db database
hammie.db is a new database
Loading database hammie.db... Done.
Training
   Training ham folder INBOX.Ham
**************       14 trained.
   Training spam folder INBOX.Spam
********************************************       44 trained.
Persisting hammie.db state in database
Training took 2.87554502487 seconds, 58 messages were trained
[tony ~]$ /usr/bin/sb_imapfilter.py -c
SpamBayes IMAP Filter Beta1, version 0.1 (September 2003),
using SpamBayes IMAP Filter Web Interface Alpha2, version 0.02
and engine SpamBayes Beta2, version 0.2 (July 2003).

Loading state from hammie.db database
hammie.db is an existing database, with 44 spam and 10 ham
Loading database hammie.db... Done.
Classifying *.Traceback (most recent call last):
  File "/usr/bin/sb_imapfilter.py", line 821, in ?
    run()
  File "/usr/bin/sb_imapfilter.py", line 811, in run
    imap_filter.Filter()
  File "/usr/bin/sb_imapfilter.py", line 676, in Filter
    self.unsure_folder)
  File "/usr/bin/sb_imapfilter.py", line 595, in Filter
    evidence=True)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 158, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 395, in _getclues
    prob = self.probability(record)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 242, in probability
    assert hamcount <= nham
AssertionError



----------------------------------------------------------------------

>Comment By: Tony Meyer (anadelonbrin)
Date: 2003-12-03 14:57

Message:
Logged In: YES 
user_id=552329

Sorry, I missed that (was that in the mailing list post, too?  I 
must have missed it twice).

So something is wrong: it reports training on 58 messages 
(14 ham and 44 spam), but the db records only 54 (44 spam and 
10 ham).  If you use the web interface and look at the "stats" 
page, how many messages does it report there?  (That goes off 
the messageinfo db rather than hammie.db).

If you use the move_trained_[sp|h]am_to_folder options, do 
all the messages get moved?

I'm not sure whether the problem here is that it's not actually 
training all those messages, or that something is going wrong 
saving the increased count to the hammie.db.
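
To make the second possibility concrete, here is a rough sketch of how a 
stored ham total that lags behind the per-token counts would trip a check 
like the one in the traceback. This is a toy model, not the code in 
spambayes/classifier.py: the class TinyClassifier and its methods learn 
and check are hypothetical names, and the tokens are made up. Only the 
numbers (14 ham, 44 spam, a stored total of 10 ham) come from the report.

```python
class TinyClassifier:
    """Toy model: per-token counts plus global message totals."""

    def __init__(self):
        self.nham = 0        # total ham messages trained
        self.nspam = 0       # total spam messages trained
        self.wordinfo = {}   # token -> [spamcount, hamcount]

    def learn(self, tokens, is_spam):
        # Bump the global total, then the count of every distinct token.
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        for tok in set(tokens):
            rec = self.wordinfo.setdefault(tok, [0, 0])
            rec[0 if is_spam else 1] += 1

    def check(self, token):
        # A token cannot have appeared in more ham than was ever trained.
        # Raised explicitly so the check also runs under "python -O".
        spamcount, hamcount = self.wordinfo.get(token, (0, 0))
        if hamcount > self.nham:
            raise AssertionError("hamcount <= nham")
        if spamcount > self.nspam:
            raise AssertionError("spamcount <= nspam")
        return spamcount, hamcount


def demo():
    c = TinyClassifier()
    for _ in range(14):
        c.learn(["hello"], is_spam=False)
    for _ in range(44):
        c.learn(["offer"], is_spam=True)
    c.check("hello")   # consistent state: no error
    c.nham = 10        # simulate a ham total that was not fully saved
    try:
        c.check("hello")
    except AssertionError:
        return True    # the inconsistency was detected
    return False
```

In this model the token counts reflect all 14 ham messages while the 
total says 10, which is the kind of mismatch the assertion guards 
against. Whether that is what actually happens in sb_imapfilter.py is 
exactly the open question above.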

----------------------------------------------------------------------

Comment By: Tony Lownds (tonylownds)
Date: 2003-12-03 08:24

Message:
Logged In: YES 
user_id=24100

I think this should be re-opened. Please look at the command 
below, included in the original report, carefully:

[tony ~]$ rm hammie.db spambayes.messageinfo.db

That command removes hammie.db as well.


----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2003-12-02 12:34

Message:
Logged In: YES 
user_id=552329

As per Tim Stone's message on spambayes at python.org:

Removing the messageinfo db and not the stats db is the 
*cause* of this problem.  imapfilter relies on the messageinfo 
db to tell it which messages it should train on and which it 
has already processed.  By deleting that, but not your stats 
(hammie) db, you're in for all sorts of trouble.  You need to 
delete both if you want to start afresh.
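
A minimal sketch of "delete both", for anyone scripting a reset. The file 
names are the ones used in the report; the function name 
reset_databases and its directory argument are assumptions made for 
this example, and it assumes each database is stored as a single file.

```python
import os

# File names taken from the report; everything else here is illustrative.
DB_FILES = ("hammie.db", "spambayes.messageinfo.db")


def reset_databases(directory="."):
    """Remove both databases together; return the names actually removed."""
    removed = []
    for name in DB_FILES:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            os.remove(path)
            removed.append(name)
    return removed
```

Calling reset_databases() in the directory that holds the databases has 
the same effect as the "rm hammie.db spambayes.messageinfo.db" command 
in the original report.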

----------------------------------------------------------------------
