[Spambayes] AssertionError: assert spamcount <= nspam ??

Tony Meyer tameyer at ihug.co.nz
Tue May 3 06:58:22 CEST 2005


> I was checking out the state of spam filtering on my servers 
> today and noticed in the logs a lot of the following errors:
> 
> Traceback (most recent call last):
[...]
>      assert spamcount <= nspam
> AssertionError
[...]

What this means is that one or more tokens are recorded as having been seen
in more spam messages than you have trained in total, which should be
impossible.  This error is pretty uncommon these days - the most likely cause
is that a write to the database was interrupted, leaving the counts
inconsistent without corrupting the database file itself.  Or, since you
mention upgrading, it may have been caused by an old bug that has since been
fixed (depending on how old the version you were using is).
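
To make the failed check concrete, here is a minimal sketch of the invariant
the assertion protects.  The function name and the dictionary layout are
hypothetical, chosen only for illustration; this is not how SpamBayes stores
its data.

```python
# Hypothetical in-memory picture of a token database: each token maps to
# (hamcount, spamcount).  Not SpamBayes's actual storage format.
def find_impossible_tokens(nham, nspam, token_counts):
    """Return tokens counted in more ham/spam messages than were trained."""
    bad = []
    for token, (hamcount, spamcount) in token_counts.items():
        # A token cannot appear in more messages than were ever trained.
        if hamcount > nham or spamcount > nspam:
            bad.append(token)
    return sorted(bad)


counts = {"cheap": (0, 12), "meeting": (7, 1)}
# "cheap" claims 12 spam messages, but only 10 spam were trained.
print(find_impossible_tokens(nham=8, nspam=10, token_counts=counts))
```

Any message containing a token that this check flags would trip the
assertion when it is scored.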

> I even tried to upgrade to the latest 1.0.4 version and it's still 
> happening.

Once it's happened, it will continue to happen (for any message that has the
bad tokens in it) until the database is fixed.  Hopefully upgrading will
prevent it happening again, though.

There are two ways to fix this problem:

  * Remove the existing database and retrain from scratch (recommended,
    since there might be other problems with the database, which this would
    fix).

  * Convert the database to CSV (with the sb_dbexpimp.py script), open it in
    a text editor or spreadsheet, and change the two numbers at the start of
    the file (the total ham and spam trained) so that each is greater than
    or equal to every number in the corresponding ham/spam column (that
    should make more sense once you're looking at the file).
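
If the file is too large to edit by hand, the second option could be
scripted along these lines.  This is only a sketch: it assumes the first row
of the exported CSV holds the ham and spam totals and that every later row
ends with that token's ham count and spam count, so check this against your
own export before relying on it.  The function name and file paths are
hypothetical.

```python
import csv


def raise_totals(in_path, out_path):
    """Copy the CSV, raising the first-row totals to cover every token count.

    Assumed layout (verify against your own file):
        row 0:      nham, nspam
        later rows: token, hamcount, spamcount
    """
    with open(in_path, newline="") as f:
        rows = list(csv.reader(f))

    nham, nspam = int(rows[0][0]), int(rows[0][1])
    for row in rows[1:]:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        # The counts are taken from the last two fields of the row.
        nham = max(nham, int(row[-2]))
        nspam = max(nspam, int(row[-1]))

    # Overwrite only the two totals; leave everything else untouched.
    rows[0][:2] = [str(nham), str(nspam)]
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return nham, nspam
```

Write to a new file rather than over the original, so the export is still
there if the layout assumption turns out to be wrong.  Note that this only
makes the counts consistent again; it does not recover the true totals,
which is one reason retraining is the recommended fix.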

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this. 


