[Spambayes] Spambayes Exceptions

Brent Easton b.easton at exemail.com.au
Fri Nov 26 01:44:09 CET 2004


Hi Tony,

Thanks a lot for that. I ended up peeking at the source and finding the assert statement. I don't know Python, but I guessed that it pointed to a corrupted db. I have decided to retrain.

Thanks again,
Brent.


*********** REPLY SEPARATOR  ***********

On 26/11/2004 at 1:20 PM Tony Meyer  wrote:

>> Have recently upgraded to v1.0 spambayes. Using Calypso mail 
>> client with spambayes proxy server. I now get the following 
>> header added to every email:
>> 
>> X-Spambayes-Exception: Traceback (most recent call last): . 
>> File "sb_server.pyc", line 475, in onRetr . File 
>> "spambayes\classifier.pyc", line 190, in chi2_spamprob . File 
>> "spambayes\classifier.pyc", line 493, in _getclues . File 
>> "spambayes\classifier.pyc", line 508, in _worddistanceget . 
>> File "spambayes\classifier.pyc", line 308, in probability 
>> .AssertionError
>
>It's very rare to see this error these days.  What it means is that your
>database has an entry that has been seen in more ham than the total number
>of ham messages you have trained (obviously impossible).  The only cause I
>can think of for this is that training was interrupted at just the right
>(or wrong, depending on your point of view) moment.
>
>There are two options to fix this:
>
>  1.  The best option, by far, is to retrain from scratch.  Delete the two
>database files and start training again - training is very quick, so it'll
>hardly take any time to get back to high accuracy.  The FAQ explains how to
>go about doing this.
>
>  2.  You can manually repair this error in the database.  You'd have to
>install Python (if you haven't already) and use the source version.  You
>use the sb_dbexpimp.py script to convert the database to a CSV file, which
>you can then open (e.g. in Excel) and change the numbers at the top to be
>at least as large as the largest values in each of the ham & spam columns.
>You then use the sb_dbexpimp.py script to convert the fixed database back
>to the original format.  We don't recommend this, as the problem might
>indicate larger issues with the database, so it's much more reliable to
>just retrain.
>
>Sorry the news isn't better!
>
>=Tony.Meyer
>
>-- 
>Please always include the list (spambayes at python.org) in your replies
>(reply-all), and please don't send me personal mail about SpamBayes.
>http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.
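
The consistency rule described above (no word may have been seen in more ham, or more spam, than the total number of messages trained) and the manual repair in option 2 can be illustrated with a small Python sketch. This is only an illustration, not SpamBayes code: the function names are hypothetical, and the CSV layout (a first row holding the ham and spam totals, then one "word,hamcount,spamcount" row per token) is an assumption based on the description of "the numbers at the top" and the ham and spam columns. Check a real export before relying on it.

```python
import csv
import io


def load_export(text):
    """Parse an exported database (ASSUMED layout, see above).

    Row 1: total ham trained, total spam trained.
    Other rows: word, ham count, spam count.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    nham, nspam = int(rows[0][0]), int(rows[0][1])
    counts = {row[0]: (int(row[1]), int(row[2])) for row in rows[1:]}
    return nham, nspam, counts


def find_violations(nham, nspam, counts):
    """Return the words whose counts exceed the trained totals."""
    return [
        word
        for word, (ham, spam) in counts.items()
        if ham > nham or spam > nspam
    ]


def repaired_totals(nham, nspam, counts):
    """Raise each total to at least the largest count in its column."""
    max_ham = max((ham for ham, _ in counts.values()), default=0)
    max_spam = max((spam for _, spam in counts.values()), default=0)
    return max(nham, max_ham), max(nspam, max_spam)


# Made-up sample data: "meeting" was seen in 11 ham, but only 10 ham
# messages were (apparently) trained, which is the impossible state.
SAMPLE = "10,12\nviagra,0,11\nmeeting,11,1\nhello,4,3\n"

nham, nspam, counts = load_export(SAMPLE)
bad_words = find_violations(nham, nspam, counts)
fixed_totals = repaired_totals(nham, nspam, counts)

print(bad_words)      # ['meeting']
print(fixed_totals)   # (11, 12)
```

Raising the totals only removes the symptom; as noted above, the inconsistency may indicate larger problems with the database, so retraining remains the more reliable fix.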


____________________________________________________________
Brent Easton                       
Analyst/Programmer                               
University of Western Sydney                                   
Email: b.easton at uws.edu.au


