[spambayes-bugs] [ spambayes-Bugs-887453 ] Crash when training on messages

SourceForge.net noreply at sourceforge.net
Sun May 2 22:55:09 EDT 2004


Bugs item #887453, was opened at 2004-01-30 17:53
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=887453&group_id=61702

Category: None
Group: Source code 1.0a7
>Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Frank Solensky (fsolensky)
Assigned to: Nobody/Anonymous (nobody)
Summary: Crash when training on messages

Initial Comment:
I've been running into the following error for a day or
two now on different messages.  While I haven't
narrowed it down to a particular message yet, the
modules and line numbers appear to be the same each
time.  Stopping and restarting spambayes several times
seems to get the offending message out of the way
somehow.

Will try to narrow it down a bit more over the weekend
if it doesn't ring a familiar bell by then..



----------------------------------------------------------------------

>Comment By: Tony Meyer (anadelonbrin)
Date: 2004-05-03 14:55

Message:
Logged In: YES 
user_id=552329

No response, so closing.  Please reopen (with the requested
information) if this still occurs.

----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2004-02-17 15:57

Message:
Logged In: YES 
user_id=552329

Weird.  You can use db_dbexpimp.py to convert the db to 
text, but I doubt that will turn up anything of use (unless it 
fails, in which case the db must be screwed).  You can also 
use it to convert between various db formats - maybe 
whichever db this is using isn't good on your machine?  
Switching to a pickle would solve this, although then you're 
using a pickle...

Which sort of dbm is it using?  bsddb?  gdbm?  (whichdb.py in 
the contrib directory should answer that).  Maybe trying 
another one would help?  (The option to do this isn't exposed 
via the web interface, IIRC, so you'll have to edit the config 
file manually).
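
For the "which sort of dbm" question, a rough sketch using the standard library rather than the contrib script: modern Python exposes the same kind of check as dbm.whichdb(). The helper name identify_db is made up for this example, and the file name in the usage note is only a placeholder.

```python
import dbm


def identify_db(path):
    """Describe which dbm flavour the file at `path` appears to be."""
    kind = dbm.whichdb(path)
    if kind is None:
        # The file does not exist or could not be opened at all.
        return "missing or unreadable"
    if kind == "":
        # The file opened, but its format could not be guessed.
        return "unrecognised format"
    # Otherwise whichdb returns a module name such as "dbm.gnu".
    return kind
```

Calling identify_db("hammie.db") (substitute the real database path) should give a first hint as to whether the file is still recognisable before trying another format.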

----------------------------------------------------------------------

Comment By: Frank Solensky (fsolensky)
Date: 2004-01-31 13:18

Message:
Logged In: YES 
user_id=205922

Update: it doesn't appear to be limited to a particular
word.  I've been marking all messages as 'defer' and
admitting a few at a time to see exactly when the crashes
occur.  The last word displayed has been different each time
("dithered", "ux-50,",
"url:draft-ietf-avt-uncomp-video-05").  I'm not sure, but I
think that the message that triggers the error may disappear
once I've restarted the daemon as well..

If it's the db, are there any tools to look for errors?  Or
do I have to reinstall and retrain?
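
One way to look for errors, sketched here as an assumption rather than an existing SpamBayes tool, is to walk every key in the store and record which lookups raise. The function name scan_db is hypothetical; it accepts any mapping-like object, for example one returned by shelve.open().

```python
def scan_db(db):
    """Try to read every record; return (readable_count, failures).

    `db` is any mapping-like store with keys() and item access.
    Each failure is recorded as a (key, exception) pair.
    """
    readable = 0
    failures = []
    # Snapshot the keys first so a failing read cannot disturb iteration.
    for key in list(db.keys()):
        try:
            db[key]
            readable += 1
        except Exception as exc:  # damaged records can fail in many ways
            failures.append((key, exc))
    return readable, failures
```

If listing the keys itself raises, the file is probably damaged beyond what a per-record scan can report, and retraining may be the simpler route.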


----------------------------------------------------------------------

Comment By: Frank Solensky (fsolensky)
Date: 2004-01-31 03:16

Message:
Logged In: YES 
user_id=205922

Here's the last one displayed and a stack trace that went to
that window.  The web interface shows the same trace as before.
---------------------------------
wordinfoget: word= subject:Korean
Traceback (most recent call last):
  File "/usr/src/redhat/SOURCES/spambayes-1.0a7/scripts/sb_server.py", line 442, in onRetr
    evidence=True)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 158, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 391, in _getclues
    record = self._wordinfoget(word)
  File "/usr/lib/python2.2/site-packages/spambayes/storage.py", line 260, in _wordinfoget
    r = self.db.get(word)
  File "/usr/lib/python2.2/shelve.py", line 65, in get
    if self.dict.has_key(key):
error: (22, 'Invalid argument')


----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2004-01-30 18:00

Message:
Logged In: YES 
user_id=552329

I don't recall seeing this before, but it looks like it's either a 
problem with a particular token, or that something is wrong 
with the database.

You could try examining which key is giving problems.  If you 
add "print word" before line 259 of storage.py, that will print 
out each token as it is looked up, and the last one printed 
before the crash will be the problem one (if it's the same one 
every time; otherwise a particular token is probably not the 
problem).
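
The "print word" idea can also be sketched outside SpamBayes as a small wrapper that prints each key before the lookup, so the last key printed is the one that failed. This is an illustration in modern Python, not the code in storage.py: TracingDB, FlakyDB and find_failing_key are hypothetical names, and FlakyDB only simulates a store that raises on one key.

```python
import sys


class TracingDB:
    """Wrap a dict-like store and print every key before looking it up."""

    def __init__(self, db, out=sys.stderr):
        self.db = db
        self.out = out
        self.last_key = None

    def get(self, key, default=None):
        # Print first, so the key is visible even if the lookup raises.
        self.last_key = key
        print("wordinfoget: word=", key, file=self.out, flush=True)
        return self.db.get(key, default)


class FlakyDB(dict):
    """Hypothetical stand-in for a damaged store: one key raises OSError."""

    def get(self, key, default=None):
        if key == "subject:Korean":
            raise OSError(22, "Invalid argument")
        return super().get(key, default)


def find_failing_key(db, words):
    """Return the first word whose lookup raises OSError, or None."""
    traced = TracingDB(db)
    try:
        for word in words:
            traced.get(word)
    except OSError:
        return traced.last_key
    return None
```

Running find_failing_key over the same token stream several times shows whether the failure follows one token or moves around, which is the distinction that matters here.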

----------------------------------------------------------------------
