[spambayes-bugs] [ spambayes-Bugs-887453 ] Crash when training on messages
SourceForge.net
noreply at sourceforge.net
Sun May 2 22:55:09 EDT 2004
Bugs item #887453, was opened at 2004-01-30 17:53
Message generated for change (Comment added) made by anadelonbrin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=887453&group_id=61702
Category: None
Group: Source code 1.0a7
>Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Frank Solensky (fsolensky)
Assigned to: Nobody/Anonymous (nobody)
Summary: Crash when training on messages
Initial Comment:
I've been running into the following error for a day or
two now on different messages. I haven't narrowed it down
to a particular message yet, but the modules and line
numbers appear to be the same each time. Stopping and
restarting spambayes several times seems to clear the
offending message somehow.
I'll try to narrow it down further over the weekend
if it doesn't ring a bell by then.
----------------------------------------------------------------------
>Comment By: Tony Meyer (anadelonbrin)
Date: 2004-05-03 14:55
Message:
Logged In: YES
user_id=552329
No response, so closing. Please reopen (with the requested
information) if this still occurs.
----------------------------------------------------------------------
Comment By: Tony Meyer (anadelonbrin)
Date: 2004-02-17 15:57
Message:
Logged In: YES
user_id=552329
Weird. You can use db_dbexpimp.py to convert the db to
text, but I doubt that will turn up anything of use (unless it
fails, in which case the db must be screwed). You can also
use it to convert between various db formats - maybe
whichever db this is using isn't good on your machine?
Switching to a pickle would solve this, although then you're
using a pickle...
Which sort of dbm is it using? bsddb? gdbm? (whichdb.py in
the contrib directory should answer that). Maybe trying
another one would help? (The option to do this isn't exposed
via the web interface, IIRC, so you'll have to edit the config
file manually).
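
A rough way to run both checks by hand (identify the dbm flavour, then read
every record to see whether any lookup fails) is sketched below. This is not
db_dbexpimp.py or whichdb.py from the distribution; it assumes a Python 3
standard library, where whichdb lives in the dbm module, and scan_shelf is a
hypothetical helper name. If listing the keys itself raises, that alone
suggests the file is damaged.

```python
import dbm
import shelve
import sys


def scan_shelf(path):
    """Read every entry in a shelve file; return (ok_count, failures)."""
    ok, failures = 0, []
    shelf = shelve.open(path, flag="r")  # read-only, so nothing is modified
    try:
        for key in list(shelf.keys()):
            try:
                shelf[key]  # forces the backend lookup and the unpickle
                ok += 1
            except Exception as exc:  # record the failure and keep scanning
                failures.append((key, exc))
    finally:
        shelf.close()
    return ok, failures


if __name__ == "__main__" and len(sys.argv) > 1:
    db_path = sys.argv[1]  # path to the database file, given by the user
    print("backend:", dbm.whichdb(db_path))
    good, bad = scan_shelf(db_path)
    print("readable records:", good)
    for key, exc in bad:
        print("unreadable key:", repr(key), "->", repr(exc))
```

Run it with the path to the database file as the only argument. A clean run
only shows that every record is readable right now; it does not rule out an
intermittent failure in the dbm library itself.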
----------------------------------------------------------------------
Comment By: Frank Solensky (fsolensky)
Date: 2004-01-31 13:18
Message:
Logged In: YES
user_id=205922
Update: it doesn't appear to be limited to a particular
word. I've been marking all messages as 'defer' and
admitting a few at a time to see exactly when the crashes
occur. The last word displayed has been different each time
("dithered", "ux-50,",
"url:draft-ietf-avt-uncomp-video-05"). I'm not sure, but I
think that the message that triggers the error may disappear
once I've restarted the daemon as well.
If it's the db, are there any tools to look for errors? Or
do I have to reinstall and retrain?
----------------------------------------------------------------------
Comment By: Frank Solensky (fsolensky)
Date: 2004-01-31 03:16
Message:
Logged In: YES
user_id=205922
Here's the last one displayed and a stack trace that went to
that window. The web interface shows the same trace as before.
---------------------------------
wordinfoget: word= subject:Korean
Traceback (most recent call last):
  File "/usr/src/redhat/SOURCES/spambayes-1.0a7/scripts/sb_server.py", line 442, in onRetr
    evidence=True)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 158, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/usr/lib/python2.2/site-packages/spambayes/classifier.py", line 391, in _getclues
    record = self._wordinfoget(word)
  File "/usr/lib/python2.2/site-packages/spambayes/storage.py", line 260, in _wordinfoget
    r = self.db.get(word)
  File "/usr/lib/python2.2/shelve.py", line 65, in get
    if self.dict.has_key(key):
error: (22, 'Invalid argument')
----------------------------------------------------------------------
Comment By: Tony Meyer (anadelonbrin)
Date: 2004-01-30 18:00
Message:
Logged In: YES
user_id=552329
I don't recall seeing this before, but it looks like it's either a
problem with a particular token, or that something is wrong
with the database.
You could try examining which key is giving problems. If you
add "print word" before line 259 of storage.py, that will print
out each token as it is looked up, and the last one before the
crash will be the problem one (provided it's always the same
token; otherwise the token probably isn't the problem).
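
The key-logging idea above can be sketched in isolation. This is only an
illustration, not the actual storage.py code: LoggingDB and its last_key
attribute are hypothetical names, and a plain dict stands in for the
shelve-backed database.

```python
import sys


class LoggingDB:
    """Hypothetical wrapper that records each key before the lookup runs."""

    def __init__(self, db, stream=sys.stderr):
        self.db = db          # any object with a dict-style get()
        self.stream = stream  # where the log lines are written
        self.last_key = None  # most recent key passed to get()

    def get(self, key, default=None):
        # Log first, so the key is visible even if the lookup raises.
        self.last_key = key
        print("wordinfoget: word=", key, file=self.stream)
        self.stream.flush()
        return self.db.get(key, default)


if __name__ == "__main__":
    db = LoggingDB({"subject:Korean": (3, 1)})
    db.get("subject:Korean")
    db.get("dithered")
```

Because the key is written out before the lookup, the last line logged
before a crash names the token that was being fetched at the time.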
----------------------------------------------------------------------