[Spambayes-checkins] spambayes/spambayes storage.py,1.15,1.16

Tim Peters tim_one at users.sourceforge.net
Thu Jul 24 23:17:24 EDT 2003


Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv12238/spambayes

Modified Files:
	storage.py 
Log Message:
SF bug 777026: Possible cause for db corruption in DBDictClassifier.
The _wordinfoset method of a DBDictClassifier implicitly relied on a
classifier always calling it with the WordInfo record already associated
with a word (assuming any such existed).  In fact, all calls from a
classifier do that, and there wasn't actually a bug here so far as
classifiers go.  I don't know about all other uses of _wordinfoset,
though, and it seems unbearably delicate regardless.

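So fixed that: when a record with spamcount + hamcount <= 1 is written
straight to the db, the word is now also deleted from self.wordinfo.
Added a comment about why a KeyError can be expected to occur in the
preceding try/except, and added a (necessarily kinda funky) new
test_bug777026 test case, which fails before this patch when a
DBDictClassifier is used.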


Index: storage.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/storage.py,v
retrieving revision 1.15
retrieving revision 1.16
diff -C2 -d -r1.15 -r1.16
*** storage.py	8 Jul 2003 11:39:10 -0000	1.15
--- storage.py	25 Jul 2003 05:17:22 -0000	1.16
***************
*** 236,245 ****
          if record.spamcount + record.hamcount <= 1:
              self.db[word] = record.__getstate__()
-             # Remove this word from the changed list (not that it should be
-             # there, but strange things can happen :)
              try:
                  del self.changed_words[word]
              except KeyError:
                  pass
          else:
              self.wordinfo[word] = record
--- 236,251 ----
          if record.spamcount + record.hamcount <= 1:
              self.db[word] = record.__getstate__()
              try:
                  del self.changed_words[word]
              except KeyError:
+                 # This can happen if, e.g., a new word is trained as ham
+                 # twice, then untrained once, all before a store().
+                 pass
+ 
+             try:
+                 del self.wordinfo[word]
+             except KeyError:
                  pass
+ 
          else:
              self.wordinfo[word] = record
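
The effect of the change above can be sketched in isolation. This is a minimal illustration, not the spambayes code: `Record`, `SketchStore`, `evict_singletons` and `demo` are hypothetical names, plain dicts stand in for the on-disk db, and a tuple stands in for `__getstate__()`. It assumes lookups consult the in-memory cache before the db, which is what makes a stale cache entry visible.

```python
class Record:
    """Hypothetical stand-in for a WordInfo record."""
    def __init__(self, spamcount=0, hamcount=0):
        self.spamcount = spamcount
        self.hamcount = hamcount


class SketchStore:
    """Simplified model of a db-backed classifier store (assumption)."""
    def __init__(self, evict_singletons):
        self.db = {}             # stands in for the on-disk database
        self.wordinfo = {}       # in-memory cache of records
        self.changed_words = {}  # words pending a write on store()
        self.evict_singletons = evict_singletons

    def get(self, word):
        # Assumed lookup order: cache first, then the db.
        try:
            return self.wordinfo[word]
        except KeyError:
            counts = self.db.get(word)
            return None if counts is None else Record(*counts)

    def set(self, word, record):
        if record.spamcount + record.hamcount <= 1:
            # Low-count records go straight to the db.
            self.db[word] = (record.spamcount, record.hamcount)
            self.changed_words.pop(word, None)
            if self.evict_singletons:
                # The step the patch adds: drop any cached record too.
                self.wordinfo.pop(word, None)
        else:
            self.wordinfo[word] = record
            self.changed_words[word] = True


def demo(evict_singletons):
    store = SketchStore(evict_singletons)
    store.set("word", Record(hamcount=2))  # cached in wordinfo
    store.set("word", Record(hamcount=1))  # a *different* record object
    return store.get("word").hamcount


print(demo(evict_singletons=False))  # cache still holds the old record
print(demo(evict_singletons=True))   # lookup falls through to the db
```

Without the eviction, the second `set` updates the db but leaves the older record in the cache, so the two disagree. That can only happen when the caller passes a record other than the one already cached, which is the case the log message says classifiers themselves never trigger.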




