[Spambayes-checkins] spambayes/spambayes storage.py,1.13,1.14
Skip Montanaro
montanaro at users.sourceforge.net
Mon Jun 30 19:19:53 EDT 2003
Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv15492
Modified Files:
storage.py
Log Message:
Encode unicode objects as utf-8 before using as a key for DBDictClassifier
instances.
Index: storage.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/storage.py,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** storage.py 27 Jun 2003 00:45:21 -0000 1.13
--- storage.py 1 Jul 2003 01:19:51 -0000 1.14
***************
*** 208,211 ****
--- 208,213 ----
      def _wordinfoget(self, word):
+         if isinstance(word, unicode):
+             word = word.encode("utf-8")
          try:
              return self.wordinfo[word]
***************
*** 230,233 ****
--- 232,237 ----
          # as much as 60%!!!  This also has the effect of reducing the time it
          # takes to store the database
+         if isinstance(word, unicode):
+             word = word.encode("utf-8")
          if record.spamcount + record.hamcount <= 1:
              self.db[word] = record.__getstate__()
***************
*** 243,246 ****
--- 247,252 ----
      def _wordinfodel(self, word):
+         if isinstance(word, unicode):
+             word = word.encode("utf-8")
          del self.wordinfo[word]
          self.changed_words[word] = WORD_DELETED
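
The three hunks above add the same guard to the get, set, and delete paths. A minimal sketch of that pattern follows, written for Python 3, where `str` plays the role that `unicode` plays in the Python 2 code of the diff. The names `normalize_key` and `ByteKeyedStore` are hypothetical and are not part of spambayes, and a plain dict stands in for the database; this is an illustration of the idea, not the project's implementation.

```python
# Hypothetical sketch (Python 3): normalize text keys to UTF-8 bytes before
# they reach a byte-keyed store. A plain dict stands in for the database.


def normalize_key(word):
    """Return word as UTF-8 bytes; keys that are already bytes pass through."""
    if isinstance(word, str):
        return word.encode("utf-8")
    return word


class ByteKeyedStore:
    """Toy mapping that applies the same guard on get, set, and delete."""

    def __init__(self):
        self._db = {}

    def __setitem__(self, word, value):
        self._db[normalize_key(word)] = value

    def __getitem__(self, word):
        return self._db[normalize_key(word)]

    def __delitem__(self, word):
        del self._db[normalize_key(word)]

    def __len__(self):
        return len(self._db)


store = ByteKeyedStore()
store["caf\u00e9"] = (1, 0)  # stored under the UTF-8 bytes of the text key

# The text key and its UTF-8 byte form address the same record.
same_record = store[b"caf\xc3\xa9"] == store["caf\u00e9"]
```

Putting the conversion in one helper, rather than repeating it in each accessor as the diff does, is a choice made only for this sketch; the behaviour on each path is the same.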