[Spambayes] [ spambayes-Bugs-715248 ] Pickle classifier should save to a temp file first
SourceForge.net
noreply at sourceforge.net
Fri Apr 4 13:09:21 EST 2003
Bugs item #715248, was opened at 2003-04-04 14:01
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=715248&group_id=61702
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Tim Stone (timstone4)
Summary: Pickle classifier should save to a temp file first
Initial Comment:
A number of "EOF Error"s could be avoided if the pickle
classifier saved to a temp file first, then renamed to
the real file. Otherwise, failure during save can lose
the database. This came up in
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=709051&group_id=61702
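
For illustration only, here is a minimal sketch of the save-to-temp-then-rename idea described in this report, written in modern Python 3 rather than the Python 2 of the original code. The helper name atomic_pickle_dump is hypothetical and is not part of spambayes. It assumes the temp file sits in the same directory as the target, so that the final rename stays on one filesystem.

```python
import os
import pickle
import tempfile


def atomic_pickle_dump(obj, path):
    """Pickle obj to path by way of a temp file (hypothetical helper)."""
    # Create the temp file next to the target so the rename does not
    # cross filesystems.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(obj, fp, protocol=pickle.HIGHEST_PROTOCOL)
            # Push the bytes to disk before the rename makes them visible.
            fp.flush()
            os.fsync(fp.fileno())
        # Replace the real file only after the dump has fully succeeded.
        os.replace(tmp_path, path)
    except BaseException:
        # A failed save leaves the old database untouched; clean up the
        # partial temp file if it is still there.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the dump fails partway, only the temp file is affected, so a later load of the real file should not hit a truncated pickle.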
----------------------------------------------------------------------
Comment By: Simone Piunno (pioppo)
Date: 2003-04-04 23:09
Message:
Logged In: YES
user_id=227443
Sorry, typo in the patch: the third "try" block should use
"except OSError, e", like the second one.
----------------------------------------------------------------------
Comment By: Tim Stone (timstone4)
Date: 2003-04-04 23:02
Message:
Logged In: YES
user_id=645698
I agree. This was how it did its thing to start with... <sigh>. I'll put it back in.
----------------------------------------------------------------------
Comment By: Simone Piunno (pioppo)
Date: 2003-04-04 22:13
Message:
Logged In: YES
user_id=227443
--- spambayes/storage.py.orig 2003-04-03 23:35:47.000000000 +0200
+++ spambayes/storage.py 2003-04-04 21:51:25.000000000 +0200
@@ -59,6 +59,9 @@
 import cPickle as pickle
 import errno
 import shelve
+import sys
+import os
+import random
 from spambayes import dbmstorage
 # Make shelve use binary pickles by default.
@@ -121,10 +124,31 @@
         if options.verbose:
             print 'Persisting',self.db_name,'as a pickle'
-        fp = open(self.db_name, 'wb')
-        pickle.dump(self, fp, PICKLE_TYPE)
-        fp.close()
-
+        # Be as defensive as possible, keep always a safe copy.
+        rand = random.randrange(0, sys.maxint)
+        tmp = self.db_name + '.%d.%d.tmp' % (rand, os.getpid())
+        last = self.db_name + '.bak'
+        fp = None
+        try:
+            fp = open(tmp, 'wb')
+            pickle.dump(self, fp, PICKLE_TYPE)
+            fp.close()
+        except IOError, e:
+            if options.verbose:
+                print 'Failed update: ' + e
+            if fp is not None:
+                os.unlink(tmp)
+            raise
+        try:
+            os.unlink(last)
+        except OSError, e:
+            if e.errno <> errno.ENOENT: raise
+        try:
+            os.link(self.db_name, last)
+        except:
+            if e.errno <> errno.ENOENT: raise
+        os.rename(tmp, self.db_name)
+
 class DBDictClassifier(classifier.Classifier):
     '''Classifier object persisted in a caching database'''
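
As a rough illustration of the backup step in the patch above (drop the old .bak, hard-link the current database to .bak, then rename the temp file into place), here is a sketch in modern Python 3. It is not the code that was committed. The function name rotate_into_place and its parameters are hypothetical, and it assumes a filesystem that supports hard links. Both handlers catch OSError, in line with the correction posted in the follow-up comment.

```python
import errno
import os


def rotate_into_place(tmp_path, db_path, backup_suffix=".bak"):
    """Keep the previous db_path as a backup, then move tmp_path over it.

    Hypothetical helper; assumes tmp_path already holds a complete save.
    """
    backup_path = db_path + backup_suffix

    # Remove a stale backup; a missing backup is not an error.
    try:
        os.unlink(backup_path)
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise

    # Hard-link the current database to the backup name. On the very
    # first save there is no database yet, which is also not an error.
    try:
        os.link(db_path, backup_path)
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise

    # Finally swap the freshly written file into place.
    os.replace(tmp_path, db_path)
```

Because the backup is a hard link rather than a copy, the old data stays reachable under the .bak name after the rename without being rewritten.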
----------------------------------------------------------------------