From montanaro at users.sourceforge.net Wed Oct 10 15:17:47 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Wed, 10 Oct 2007 06:17:47 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3162] trunk/website
Message-ID:

Revision: 3162
http://spambayes.svn.sourceforge.net/spambayes/?rev=3162&view=rev
Author: montanaro
Date: 2007-10-10 06:17:47 -0700 (Wed, 10 Oct 2007)

Log Message:
-----------
add info about thunderbayes

Modified Paths:
--------------
trunk/website/applications.ht
trunk/website/faq.txt
trunk/website/mac.ht
trunk/website/unix.ht
trunk/website/windows.ht

Modified: trunk/website/applications.ht
===================================================================
--- trunk/website/applications.ht 2007-09-21 01:48:38 UTC (rev 3161)
+++ trunk/website/applications.ht 2007-10-10 13:17:47 UTC (rev 3162)
@@ -64,6 +64,8 @@

Alternatively, to run from source, download the source archive.

Alternatively, use Subversion to get the code - go to the Subversion page on the project's SourceForge site for more.

+

For Thunderbird users the ThunderBayes extension provides tighter +integration between Thunderbird and sb_server.py. It is available separately.

sb_imapfilter.py

imapfilter connects to your imap server and marks mail as ham or spam,

Modified: trunk/website/faq.txt
===================================================================
--- trunk/website/faq.txt 2007-09-21 01:48:38 UTC (rev 3161)
+++ trunk/website/faq.txt 2007-10-10 13:17:47 UTC (rev 3162)
@@ -348,6 +348,19 @@
 .. _this FAQ question: #how-do-i-set-up-spambayes-and-outlook-express
 
+What is ThunderBayes?
+---------------------
+
+ThunderBayes (http://pieces.openpolitics.com/thunderbayes/) is an extension
+for the Thunderbird email client. It provides a toolbar button similar to
+Thunderbird's Junk button with which email can be classified as Spam or Ham.
+Clicking the button causes two things to happen: (1) it sends the source of
+the selected messages to SpamBayes to be classified and (2) it optionally
+moves the messages to a folder of your choice (this can be configured in the
+extension options). It includes a custom version of SpamBayes, and provides
+a simple preference page in the Thunderbird Account Settings where the
+SpamBayes POP3 proxy and message filters can be configured.
+
 What clients will SpamBayes work with in general?
 -------------------------------------------------

Modified: trunk/website/mac.ht
===================================================================
--- trunk/website/mac.ht 2007-09-21 01:48:38 UTC (rev 3161)
+++ trunk/website/mac.ht 2007-10-10 13:17:47 UTC (rev 3162)
@@ -14,4 +14,8 @@

  • If you're a Unix weenie using a Mac OS X system, the Unix/Linux notes are probably more appropriate than these notes. +
  • Thunderbird users might find the ThunderBayes extension
+ useful. It provides tighter integrateion between Thunderbird and the
+ SpamBayes POP3 proxy.
\ No newline at end of file

Modified: trunk/website/unix.ht
===================================================================
--- trunk/website/unix.ht 2007-09-21 01:48:38 UTC (rev 3161)
+++ trunk/website/unix.ht 2007-10-10 13:17:47 UTC (rev 3162)
@@ -194,6 +194,13 @@
 exit $RETVAL
+

    Thunderbird

    + +

    Thunderbird users might find the ThunderBayes extension +useful. It provides tighter integrateion between Thunderbird and the +SpamBayes POP3 proxy.

    +

    KMail

    Toby Dickenson has written a description of his SpamBayes and KMail setup (using sb_bnfilter.py),

Modified: trunk/website/windows.ht
===================================================================
--- trunk/website/windows.ht 2007-09-21 01:48:38 UTC (rev 3161)
+++ trunk/website/windows.ht 2007-10-10 13:17:47 UTC (rev 3162)
@@ -77,6 +77,10 @@
 installation program and use it to install a binary version of sb_server, including a tray application.

    See also the information about sb_server.

    +

    Thunderbird users might find the ThunderBayes extension +useful. It provides tighter integrateion between Thunderbird and the +SpamBayes POP3 proxy.

    If you retrieve mail via IMAP, you currently need to install a recent version of Python and
@@ -87,4 +91,4 @@

    The 1.1 release (see the download page) includes a binary version of sb_imapfilter. Although this is still currently in alpha,
-you might like to try it out.
\ No newline at end of file
+you might like to try it out.

    This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.

From montanaro at users.sourceforge.net Wed Oct 10 15:20:44 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Wed, 10 Oct 2007 06:20:44 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3163] trunk/website
Message-ID:

Revision: 3163
http://spambayes.svn.sourceforge.net/spambayes/?rev=3163&view=rev
Author: montanaro
Date: 2007-10-10 06:20:44 -0700 (Wed, 10 Oct 2007)

Log Message:
-----------
missing

Modified Paths:
--------------
trunk/website/mac.ht
trunk/website/unix.ht
trunk/website/windows.ht

Modified: trunk/website/mac.ht
===================================================================
--- trunk/website/mac.ht 2007-10-10 13:17:47 UTC (rev 3162)
+++ trunk/website/mac.ht 2007-10-10 13:20:44 UTC (rev 3163)
@@ -15,7 +15,7 @@
 href="unix.html">Unix/Linux notes are probably more appropriate than these notes.
  • Thunderbird users might find the ThunderBayes extension
+ href="http://pieces.openpolitics.com/thunderbayes/">ThunderBayes extension
 useful. It provides tighter integrateion between Thunderbird and the
 SpamBayes POP3 proxy.
\ No newline at end of file

Modified: trunk/website/unix.ht
===================================================================
--- trunk/website/unix.ht 2007-10-10 13:17:47 UTC (rev 3162)
+++ trunk/website/unix.ht 2007-10-10 13:20:44 UTC (rev 3163)
@@ -197,7 +197,7 @@

    Thunderbird

    Thunderbird users might find the ThunderBayes extension
+href="http://pieces.openpolitics.com/thunderbayes/">ThunderBayes extension
 useful. It provides tighter integrateion between Thunderbird and the
 SpamBayes POP3 proxy.

Modified: trunk/website/windows.ht
===================================================================
--- trunk/website/windows.ht 2007-10-10 13:17:47 UTC (rev 3162)
+++ trunk/website/windows.ht 2007-10-10 13:20:44 UTC (rev 3163)
@@ -78,7 +78,7 @@
 sb_server, including a tray application.

    See also the information about sb_server.

    Thunderbird users might find the ThunderBayes extension
+href="http://pieces.openpolitics.com/thunderbayes/">ThunderBayes extension
 useful. It provides tighter integrateion between Thunderbird and the
 SpamBayes POP3 proxy.

From montanaro at users.sourceforge.net Mon Oct 22 04:29:04 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Sun, 21 Oct 2007 19:29:04 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3164] trunk/spambayes/contrib/tte.py
Message-ID:

Revision: 3164
http://spambayes.svn.sourceforge.net/spambayes/?rev=3164&view=rev
Author: montanaro
Date: 2007-10-21 19:29:03 -0700 (Sun, 21 Oct 2007)

Log Message:
-----------
Use the better of the ratio requested by the user and the actual ratio in the spam and ham databases.

Modified Paths:
--------------
trunk/spambayes/contrib/tte.py

Modified: trunk/spambayes/contrib/tte.py
===================================================================
--- trunk/spambayes/contrib/tte.py 2007-10-10 13:20:44 UTC (rev 3163)
+++ trunk/spambayes/contrib/tte.py 2007-10-22 02:29:03 UTC (rev 3164)
@@ -114,10 +114,13 @@
         hambone_ = list(reversed(hambone_))
         spamcan_ = list(reversed(spamcan_))
 
+    nspam,nham = len(spamcan_),len(hambone_)
     if ratio:
         rspam,rham = ratio
-    else:
-        rspam,rham = len(spamcan_),len(hambone_)
+        # If the actual ratio of spam to ham in the database is better than
+        # what was asked for, use that better ratio.
+        if (rspam > rham) == (rspam * nham > rham * nspam):
+            rspam,rham = nspam,nham
 
     # define some indexing constants
     ham = 0
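
The comparison added in revision 3164 is compact, so a standalone sketch may help when reading it. The version below is written for Python 3 and is not part of the commit; the function name `choose_ratio` and the sample counts are hypothetical. It assumes only what the diff shows: `rspam, rham` is the requested ratio and `nspam, nham` are the message counts.

```python
def choose_ratio(requested, nspam, nham):
    """Pick between the requested (spam, ham) ratio and the actual counts.

    Mirrors the test added to tte.py in r3164. The name and signature are
    illustrative assumptions, not the script's real interface.
    """
    rspam, rham = requested
    # rspam * nham > rham * nspam is rspam/rham > nspam/nham after
    # cross-multiplying, so no division (and no zero divisor) is involved.
    requested_exceeds_actual = rspam * nham > rham * nspam
    # The actual counts win when the requested ratio leans toward spam and
    # exceeds the actual ratio, or when it does not lean toward spam and
    # does not exceed the actual ratio.
    if (rspam > rham) == requested_exceeds_actual:
        return nspam, nham
    return rspam, rham


print(choose_ratio((3, 1), 200, 100))  # (200, 100): actual 2:1 replaces 3:1
print(choose_ratio((3, 1), 500, 100))  # (3, 1): actual 5:1 is more skewed
print(choose_ratio((1, 3), 100, 200))  # (100, 200): actual 1:2 replaces 1:3
```

Comparing cross products rather than quotients keeps the arithmetic in integers, which is presumably why the commit writes the test that way.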
From montanaro at users.sourceforge.net Mon Oct 22 04:42:47 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Sun, 21 Oct 2007 19:42:47 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3165] trunk/spambayes/spambayes/storage.py
Message-ID:

Revision: 3165
http://spambayes.svn.sourceforge.net/spambayes/?rev=3165&view=rev
Author: montanaro
Date: 2007-10-21 19:42:47 -0700 (Sun, 21 Oct 2007)

Log Message:
-----------
Isolate the code to safely write pickles so it can be used elsewhere in the system. From Dave Abrahams (SF patch 1816240).

Modified Paths:
--------------
trunk/spambayes/spambayes/storage.py

Modified: trunk/spambayes/spambayes/storage.py
===================================================================
--- trunk/spambayes/spambayes/storage.py 2007-10-22 02:29:03 UTC (rev 3164)
+++ trunk/spambayes/spambayes/storage.py 2007-10-22 02:42:47 UTC (rev 3165)
@@ -85,6 +85,36 @@
 NO_UPDATEPROBS = False # Probabilities will not be autoupdated with training
 UPDATEPROBS = True # Probabilities will be autoupdated with training
 
+def safe_pickle(filename, value, protocol=0):
+    '''Store value as a pickle without creating corruption'''
+
+    # Be as defensive as possible. Always keep a safe copy.
+    tmp = filename + '.tmp'
+    fp = None
+    try:
+        fp = open(tmp, 'wb')
+        pickle.dump(value, fp, protocol)
+        fp.close()
+    except IOError, e:
+        if options["globals", "verbose"]:
+            print >> sys.stderr, 'Failed update: ' + str(e)
+        if fp is not None:
+            os.remove(tmp)
+        raise
+    try:
+        # With *nix we can just rename, and (as long as permissions
+        # are correct) the old file will vanish. With win32, this
+        # won't work - the Python help says that there may not be
+        # a way to do an atomic replace, so we rename the old one,
+        # put the new one there, and then delete the old one. If
+        # something goes wrong, there is at least a copy of the old
+        # one.
+        os.rename(tmp, filename)
+    except OSError:
+        os.rename(filename, filename + '.bak')
+        os.rename(tmp, filename)
+        os.remove(filename + '.bak')
+
 class PickledClassifier(classifier.Classifier):
     '''Classifier object persisted in a pickle'''
 
@@ -141,31 +171,7 @@
         if options["globals", "verbose"]:
             print >> sys.stderr, 'Persisting',self.db_name,'as a pickle'
 
-        # Be as defensive as possible; keep always a safe copy.
-        tmp = self.db_name + '.tmp'
-        try:
-            fp = open(tmp, 'wb')
-            pickle.dump(self, fp, PICKLE_TYPE)
-            fp.close()
-        except IOError, e:
-            if options["globals", "verbose"]:
-                print >> sys.stderr, 'Failed update: ' + str(e)
-            if fp is not None:
-                os.remove(tmp)
-            raise
-        try:
-            # With *nix we can just rename, and (as long as permissions
-            # are correct) the old file will vanish. With win32, this
-            # won't work - the Python help says that there may not be
-            # a way to do an atomic replace, so we rename the old one,
-            # put the new one there, and then delete the old one. If
-            # something goes wrong, there is at least a copy of the old
-            # one.
-            os.rename(tmp, self.db_name)
-        except OSError:
-            os.rename(self.db_name, self.db_name + '.bak')
-            os.rename(tmp, self.db_name)
-            os.remove(self.db_name + '.bak')
+        safe_pickle(self.db_name, self, PICKLE_TYPE)
 
     def close(self):
         # we keep no resources open - nothing to do

From montanaro at users.sourceforge.net Mon Oct 22 04:44:15 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Sun, 21 Oct 2007 19:44:15 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3166] trunk/spambayes/spambayes/classifier.py
Message-ID:

Revision: 3166
http://spambayes.svn.sourceforge.net/spambayes/?rev=3166&view=rev
Author: montanaro
Date: 2007-10-21 19:44:14 -0700 (Sun, 21 Oct 2007)

Log Message:
-----------
Use the new safe_pickle function. From Dave Abrahams (SF 1816240).

Modified Paths:
--------------
trunk/spambayes/spambayes/classifier.py

Modified: trunk/spambayes/spambayes/classifier.py
===================================================================
--- trunk/spambayes/spambayes/classifier.py 2007-10-22 02:42:47 UTC (rev 3165)
+++ trunk/spambayes/spambayes/classifier.py 2007-10-22 02:44:14 UTC (rev 3166)
@@ -652,17 +652,8 @@
         # XXX becomes valid, for example).
         for name, data in [(self.bad_url_cache_name, self.bad_urls),
                            (self.http_error_cache_name, self.http_error_urls),]:
-            # Save to a temp file first, in case something goes wrong.
-            cache = open(name + ".tmp", "w")
-            pickle.dump(data, cache)
-            cache.close()
-            try:
-                os.rename(name + ".tmp", name)
-            except OSError:
-                # Atomic replace isn't possible with win32, so just
-                # remove and rename.
-                os.remove(name)
-                os.rename(name + ".tmp", name)
+            from storage import safe_pickle
+            safe_pickle(name, data)
 
     def slurp(self, proto, url):
         # We generate these tokens:

From montanaro at users.sourceforge.net Mon Oct 22 04:45:57 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Sun, 21 Oct 2007 19:45:57 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3167] trunk/spambayes/spambayes/dnscache.py
Message-ID:

Revision: 3167
http://spambayes.svn.sourceforge.net/spambayes/?rev=3167&view=rev
Author: montanaro
Date: 2007-10-21 19:45:57 -0700 (Sun, 21 Oct 2007)

Log Message:
-----------
Use the new safe_pickle function to write the cache file. Don't bomb if the cache file can't be read for some reason. From Dave Abrahams (SF 1816240).

Modified Paths:
--------------
trunk/spambayes/spambayes/dnscache.py

Modified: trunk/spambayes/spambayes/dnscache.py
===================================================================
--- trunk/spambayes/spambayes/dnscache.py 2007-10-22 02:44:14 UTC (rev 3166)
+++ trunk/spambayes/spambayes/dnscache.py 2007-10-22 02:45:57 UTC (rev 3167)
@@ -94,9 +94,15 @@
         # end of user-settable attributes
 
         self.cachefile = os.path.expanduser(cachefile)
+        self.caches = None
+
         if self.cachefile and os.path.exists(self.cachefile):
-            self.caches = pickle.load(open(self.cachefile, "rb"))
-        else:
+            try:
+                self.caches = pickle.load(open(self.cachefile, "rb"))
+            except:
+                os.unlink(self.cachefile)
+
+        if self.caches is None:
             self.caches = {"A": {}, "PTR": {}}
 
         if options["globals", "verbose"]:
@@ -123,7 +129,8 @@
         if self.printStatsAtEnd:
             self.printStats()
         if self.cachefile:
-            pickle.dump(self.caches, open(self.cachefile, "wb"))
+            from storage import safe_pickle
+            safe_pickle(self.cachefile, self.caches)
 
     def printStats(self):
         for key,val in self.caches.items():

From montanaro at users.sourceforge.net Mon Oct 22 04:51:08 2007
From: montanaro at users.sourceforge.net (montanaro at users.sourceforge.net)
Date: Sun, 21 Oct 2007 19:51:08 -0700
Subject: [Spambayes-checkins] SF.net SVN: spambayes: [3168] trunk/spambayes/spambayes/hammie.py
Message-ID:

Revision: 3168
http://spambayes.svn.sourceforge.net/spambayes/?rev=3168&view=rev
Author: montanaro
Date: 2007-10-21 19:51:07 -0700 (Sun, 21 Oct 2007)

Log Message:
-----------
Don't attempt to store Hammie objects whose files were opened for reading.

Modified Paths:
--------------
trunk/spambayes/spambayes/hammie.py

Modified: trunk/spambayes/spambayes/hammie.py
===================================================================
--- trunk/spambayes/spambayes/hammie.py 2007-10-22 02:45:57 UTC (rev 3167)
+++ trunk/spambayes/spambayes/hammie.py 2007-10-22 02:51:07 UTC (rev 3168)
@@ -21,8 +21,9 @@
 
     """
 
-    def __init__(self, bayes):
+    def __init__(self, bayes, mode):
         self.bayes = bayes
+        self.mode = mode
 
     def _scoremsg(self, msg, evidence=False):
         """Score a Message.
@@ -266,7 +267,8 @@
         self.bayes.store()
 
     def close(self):
-        self.store()
+        if self.mode != 'r':
+            self.store()
 
 def open(filename, useDB="dbm", mode='r'):
     """Open a file, returning a Hammie instance.
@@ -274,7 +276,7 @@
     mode is used as the flag to open DBDict objects. 'c' for read-write (create
     if needed), 'r' for read-only, 'w' for read-write.
     """
-    return Hammie(storage.open_storage(filename, useDB, mode))
+    return Hammie(storage.open_storage(filename, useDB, mode), mode)
 
 
 if __name__ == "__main__":
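
For readers following the safe_pickle changes in revisions 3165 to 3167 on a current interpreter, the same write-to-temporary-then-rename idea can be sketched in Python 3 as below. This is an illustration under stated assumptions, not the SpamBayes code: it swaps the rename-with-'.bak' fallback for `os.replace`, uses `tempfile.mkstemp` instead of a fixed '.tmp' name, defaults to `pickle.DEFAULT_PROTOCOL` rather than protocol 0, and drops the verbose logging.

```python
import os
import pickle
import tempfile


def safe_pickle(filename, value, protocol=pickle.DEFAULT_PROTOCOL):
    """Pickle `value` to `filename`; a failed write leaves the old file intact.

    Python 3 sketch of the idea behind storage.safe_pickle (r3165); the
    details here are assumptions, not the project's implementation.
    """
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the target so the final replace
    # never has to cross a filesystem boundary.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(value, fp, protocol)
            fp.flush()
            os.fsync(fp.fileno())
        # os.replace overwrites an existing destination on POSIX and on
        # Windows, so no '.bak' shuffle is needed.
        os.replace(tmp, filename)
    except BaseException:
        # Remove the partial temporary file, then re-raise the error.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


# Usage: write a cache shaped like the one in dnscache.py, then read it back.
path = os.path.join(tempfile.mkdtemp(), "dnscache.pck")
safe_pickle(path, {"A": {}, "PTR": {}})
with open(path, "rb") as fp:
    print(pickle.load(fp))
```

Keeping the temporary file in the destination directory matters because a rename across filesystems falls back to copy-and-delete, which loses the all-or-nothing property the helper exists to provide.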