Spambayes + HTTP proxy server
Skip Montanaro
skip at pobox.com
Sun Feb 2 13:27:05 EST 2003
Sorry for the too-quick post. In rearranging things I lost the return
statement for the spam case.
Just to be sure it was actually filtering something, I searched for "sex" at
Google. The filter let the results page in and allowed the safersex and
SEX.ETC pages through, but blocked HBO's Sex and the City and janesguide.
Note that this is using my current hammie.db file, which has only been
trained on my ham and spam email collections. With no training on web pages,
I don't expect it to necessarily do a very good job with them.
Skip
    import os

    from proxy3_filter import *
    import proxy3_options

    from spambayes import hammie, Options, mboxutils

    dbf = os.path.expanduser(Options.options.hammiefilter_persistent_storage_file)

    class SpambayesFilter(BufferAllFilter):
        hammie = hammie.open(dbf, 1, 'r')

        def filter(self, s):
            if self.reply.split()[1] == '200':
                prob = self.hammie.score("%s\r\n%s" % (self.serverheaders, s))
                print "| prob: %.5f" % prob
                if prob >= Options.options.spam_cutoff:
                    print self.serverheaders
                    print "text:", s[0:40], "...", s[-40:]
                    return "not authorized"
            return s

    from proxy3_util import *
    register_filter('*/*', 'text/html', SpambayesFilter)
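
For anyone who wants to poke at the decision logic without installing proxy3 or spambayes, here is a minimal standalone sketch of what the filter method above does. It is not part of the original code: the function name filter_body, its parameters, and the 0.90 default cutoff are assumptions made for illustration (the real code reads Options.options.spam_cutoff and scores with the hammie database). The scorer is passed in as a plain callable so a fake one can stand in for the trained classifier.

```python
# Standalone sketch of the block/allow decision; no proxy3 or spambayes needed.
# filter_body, its arguments, and ASSUMED_SPAM_CUTOFF are hypothetical names.

ASSUMED_SPAM_CUTOFF = 0.90  # assumption; the post uses Options.options.spam_cutoff


def filter_body(status_line, headers, body, score, cutoff=ASSUMED_SPAM_CUTOFF):
    """Return the page body, or "not authorized" if it scores as spam.

    status_line -- e.g. "HTTP/1.0 200 OK" (plays the role of self.reply)
    headers     -- raw server headers (plays the role of self.serverheaders)
    body        -- the buffered page text (the s argument in the post)
    score       -- callable taking the combined text, returning a value in 0..1
    """
    parts = status_line.split()
    # Only successful responses are scored; everything else passes through.
    # (The length check is an extra guard that the original does not have.)
    if len(parts) > 1 and parts[1] == '200':
        # Headers and body are scored together, as in the post.
        prob = score("%s\r\n%s" % (headers, body))
        if prob >= cutoff:
            return "not authorized"
    return body


if __name__ == "__main__":
    # Fake scorers standing in for a trained classifier.
    always_spam = lambda text: 0.95
    always_ham = lambda text: 0.05
    page = "<html>hello</html>"
    print(filter_body("HTTP/1.0 200 OK", "Content-Type: text/html", page, always_spam))
    print(filter_body("HTTP/1.0 200 OK", "Content-Type: text/html", page, always_ham))
    print(filter_body("HTTP/1.0 404 Not Found", "", page, always_spam))
```

Passing the scorer as an argument keeps the sketch testable on its own; in the real filter the score comes from self.hammie.score, so results depend entirely on what the database was trained on.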