Spambayes + HTTP proxy server
Skip Montanaro
skip at pobox.com
Sun Feb 2 13:17:23 EST 2003
>> Perfectly workable, though it would probably require some tweaks to
>> the tokenizer to work as well as possible.
...
Paul> The prototype turned out to be shorter than my original post,
...
This doesn't quite work right. (Nor does the similar version I posted
earlier.) The .filter() method gets passed chunks of an HTML response, not
the entire thing. The SpamBayesFilter class should subclass
BufferAllFilter. Here's a tweaked version of mine which does a better job:

    import os

    from proxy3_filter import *
    import proxy3_options

    from spambayes import hammie, Options, mboxutils

    dbf = os.path.expanduser(Options.options.hammiefilter_persistent_storage_file)

    class SpambayesFilter(BufferAllFilter):
        hammie = hammie.open(dbf, 1, 'r')

        def filter(self, s):
            if self.reply.split()[1] == '200':
                prob = self.hammie.score("%s\r\n%s" % (self.serverheaders, s))
                print "| prob: %.5f" % prob
                if prob >= Options.options.spam_cutoff:
                    print self.serverheaders
                    print "text:", s[0:40], "...", s[-40:]
            return s

    from proxy3_util import *
    register_filter('*/*', 'text/html', SpambayesFilter)

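
To make the chunk-versus-whole-response point concrete, here is a rough standalone sketch that runs without proxy3 or Spambayes installed. Everything in it is made up for illustration: toy_score() is a stand-in for a real classifier, and PerChunkFilter, BufferingFilter, and run() are hypothetical names, not proxy3's actual classes or calling convention. It only shows that a per-chunk filter scores fragments, while a buffering filter scores the reassembled body once.

```python
# Standalone sketch: why a scorer needs the whole body, not chunks.
# All names here are hypothetical; this is not proxy3's or Spambayes' API.

def toy_score(text):
    """Stand-in for a classifier: fraction of words that look 'spammy'."""
    spammy = {"viagra", "winner", "free"}
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in spammy for w in words) / len(words)


class PerChunkFilter:
    """Scores each chunk as it arrives, so every score sees only a fragment."""

    def __init__(self):
        self.scores = []

    def filter(self, chunk):
        self.scores.append(toy_score(chunk))
        return chunk

    def finish(self):
        return ""


class BufferingFilter:
    """Holds chunks back and scores the reassembled body once, at the end."""

    def __init__(self):
        self.chunks = []
        self.score = None

    def filter(self, chunk):
        self.chunks.append(chunk)
        return ""            # emit nothing until the response is complete

    def finish(self):
        body = "".join(self.chunks)
        self.score = toy_score(body)
        return body          # pass the full body through unchanged


def run(flt, chunks):
    """Feed chunks through a filter, then flush whatever it held back."""
    out = [flt.filter(c) for c in chunks]
    out.append(flt.finish())
    return "".join(out)
```

Feeding both filters the same three chunks yields the same output text, but the per-chunk filter ends up with three unrelated scores (one per fragment) while the buffering one has a single score for the whole page, which is what you want to compare against a cutoff.
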
Skip