[Spambayes] Client/Server filtering model? Anyone have code?

Christopher Jastram cej at intech.com
Thu Nov 6 20:54:24 EST 2003


Currently, I have a script that runs through every user's spool 
directory and does the following:
  - Everything in "Spam" or "Junk" is read as spam.
  - Things like the main inbox, sent, drafts and trash are discarded.
  - Everything else is read as ham.
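
The folder rule above can be sketched in Python roughly as follows. 
This is only an illustration, not the actual script: the function name 
label_for_folder and the exact folder names in the two sets are 
assumptions.

```python
# Hypothetical folder names; adjust to match the real mailbox layout.
SPAM_FOLDERS = {"spam", "junk"}
SKIPPED_FOLDERS = {"inbox", "sent", "drafts", "trash"}


def label_for_folder(folder_name):
    """Return "spam", "ham", or None (skip) for a mail folder name."""
    name = folder_name.strip().lower()
    if name in SPAM_FOLDERS:
        return "spam"
    if name in SKIPPED_FOLDERS:
        return None  # not used for training at all
    return "ham"  # every other folder counts as ham
```

A folder the user created, such as "Projects", would be treated as ham 
under this rule.
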
This script takes a significant amount of time to run (on the order 
of 30-40 minutes), is run on a nightly basis, and only processes a few 
users (just a couple test users -- once everyone switches from POP to 
IMAP, there'll be hell to pay unless I can figure out a faster way to do 
this.)  The system load also jumped from 0.1 - 0.2 to a steady 1.6 - 1.8 
with the addition of spam filtering (we get a lot of spam).

An idea I had -- would it be possible to have *one* (multithreaded?) 
constantly running Python server that reads mail, evaluates it (adds a 
classified or trained header) and passes it back?  The advantage is that 
the database would not be re-read on every invocation.  Another 
advantage might be in the outsourcing of the processing to another 
machine.  Just an idea.  I'll work on something like this myself when I 
get a roundtoit (always short on time), but I'm wondering if anyone has 
prototype code...?  (Is this even a good idea?)

Thanks,

chris

P.S. on the 'lot of spam' note: most of the mail comes to 10 years' worth 
of employees who no longer work here.  We used to bounce it, but the 
resulting mail queue and processing time took our mail server to its 
knees.  Repeatedly.  Both Exchange and Postfix/Cyrus.  Now there's a 
dead-letter box and a 3-day hold period instead of the default 5, which 
brings it down to manageable (sorta) levels.  Anyone have any additional 
ideas or guides to using Postfix to drop blatantly spammy email?  To 
give you a sense of what I'm looking for, here's an example; I cut 
incoming spam by a good 10 percent, just by dropping all messages coming 
from outside, but claiming to be from within our domain.  Surely there 
are more such rules that I haven't thought of?
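
For reference, the logic of that one rule looks roughly like this in 
Python. It is a sketch of the check only, not a Postfix configuration; 
the domain example.com, the 192.0.2.0/24 network and the function name 
is_forged_internal_sender are placeholders.

```python
import ipaddress

OUR_DOMAIN = "example.com"  # hypothetical: the local mail domain
OUR_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # hypothetical LAN


def is_forged_internal_sender(client_ip, envelope_sender):
    """True if an outside host claims a sender address in our own domain."""
    domain = envelope_sender.rpartition("@")[2].lower()
    if domain != OUR_DOMAIN:
        return False  # sender does not claim to be one of ours
    addr = ipaddress.ip_address(client_ip)
    # A claimed internal sender is only believable from an internal network.
    return not any(addr in net for net in OUR_NETWORKS)
```

One caveat: a rule like this also catches legitimate users who send 
from outside the network without authenticating, so it presumably 
needs an exception for them.
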




