[spambayes-dev] training from IMAP folder? (with patch)

Jason Smith jhs at oes.co.th
Sun Jan 18 03:05:38 EST 2004


Hello.  I guess I'll explain in reverse chronological order, so you can stop 
reading when you get bored.

This is a patch to allow sb_mboxtrain.py to train from an IMAP folder, similar 
to its behavior when training from Maildir.  I have tested it on Linux 
against courier-imapd (Debian Woody) and cyrus-imapd-2.3.2 compiled from 
source.  Currently, it only supports plain-text login.  It should be 
considered somewhat quick-and-dirty, as I read the RFC and implemented it 
with imaplib in just one evening.  Unfortunately, the patch is against the 
latest release rather than CVS, because I cannot seem to access SF CVS at 
the moment.
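
For readers without the attachment, here is a rough sketch of the general 
approach: log in with a plain-text password, select a folder, and hand back 
each message for training.  It is not the code from the patch.  It is written 
for current Python 3, and the function names (open_plain, iter_imap_messages) 
are made up for illustration.

```python
import email
import imaplib


def open_plain(host, user, password, port=143):
    """Connect and log in with a plain-text password (no SSL)."""
    conn = imaplib.IMAP4(host, port)
    conn.login(user, password)
    return conn


def iter_imap_messages(conn, folder):
    """Yield every message in `folder` as an email.message.Message.

    `conn` is an already logged-in imaplib.IMAP4 object.  The folder is
    opened read-only so that fetching does not change any flags.
    """
    typ, _ = conn.select(folder, readonly=True)
    if typ != "OK":
        raise RuntimeError("cannot select folder %r" % folder)

    typ, data = conn.search(None, "ALL")
    if typ != "OK":
        raise RuntimeError("search failed in folder %r" % folder)

    for num in data[0].split():
        typ, parts = conn.fetch(num, "(RFC822)")
        if typ != "OK":
            continue  # skip messages the server refuses to return
        for part in parts:
            # Message data arrives as (envelope, raw_bytes) tuples;
            # the closing b")" element is not a tuple and is ignored.
            if isinstance(part, tuple):
                yield email.message_from_bytes(part[1])
```

Usage would be along the lines of 
`for msg in iter_imap_messages(open_plain("localhost", user, pw), "INBOX.Spam")`, 
with each message then passed to whatever training routine is in use.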

The reason I need this feature (as opposed to the IMAP filter) is to implement 
server-side spam filtering (using cyrus) and training which is intuitive for 
lay mail users.  For the record, cyrus is a mail server which isolates the 
physical mail data from system users (i.e. the only access to mail is via 
IMAP/POP, unlike courier/maildir).  Many deployments do not even have Unix 
user accounts on the machine.

I have successfully integrated spambayes as an incoming filter using procmail, 
much as the documentation describes.  To train, users just need to drag missed spam 
to INBOX.Spam, and drag good messages to e.g. INBOX.Read.  I want to run a 
cron job nightly to go through each user and train against their personal 
database.  However, Cyrus uses a custom organization system for speed, and 
it's not decent to go mucking around in /var/spool/cyrus by hand.  It looks like 
the most effective way to do this is to access the mail through localhost 
IMAP.
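
As a sketch of what the nightly job might do for one user: walk the two 
training folders and label each message as spam or ham.  The folder names are 
the ones above; fetch_messages and train are hypothetical callables (one that 
yields the messages in a named folder, e.g. over localhost IMAP, and one that 
updates that user's database), not spambayes functions.

```python
# Folder -> "is this spam?"  The folder names come from the setup
# described above; the rest of this sketch is illustrative only.
TRAINING_FOLDERS = {
    "INBOX.Spam": True,   # missed spam dragged here by the user
    "INBOX.Read": False,  # good mail dragged here by the user
}


def train_user(fetch_messages, train):
    """Train one user's database from their training folders.

    fetch_messages(folder) -> iterable of messages   (hypothetical)
    train(message, is_spam) -> None                  (hypothetical)

    Returns a dict with the number of spam (True) and ham (False)
    messages that were passed to `train`.
    """
    counts = {True: 0, False: 0}
    for folder, is_spam in TRAINING_FOLDERS.items():
        for msg in fetch_messages(folder):
            train(msg, is_spam)
            counts[is_spam] += 1
    return counts
```

A cron script would call train_user once per account, each time with a 
fetch_messages bound to that user's IMAP login and a train bound to that 
user's personal database.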

If this looks okay (or if somebody can suggest a better method), I am 
interested in making Debian packages as well as an implementation HOWTO as 
part of the new UserLinux project, since I will be rolling out a fairly large 
email site later this year.

-- 
Jason Smith
Open Enterprise Systems
Bangkok, Thailand
http://www.oes.co.th
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mboxtrain-IMAP.diff
Type: text/x-diff
Size: 4537 bytes
Desc: not available
Url : http://mail.python.org/pipermail/spambayes-dev/attachments/20040118/b403db29/mboxtrain-IMAP.bin

