[Spambayes] Concurrent DB access leads to corruption? (FAQ question)

Jim Correia jim.correia at gmail.com
Sun Apr 3 01:10:44 CEST 2005


On Apr 2, 2005 10:25 AM, Skip Montanaro <skip at pobox.com> wrote:

> Yes, I believe it still is.  Most SpamBayes usage scenarios involve just one
> of the applications, so there's been no crying need to add locking so they
> can operate concurrently.

Well, I'm definitely a SpamBayes newbie, so maybe I am going about it
the wrong way. In my bogofilter setup I filtered mail via procmail, and
retrained using a custom python script that talked to the imap server.
I was trying to replicate this process in SpamBayes (using
sb_filter.py from procmail, and sb_imapfilter.py for retraining),
partly because it is familiar, partly because reading through the
sb_imapfilter source led me to believe it would be inefficient for
classification of my mail since it walks the inbox completely every
time, and I tend to leave a couple of thousand messages in there. (Of
course, I could have read the source incorrectly, and even if I haven't,
I should measure to see how slow it is or isn't.)

Is there a better way to deal with my setup? I'll do more digging in
the documentation.


> I think it would be fairly difficult to implement anyway.  Some apps,
> like sb_filter, only keep the db open briefly.  Others, like
> sb_imapfilter and sb_bnfilter (via its background server) want to keep
> the database open for long periods.  An application waiting on one of
> these applications to release the lock would probably need some way to
> signal the long-running app to close the database.

I see. If that is what is going on (I'm not yet familiar with the
code) it sounds like it would require some rearchitecting at the very
least.
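
For the short-lived case only, here is a minimal sketch of what I
imagine the locking could look like. It is not SpamBayes code: the
names locked() and try_lock() and the separate lock-file path are
hypothetical, and it assumes a Unix system where fcntl.flock is
available. It takes an advisory lock on a side file rather than on the
database itself, and it does nothing about the harder problem of
asking a long-running process to let go.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def locked(lock_path, exclusive=True):
    """Hold an advisory flock on lock_path for the duration of the block.

    Hypothetical helper: lock_path is a side file next to the database,
    not the database file itself.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until the lock is granted.
        fcntl.flock(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def try_lock(lock_path):
    """Return an fd holding the exclusive lock, or None if it is taken.

    The caller releases the lock by closing the returned fd.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "db.lock")
    with locked(path):
        # A brief open/update/close of the database would go here.
        print("second locker got it:", try_lock(path) is not None)
```

A short-lived filter could wrap its open/classify/close in locked(),
while a long-running process would have to take and drop the lock
around each individual update instead of holding the database open,
which is the rearchitecting part. The locks are advisory, so this only
helps if every program touching the database cooperates.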

> Not that I'm aware of.  Feel free to implement something and toss it back
> over the wall. ;-)

Understood.

Thanks,
Jim

