[Spambayes] Outlook weirdness

Rob Hooft rob@hooft.net
Wed Nov 20 20:05:11 2002


Moore, Paul wrote:
> From: Sean True [mailto:seant@iname.com]
> 
>>>Would it be worth trying the DBM format for the database? I think
>>>this would give faster startup/shutdown times, and lower memory
>>>consumption, at the expense of on-disk database size and slower
>>>filtering (although I doubt that this difference would be an issue).
>>>
>>
>>Slower *training* would be an issue, however.
> 
> 
> I can't imagine the training getting much slower than it is at the
> moment for me :-( The pickle isn't being dumped to disk when I hit
> "Delete as spam", but the operation is taking over a second. No
> idea why...

Isn't that update_spamprob? It recomputes all ~300k spam probabilities
after each training step, even though scoring a single message uses
only a few of them. The current Bayes is optimized for training on
hundreds of messages at a time and then scoring hundreds. For "training
one, scoring one" it would be more efficient to delay calculating the
spam probs until they are needed.
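
A minimal sketch of that "calculate on demand" idea, assuming a toy
word-count store. The class name LazyProbs, its methods, and the plain
ratio formula are made up for illustration; this is not the Spambayes
code or its actual probability formula.

```python
class LazyProbs:
    """Toy word-count store that computes spam probabilities on demand.

    Hypothetical illustration only: the names and the simple ratio
    formula are assumptions, not the real Spambayes classifier.
    """

    def __init__(self):
        self.nspam = 0      # number of spam messages trained on
        self.nham = 0       # number of ham messages trained on
        self.counts = {}    # word -> [spam count, ham count]
        self.cache = {}     # word -> probability for the current totals

    def learn(self, words, is_spam):
        """Train on one message: touch only the words it contains."""
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        idx = 0 if is_spam else 1
        for word in set(words):
            self.counts.setdefault(word, [0, 0])[idx] += 1
        # The message totals changed, so every cached ratio is stale.
        # The cache holds only words scored since the last training,
        # so clearing it is cheap compared with recomputing everything.
        self.cache.clear()

    def spamprob(self, word):
        """Return the probability for one word, computing it if needed."""
        if word in self.cache:
            return self.cache[word]
        spam_count, ham_count = self.counts.get(word, (0, 0))
        spam_ratio = spam_count / max(self.nspam, 1)
        ham_ratio = ham_count / max(self.nham, 1)
        total = spam_ratio + ham_ratio
        # Words never seen in training get a neutral 0.5.
        prob = spam_ratio / total if total else 0.5
        self.cache[word] = prob
        return prob


probs = LazyProbs()
probs.learn(["viagra", "free"], is_spam=True)
probs.learn(["meeting", "free"], is_spam=False)
# Only the words of the message being scored are ever computed.
scores = [probs.spamprob(w) for w in ["viagra", "free", "meeting"]]
```

With this layout, training on one message costs time proportional to
the words in that message, not to the size of the whole database.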

Rob


-- 
Rob W.W. Hooft  ||  rob@hooft.net  ||  http://www.hooft.net/people/rob/



