[Spambayes] Outlook weirdness
Rob Hooft
rob@hooft.net
Wed Nov 20 20:05:11 2002
Moore, Paul wrote:
> From: Sean True [mailto:seant@iname.com]
>
>>>Would it be worth trying the DBM format for the database? I think
>>>this would give faster startup/shutdown times, and lower memory
>>>consumption, at the expense of on-disk database size and slower
>>>filtering (although I doubt that this difference would be an issue).
>>>
>>
>>Slower *training* would be an issue, however.
>
>
> I can't imagine the training getting much slower than it is at the
> moment for me :-( The pickle isn't being dumped to disk when I hit
> "Delete as spam", but the operation is taking over a second. No
> idea why...
Isn't that update_spamprob? It recomputes all ~300k spam probabilities,
of which you are going to use only a few each time. The current Bayes is
optimized for training on hundreds of messages at a time, and then
scoring hundreds. For "training one, scoring one" it would be more
efficient to delay the calculation of the spam probs until they are needed.
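
A minimal sketch of that idea, not the actual Spambayes code: the class
LazyBayes, its method names, and the smoothing constants (0.5 and 0.45)
are all assumptions made for illustration. Training only touches the
counts for the words in the message and throws away any cached
probabilities; a probability is computed the first time a word is
actually looked up.

```python
class LazyBayes:
    """Hypothetical classifier that computes word spamprobs on demand."""

    def __init__(self, unknown_prob=0.5, strength=0.45):
        # Smoothing constants are assumed values, not taken from the thread.
        self.unknown_prob = unknown_prob
        self.strength = strength
        self.nspam = 0          # number of spam messages trained
        self.nham = 0           # number of ham messages trained
        self.counts = {}        # word -> [spamcount, hamcount]
        self._cache = {}        # word -> probability computed since last training

    def learn(self, words, is_spam):
        """Train on one message: update counts only, no probability pass."""
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1
        index = 0 if is_spam else 1
        for word in set(words):
            self.counts.setdefault(word, [0, 0])[index] += 1
        # nspam/nham changed, so every cached probability is stale.
        self._cache.clear()

    def spamprob(self, word):
        """Return the spam probability of one word, computing it if needed."""
        if word in self._cache:
            return self._cache[word]
        record = self.counts.get(word)
        if record is None:
            return self.unknown_prob
        spamcount, hamcount = record
        spamratio = spamcount / self.nspam if self.nspam else 0.0
        hamratio = hamcount / self.nham if self.nham else 0.0
        total = spamratio + hamratio
        raw = spamratio / total if total else self.unknown_prob
        # Pull rarely seen words toward unknown_prob.
        n = spamcount + hamcount
        prob = (self.strength * self.unknown_prob + n * raw) / (self.strength + n)
        self._cache[word] = prob
        return prob


clf = LazyBayes()
clf.learn(["viagra", "free"], is_spam=True)
clf.learn(["meeting", "free"], is_spam=False)
print(clf.spamprob("viagra") > clf.spamprob("meeting"))
```

With this shape, "Delete as spam" costs time proportional to the words in
one message rather than to the size of the whole database; the price is a
small amount of arithmetic on each lookup when scoring.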
Rob
--
Rob W.W. Hooft || rob@hooft.net || http://www.hooft.net/people/rob/