[spambayes-dev] Stats are very slow

Kenny Pitt kenny.pitt at gmail.com
Thu Dec 28 16:45:30 CET 2006


On 12/20/06, Kenny Pitt <kenny.pitt at gmail.com> wrote:
> On 12/20/06, skip at pobox.com <skip at pobox.com> wrote:
> >
> >    Kenny> Of course, this only solves part of the problem because we would
> >    Kenny> still take a huge hit when displaying the statistics. It might be
> >    Kenny> worth considering an optimization to store the actual statistics
> >    Kenny> values instead of calculating them at the start of every Outlook
> >    Kenny> session.
> >
> > That occurred to me after my reply.  I suspect it's probably the way to go.
>
> I checked in an initial update to delay the calculation of the
> persistent stats until the GetStats() call because that was the easy
> update. In the case where you never actually view the stats in
> SpamBayes Manager, this should help. Let me know if you see any
> oddities in the stats calculation after this.
>
> The complete fix is a little more involved, so I'll have to defer that
> until I have more time to test it thoroughly.

I just checked in an update to add permanent caching of the
statistics. With an existing message info db that doesn't yet contain
the cached statistics, you'll see the old startup delay once while the
missing statistics are recalculated. After that, the statistics should
be reloaded directly from the cache record in the message info db, and
startup will be much faster.
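
For anyone following along, here is a rough sketch of the
load-or-recompute behaviour described above. It is not the code that
was checked in: a plain dict stands in for the message info db, and
STATS_KEY, recalculate_stats() and load_stats() are hypothetical names
chosen only for illustration.

```python
# Hypothetical sketch: a plain dict stands in for the message info db.
STATS_KEY = "__cached_stats__"  # assumed name for the cache record


def recalculate_stats(db):
    """Slow path: scan every message record and count classifications."""
    totals = {"ham": 0, "spam": 0, "unsure": 0}
    for key, classification in db.items():
        if key != STATS_KEY:
            totals[classification] += 1
    return totals


def load_stats(db):
    """Return the statistics, recalculating only if no cache record exists."""
    cached = db.get(STATS_KEY)
    if cached is None:
        # One-time cost for a db created before caching was added.
        cached = recalculate_stats(db)
        db[STATS_KEY] = cached
    return dict(cached)
```

The first call against an old db pays for the full scan and writes the
cache record; later calls only read that one record.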

There is a minimal performance hit on each message classification
because I have to update the statistics in the db every time to keep
them in sync. I think this will be pretty much unnoticeable in the
grand scheme of things, but let me know if you find otherwise.
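
The per-message cost mentioned above might look roughly like the
following. Again, this is only a sketch under assumptions: the dict
stands in for the message info db, record_classification() and
STATS_KEY are hypothetical names, and the cache record is assumed to
exist already (or the db to be empty) when this runs.

```python
STATS_KEY = "__cached_stats__"  # assumed name for the cache record


def record_classification(db, msg_id, classification):
    """Store one message's result and update the cached totals to match."""
    # Assumes the cache record already exists or the db has no messages yet.
    stats = dict(db.get(STATS_KEY) or {"ham": 0, "spam": 0, "unsure": 0})
    previous = db.get(msg_id)
    if previous is not None:
        # Reclassified message: remove its old contribution first.
        stats[previous] -= 1
    stats[classification] += 1
    db[msg_id] = classification
    db[STATS_KEY] = stats  # the extra write on every classification
```

The extra work is one read and one write of a small record per
message, which is why the overhead should stay small compared with
the classification itself.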

-- 
Kenny Pitt

