[Spambayes] Using mxBeeBase as hammie DB

Tim Peters tim.one@comcast.net
Thu Oct 17 22:03:30 2002


[Tim]
>> Pruning the database, and especially over time, is something that
>> needs work here.

[M.-A. Lemburg]
> Is there some way to do this automagically ?

No; that's part of what "needs work here" means.  In addition, some fields
in the WordInfo records probably aren't needed, or at best are too big (like
saving an 8-byte double for a timestamp).  It's also unknown how pruning
will affect accuracy over time, esp. since training is done on a
batch-of-words-per-msg basis, but unless the tokenstream for each msg is
saved, expiring words from the database will yield a state that doesn't
match any real-life combination of training msgs.
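
To make both points concrete, here is a rough Python sketch.  It is a toy,
not the hammie code: ToyDB, its fields, the two sample msgs, and the cutoff
are all made up for illustration, and the real WordInfo record carries more
than is shown here.  It compares the size of a double timestamp against a
4-byte integer one, then trains on two msgs, prunes by last-access time, and
checks whether any subset of the original msgs reproduces the pruned state.

```python
import struct
from itertools import combinations

# A timestamp stored as a C double versus a 4-byte unsigned int (whole seconds).
DOUBLE_BYTES = struct.calcsize("<d")
UINT_BYTES = struct.calcsize("<I")


class ToyDB:
    """Hypothetical stand-in for the word database; spam counts only."""

    def __init__(self):
        self.nspam = 0
        self.words = {}  # token -> [spamcount, last-access time]

    def train_spam(self, tokens, now):
        # Training is per msg: each distinct token in the msg is counted once.
        self.nspam += 1
        for tok in set(tokens):
            rec = self.words.setdefault(tok, [0, now])
            rec[0] += 1
            rec[1] = now

    def prune(self, cutoff):
        # Expire every word not touched since the cutoff time.
        stale = [t for t, (_, atime) in self.words.items() if atime < cutoff]
        for t in stale:
            del self.words[t]

    def state(self):
        return self.nspam, {t: rec[0] for t, rec in self.words.items()}


# Two made-up training msgs, each with a made-up training time.
msgs = [(["cheap", "pills"], 100), (["pills", "free"], 200)]

db = ToyDB()
for tokens, when in msgs:
    db.train_spam(tokens, when)
db.prune(cutoff=150)  # "cheap" was last seen at time 100, so it expires
pruned = db.state()

# Try every subset of the real msgs; collect those that reproduce the state.
matches = []
for r in range(len(msgs) + 1):
    for subset in combinations(msgs, r):
        fresh = ToyDB()
        for tokens, when in subset:
            fresh.train_spam(tokens, when)
        if fresh.state() == pruned:
            matches.append(subset)

print(DOUBLE_BYTES, UINT_BYTES, pruned, matches)
```

In this toy the pruned database still says two spam were trained, but the
word counts no longer correspond to any subset of those msgs, so matches
comes back empty.  Whether that kind of mismatch hurts accuracy in practice
is exactly the open question.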

Feel free to solve all that in your spare time <wink>.