[spambayes-dev] RE: [Spambayes] How low can you go?

T. Alexander Popiel popiel at wolfskeep.com
Mon Dec 22 16:09:18 EST 2003


In message:  <MHEGIFHMACFNNIMMBACAAEICGPAA.sethg at GoodmanAssociates.com>
             "Seth Goodman" <sethg at GoodmanAssociates.com> writes:
>
>I would like to investigate whole message expiration with different training
>and expiration schemes.

Ah, in that case, definitely look at the incremental framework that
I built.  I have various training regimes that do train-on-everything
vs. mistake-only, as well as one which expires stuff based on time.
Making more regimes to do various other things should be very easy.

>From our previous discussion, it seems that the most flexible way to
>approach this is by going to a system with the several bidirectional
>maps implemented in the databases:  feature_id <-> token, msg_id (+
>training timestamp) <-> feature_id  and token database w/training
>timestamp per entry.  Instead of training timestamp, expiration time
>might be preferable.

Definite overkill.  Most of this won't be needed for any given
regime, and will instead just bloat the transient data requirements
during testing.  Just make each regime keep track of the data it
needs to do whatever it wants to do.
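
A rough sketch of what per-regime bookkeeping could look like, in Python.
Every name here (ToyClassifier, observe, ExpireByAgeRegime,
MistakeOnlyRegime) is made up for illustration and is not the incremental
framework's actual interface. The point is only that the expiring regime
carries a queue of what it trained on, while the mistake-only regime
carries nothing extra.

```python
from collections import Counter, deque


class ToyClassifier:
    """Hypothetical stand-in for a real classifier: per-class token counts."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}

    def learn(self, tokens, is_spam):
        self.counts[is_spam].update(tokens)

    def unlearn(self, tokens, is_spam):
        counter = self.counts[is_spam]
        counter.subtract(tokens)
        # Drop tokens whose count fell to zero so the table does not grow.
        for token in set(tokens):
            if counter[token] <= 0:
                del counter[token]


class MistakeOnlyRegime:
    """Trains only on misclassified messages; keeps no extra state."""

    def __init__(self, classifier):
        self.classifier = classifier

    def observe(self, now, tokens, is_spam, guessed_spam):
        if guessed_spam != is_spam:
            self.classifier.learn(tokens, is_spam)


class ExpireByAgeRegime:
    """Trains on everything and untrains messages older than max_age.

    The only bookkeeping is a queue of (timestamp, tokens, is_spam),
    oldest first, owned by this regime alone.
    """

    def __init__(self, classifier, max_age):
        self.classifier = classifier
        self.max_age = max_age
        self.trained = deque()

    def observe(self, now, tokens, is_spam, guessed_spam):
        self.classifier.learn(tokens, is_spam)
        self.trained.append((now, tuple(tokens), is_spam))
        self._expire(now)

    def _expire(self, now):
        while self.trained and now - self.trained[0][0] > self.max_age:
            _, tokens, is_spam = self.trained.popleft()
            self.classifier.unlearn(tokens, is_spam)
```

Both regimes answer the same observe() call, so a test driver can swap
one for the other without knowing what state, if any, sits behind it.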

- Alex
