[Spambayes] Upgrade problem

Tim Peters tim.one@comcast.net
Fri Nov 8 05:06:43 2002

[Richie Hindle]
> A quick note in case someone decides to remove the counts from the
> database:

Neil Schemenauer already does, in his CDB code (neil*.py).  It's a lean
scoring-only database, mapping tokens to *just* spamprobs.  If he went on to
store them as scaled ints, he could almost certainly reduce this to 2 bytes
of prob info per token, and possibly even just 1.
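
To make the scaled-int idea concrete, here is a minimal sketch of squeezing a spamprob into 2 bytes or 1 byte. It is not Neil's CDB code. The names pack_prob and unpack_prob are hypothetical, and the linear scaling and big-endian layout are assumptions chosen only for illustration.

```python
import struct

# Hypothetical layout: byte width -> (struct format, largest unsigned value).
_FORMATS = {1: (">B", 0xFF), 2: (">H", 0xFFFF)}

def pack_prob(prob, nbytes=2):
    """Quantize a probability in [0.0, 1.0] to an nbytes-wide unsigned int."""
    if nbytes not in _FORMATS:
        raise ValueError("nbytes must be 1 or 2")
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0.0, 1.0]")
    fmt, scale = _FORMATS[nbytes]
    # Linear scaling: 0.0 maps to 0 and 1.0 maps to the largest int.
    return struct.pack(fmt, int(round(prob * scale)))

def unpack_prob(data):
    """Recover the approximate probability from 1 or 2 packed bytes."""
    if len(data) not in _FORMATS:
        raise ValueError("expected 1 or 2 bytes")
    fmt, scale = _FORMATS[len(data)]
    return struct.unpack(fmt, data)[0] / scale
```

With this scheme the worst-case rounding error is about 0.5/65535 (roughly 7.6e-6) at 2 bytes and 0.5/255 (roughly 0.002) at 1 byte. Whether the 1-byte error is tolerable for scoring is something you would have to test.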

> the HTML front end has a "Word query" feature which will tell you the
> information in the database for a given word - it's interesting to see
> how many more times the word 'Viagra' appears in ham than in spam.  I
> mean the other way round.

What a geek <wink>.
