[Spambayes] How to Display tokenized ham/spam scores?

Meyer, Tony T.A.Meyer at massey.ac.nz
Wed Aug 20 16:43:42 EDT 2003


> I used the following prob formula:
> prob = spamratio / (hamratio + spamratio)
>   where: (spamratio = spamcount / nspam;   hamratio = hamcount / nham)

The rest of it (this is the Bayesian bit, from what I understand) is:
        prob = (StimesX + n * prob) / (S + n)
  where:
        n is hamcount + spamcount (ignoring the possible imbalance
adjustment)
        S is the strength given to the assumed probability of a word
that the classifier hasn't come across before (default 0.45)
        StimesX is S times X, where X is the (estimated) spam
probability of a word that the classifier hasn't seen before
(default 0.5)
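
As a rough sketch, the two steps above (your ratio formula, then the
adjustment) could be written like this. The function and argument
names are made up for illustration and are not taken from the
SpamBayes source; the imbalance adjustment is left out, and the guards
against empty training sets are my own assumption.

```python
def word_spam_prob(spamcount, hamcount, nspam, nham, s=0.45, x=0.5):
    """Adjusted spam probability for one token (illustrative sketch).

    spamcount/hamcount: messages of each kind containing the token.
    nspam/nham: total trained messages of each kind.
    s: strength of the unknown-word assumption (default 0.45).
    x: assumed probability for an unseen word (default 0.5).
    """
    n = hamcount + spamcount
    if n == 0:
        # Never-seen token: the formula reduces to x.
        return x

    # Assumed guard: avoid dividing by zero with an empty training set.
    spamratio = spamcount / max(nspam, 1)
    hamratio = hamcount / max(nham, 1)
    prob = spamratio / (hamratio + spamratio)

    # Pull the raw estimate toward x; the pull weakens as n grows.
    return (s * x + n * prob) / (s + n)


if __name__ == "__main__":
    # Token seen in 3 of 10 spam and 1 of 10 ham messages.
    print(word_spam_prob(3, 1, 10, 10))
```

With few observations the result stays close to 0.5; with many it
approaches the raw spamratio / (hamratio + spamratio) value.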

Once you have all the probabilities, they are then combined into a
message score using either the chi squared function or the 'gary
combining' function.
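
For the chi squared case, the combining step can be sketched roughly
as below. This is a minimal Fisher-style illustration written from
memory, not a copy of the SpamBayes code: the names chi2q and
chi2_combine are hypothetical, and it assumes every probability is
strictly between 0 and 1 (which the adjustment above ensures for
finite counts).

```python
import math


def chi2q(x2, v):
    """Probability that a chi-squared variable with v degrees of
    freedom exceeds x2. Only valid for even v."""
    assert v % 2 == 0
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Rounding can push the sum slightly above 1.
    return min(total, 1.0)


def chi2_combine(probs):
    """Combine per-token spam probabilities into one message score."""
    n = len(probs)
    if n == 0:
        # No evidence either way.
        return 0.5

    # Sum logs rather than multiplying, to avoid underflow.
    ln_spam_side = sum(math.log(1.0 - p) for p in probs)
    ln_ham_side = sum(math.log(p) for p in probs)

    spam_evidence = 1.0 - chi2q(-2.0 * ln_spam_side, 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * ln_ham_side, 2 * n)

    # Near 1 means spam, near 0 means ham, near 0.5 means unsure.
    return (spam_evidence - ham_evidence + 1.0) / 2.0


if __name__ == "__main__":
    print(chi2_combine([0.99, 0.95, 0.9, 0.8]))
    print(chi2_combine([0.01, 0.05, 0.1, 0.2]))
```

A message whose tokens pull strongly in both directions ends up near
0.5, which is what lets the score express "unsure".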

=Tony Meyer


