[Spambayes] How to Display tokenized ham/spam scores?
Meyer, Tony
T.A.Meyer at massey.ac.nz
Wed Aug 20 16:43:42 EDT 2003
> I used the following prob formula:
> prob = spamratio / (hamratio + spamratio)
> where: (spamratio = spamcount / nspam; hamratio = hamcount / nham)
The rest of it (this is the Bayesian bit, from what I understand) is:
prob = (StimesX + n * prob) / (S + n)
where:
  n is hamcount + spamcount (excluding the possible imbalance
    adjustment)
  S is the strength given to a word that the classifier hasn't come
    across before (default 0.45)
  StimesX is S times X, where X is the (estimated) spam probability
    of a word that the classifier hasn't seen before (default 0.5)
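
To make the two steps concrete, here is a minimal sketch in Python that puts your formula and the adjustment together. The function name and argument names are just for illustration and are not taken from the Spambayes source; the guards against zero counts are my own assumption about how to handle edge cases.

```python
def word_spam_probability(spamcount, hamcount, nspam, nham, s=0.45, x=0.5):
    """Adjusted spam probability for one token (illustrative sketch only).

    s is the unknown-word strength and x the unknown-word probability;
    the defaults are the values quoted in this message.
    """
    n = hamcount + spamcount
    if n == 0:
        # Assumption: a token never seen in training just gets the prior x.
        return x

    # Assumption: treat an empty training set as size 1 to avoid dividing by zero.
    spamratio = spamcount / (nspam or 1)
    hamratio = hamcount / (nham or 1)

    # Raw estimate from the quoted formula.
    prob = spamratio / (hamratio + spamratio)

    # Adjustment: pull the raw estimate towards x, less so as n grows.
    return (s * x + n * prob) / (s + n)


if __name__ == "__main__":
    # A token seen in 5 of 10 spam messages and in no ham messages.
    print(word_spam_probability(5, 0, 10, 10))
```

With few sightings the result stays close to 0.5; as n grows the raw ratio dominates.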
Once you have all the probabilities, they are then combined into a
message score using either the chi squared function or the 'gary
combining' function.
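
As a rough illustration of the chi-squared option, here is a minimal sketch in Python of Fisher-style chi-squared combining as I understand it. The names chi2q and chi2_combine are made up for this example and may not match the Spambayes source, and the sketch assumes every probability is strictly between 0 and 1. It does not cover the 'gary combining' alternative.

```python
import math


def chi2q(x2, v):
    """Probability that a chi-squared variable with v degrees of freedom
    is at least x2. Only valid for even v."""
    assert v % 2 == 0
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Rounding can push the sum slightly above 1.0.
    return min(total, 1.0)


def chi2_combine(probs):
    """Combine per-token spam probabilities into one score in [0, 1]."""
    n = len(probs)
    if n == 0:
        # Assumption: with no evidence at all, return the neutral score.
        return 0.5

    # Sum logs rather than multiplying, so long messages don't underflow.
    ln_one_minus_p = sum(math.log(1.0 - p) for p in probs)
    ln_p = sum(math.log(p) for p in probs)

    s = 1.0 - chi2q(-2.0 * ln_one_minus_p, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(-2.0 * ln_p, 2 * n)            # evidence for ham

    # Strong evidence both ways cancels out and lands near 0.5.
    return (s - h + 1.0) / 2.0


if __name__ == "__main__":
    print(chi2_combine([0.99, 0.95, 0.9, 0.2]))
```

A score near 1 means spam, near 0 means ham, and near 0.5 means either no evidence or conflicting evidence.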
=Tony Meyer