[Spambayes] ham/spam show 'n tell

Atom 'Smasher' atom at suspicious.org
Wed Dec 10 00:08:11 EST 2003


this is an interesting (to me) observation... at the moment my database
is made up of 524 hams and 522 spams, of which the 5 hammiest spams
score:
	0.310988054593
	0.477222956608
	0.69525821016
	0.778912509175
	0.882964949455

and the 5 spammiest hams score:
	0.00282491682801
	0.00295767979157
	0.00548257011708
	0.00566510201445
	0.00933374699939

right now my spam-cutoff is 0.8, and looking at these numbers even that
seems conservative.
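
a minimal sketch of the comparison behind that remark, using only the ten scores listed above. the helper name summarize and its output keys are made up for illustration and are not part of spambayes; it just counts how many of the listed spams score below a given cutoff and measures the gap between the spammiest ham and the hammiest spam.

```python
# Scores copied from the lists above (only the 5 extremes on each side,
# not the full 524-ham / 522-spam database).
hammiest_spams = [
    0.310988054593,
    0.477222956608,
    0.69525821016,
    0.778912509175,
    0.882964949455,
]
spammiest_hams = [
    0.00282491682801,
    0.00295767979157,
    0.00548257011708,
    0.00566510201445,
    0.00933374699939,
]


def summarize(ham_scores, spam_scores, spam_cutoff):
    """Hypothetical helper: compare score extremes against a spam cutoff."""
    highest_ham = max(ham_scores)   # the spammiest ham
    lowest_spam = min(spam_scores)  # the hammiest spam
    below = [s for s in spam_scores if s < spam_cutoff]
    return {
        "highest_ham": highest_ham,
        "lowest_spam": lowest_spam,
        "gap": lowest_spam - highest_ham,
        "spams_below_cutoff": len(below),
    }


summary = summarize(spammiest_hams, hammiest_spams, spam_cutoff=0.8)
for name, value in summary.items():
    print(name, value)
```

with the cutoff at 0.8, four of the five listed spams score below it, while the spammiest ham sits under 0.01, leaving a gap of roughly 0.30 between the two groups. that gap is the room available for lowering the cutoff, at least for this one database.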

so, what do these numbers look like with databases made from differently
sized pools of ham & spam? how about with a database made of 34 emails...
skip?  this might give some quantifiable clues about how big a database is
"big enough".


        ...atom

 _______________________________________________
 PGP key - http://smasher.suspicious.org/pgp.txt
 3EBE 2810 30AE 601D 54B2 4A90 9C28 0BBF 3D7D 41E3
 -------------------------------------------------

	"The thing that bugs me is that the people think the
	 FDA (Food and Drug Administration) is protecting them.
	 It isn't. What the FDA is doing and what the public
	 thinks it's doing are as different as night and day."
		-- Dr Ley, former Commissioner of the FDA



