[Spambayes] How low can you go?

Gerrit Holl gerrit at nl.linux.org
Mon Dec 15 13:32:57 EST 2003


> > So, how small is yours? <wink>

I started with a minimal database, reclassifying my unsure folder each
time. I found that with fewer than 15 ham + 16 spam, it doesn't work
well enough. Note that the spam I receive is very monotonous, because
my ISP replaces viruses with text messages, and since that's almost all
the spam I receive, 4 spams are enough to catch all spam with a
probability of more than 90%. However, with 15 hams, some ham scores
above 20%. And because I don't want to unbalance the database, I trained
on already-correctly-classified spams as well as on the most unsure
messages.

With 15 ham, 16 spam, 1.5% of the incoming e-mail is classified as
unsure. All of it is ham, with scores ranging from .108 to .290, so a
ham_cutoff of 30% would solve it all.
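
To illustrate the cutoff reasoning above, here is a minimal sketch of a
three-way ham/unsure/spam decision. It is not the SpamBayes API:
classify() is a hypothetical helper, the 0.20 and 0.90 defaults are
assumed from the percentages mentioned in this message, and the
handling of scores exactly on a cutoff is also an assumption.

```python
def classify(score, ham_cutoff=0.20, spam_cutoff=0.90):
    """Label a spam probability as 'ham', 'unsure', or 'spam'.

    Hypothetical helper for illustration only. The default cutoffs are
    assumptions taken from the figures quoted in the message.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    return "unsure"


# The lowest and highest scores reported for the unsure ham messages.
reported_scores = [0.108, 0.290]

# Compare the assumed 20% cutoff with the proposed 30% cutoff.
for cutoff in (0.20, 0.30):
    labels = [classify(s, ham_cutoff=cutoff) for s in reported_scores]
    print(cutoff, labels)
```

With ham_cutoff at 0.30, both reported scores fall below the cutoff and
are labelled ham, which is the effect described above. Raising the
cutoff also widens the range in which a real spam could be labelled
ham, so it only makes sense while spam keeps scoring far above it.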

Gerrit.

-- 
110. If a "sister of a god" open a tavern, or enter a tavern to drink,
then shall this woman be burned to death.
          -- 1780 BC, Hammurabi, Code of Law
-- 
Asperger's Syndrome - a personal approach:
	http://people.nl.linux.org/~gerrit/
Join the resistance against this cabinet:
	http://www.sp.nl/


