[Spambayes] How many is enough?

Anthony Baxter anthony at interlink.com.au
Mon May 12 13:03:50 EDT 2003



> The more you train, the better, in general.  However:
>   * If you have many more ham than spam (or vice versa), this can hurt
>     accuracy. (If you enable the experimental_ham_spam_imbalance option,
>     this shouldn't matter as much, although that option hasn't been
>     tested as thoroughly as it could be.)
>   * 50 spam is fairly low.  It wouldn't be that surprising to get some
>     incorrect results with that few, but it should still do a reasonable
>     job.
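
As a rough illustration of the two caveats quoted above, here is a minimal
Python sketch that flags a small or lopsided training set. It is not part of
Spambayes: the function name training_warnings and both thresholds
(min_per_class, max_ratio) are hypothetical values picked for the example,
not numbers from the project.

```python
def training_warnings(nham, nspam, min_per_class=100, max_ratio=4.0):
    """Return human-readable warnings about a training set.

    nham / nspam are the counts of trained ham and spam messages.
    min_per_class and max_ratio are arbitrary illustrative thresholds,
    not values taken from Spambayes.
    """
    warnings = []

    # Caveat 1: very few messages in either class gives shaky results.
    if min(nham, nspam) < min_per_class:
        warnings.append("few messages in at least one class")

    # Caveat 2: many more of one class than the other can skew scores.
    smaller = max(min(nham, nspam), 1)  # avoid dividing by zero
    if max(nham, nspam) / smaller > max_ratio:
        warnings.append("ham/spam counts are badly imbalanced")

    return warnings


if __name__ == "__main__":
    # Example: plenty of ham, only 50 spam.
    for message in training_warnings(nham=1000, nspam=50):
        print(message)
```

Both checks only produce warnings; as the quoted text says, a small training
set should still do a reasonable job.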

Early experiments (see the mailing list archive) found that when you
go from thousands to multiple tens of thousands of messages, accuracy
gets slightly worse. I'm going completely from memory here, so check
the archive for the details.

Anthony


