[Spambayes] Effectiveness

Tim Peters tim.one at comcast.net
Thu Oct 2 21:24:19 EDT 2003


[Jackie Lan Manlosa]
> I know that SpamBayes uses Bayesian filtering...

Not in the sense that most people mean it, but it does have a Bayesian
component.

> Now, I am asking: would it still be effective if I am, for example,
> receiving 4 million emails a day?

I don't know.  What if you received 4 million emails a day and *didn't* use
SpamBayes?  Would you be able to handle that load?  That's a lot of email.

> Would it not tire my CPU? As far as I know, a Bayesian
> algorithm is a CPU-intensive algorithm.

We spend more time tokenizing the message, and doing I/O, than doing the
spam-score calculation.  4 million emails a day is probably more than 4
gigabytes of data a day, and any algorithm will consume non-trivial
resources.  SpamBayes is on the efficient end of the scale for spam filters.
Of course it would be cheaper to just store the incoming bytes without
looking at them at all.
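
To put rough numbers on that load, here is a back-of-the-envelope sketch in
Python.  The 1,000-byte average message size is only an assumption, chosen to
match the "4 million emails is about 4 gigabytes" estimate above; real
averages are probably larger.  The function name daily_load is made up for
this illustration and is not part of SpamBayes.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def daily_load(messages_per_day, avg_message_bytes):
    """Return (messages/sec, bytes/sec) for a load spread evenly over a day."""
    msgs_per_sec = messages_per_day / SECONDS_PER_DAY
    bytes_per_sec = msgs_per_sec * avg_message_bytes
    return msgs_per_sec, bytes_per_sec

# Assumed figures: 4 million messages a day, 1,000 bytes each on average.
msgs, nbytes = daily_load(4_000_000, 1_000)
print(f"{msgs:.1f} messages/sec, {nbytes / 1000:.1f} KB/sec")
# prints: 46.3 messages/sec, 46.3 KB/sec
```

That is the sustained average; mail rarely arrives evenly, so peak rates would
be higher, and every one of those messages has to be read and tokenized
before any score can be computed.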



