[Spambayes] Web interface statistics
David Abrahams
dave at boost-consulting.com
Thu May 10 15:55:31 CEST 2007
on Thu May 10 2007, skip-AT-pobox.com wrote:
> Dave> There really is something very fishy going on. I actually added
> Dave> instrumentation code to watch my training script train particular
> Dave> words multiple times as ham or spam, but when I query those words
> Dave> using the sb_imapfilter web interface, they always are shown as
> Dave> having been trained 0 or 1 times, with one of two corresponding
> Dave> probabilities.
>
> Dave> I do a wildcard query with a single letter and returning 1000
> Dave> results, and there's not a single number over 1 in the #spam or
> Dave> #ham columns.
>
> Dave> What could be going on?
>
> I've no idea. It seems to be working for me. I have lots of singletons(*),
> which is to be expected, but also lots of multiples:
OK, a couple of questions:
1. What kind of database are you using? Maybe this is something in
the DBM handling?
2. Have you tried my patchset yet? I'd like to know if it's somehow a
bug I introduced.
> (*) Linguists call such singletons "hapax legomena". I guess they were
> trying to be snooty when they came up with that term.
Oh, they weren't just _trying_ ;-)
--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
Don't Miss BoostCon 2007! ==> http://www.boostcon.com