[spambayes-dev] Enhanced Outlook statistics display

Thu Dec 9 05:01:31 CET 2004

> I'm still a little unsure (ok, pun intended, couldn't resist 
> <wink>) how to treat unsures in this. Currently I'm showing 
> the primary accuracy results based on the number of messages 
> that SpamBayes classified as either ham or spam, with a 
> separate percentage showing the additional messages that were 
> classified as unsure.

This works for me.  For filters like SpamBayes that do h/s/u rather than
just h/s I can't see any way to meaningfully compare results unless some
weighting is selected (which makes it reasonably arbitrary).  I think we just
have to have four stats.
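
For reference, the display described in the quoted text could be computed
roughly like this.  It is only a sketch: the function name and the five
counters are made up for illustration and are not taken from the Outlook
plug-in code.

```python
def summarize(correct_ham, correct_spam, fp, fn, unsure):
    """Return (accuracy %, unsure %) for one batch of classified messages.

    Accuracy is measured only over messages that were classified as
    ham or spam; unsures are reported as a separate percentage of all
    messages.  All names here are hypothetical.
    """
    decided = correct_ham + correct_spam + fp + fn
    total = decided + unsure
    accuracy = 100.0 * (correct_ham + correct_spam) / decided if decided else 0.0
    unsure_pct = 100.0 * unsure / total if total else 0.0
    return accuracy, unsure_pct
```

Keeping the unsure figure separate avoids having to pick a weighting for
unsures in the headline accuracy number.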

I've wondered about putting up the spam "cost" as calculated by the various
testtools scripts (by default 10*fp+fn+0.2*unsure).  It does give a single
figure for accuracy that takes into consideration how bad fp's are and works
with unsures - and you could use it with filters that don't have an unsure
category.  The weights are adjustable via options, although few people
would change them.
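
As a rough illustration of that formula (a sketch only; the function and
parameter names are mine, not the ones used by the testtools scripts):

```python
def spam_cost(fp, fn, unsure, fp_weight=10.0, fn_weight=1.0, unsure_weight=0.2):
    """Single-figure cost using the default weights quoted above:
    10*fp + fn + 0.2*unsure.  Lower is better.

    The keyword names are hypothetical; only the default weights
    come from the formula in this message.
    """
    return fp_weight * fp + fn_weight * fn + unsure_weight * unsure


# A filter without an unsure category simply passes unsure=0.
cost_hsu = spam_cost(fp=1, fn=5, unsure=10)
cost_hs = spam_cost(fp=2, fn=8, unsure=0)
```

Because the weights are arguments, the same function covers anyone who
thinks a false positive should cost more than ten false negatives.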

(As an aside: if I had the time for a little research project (which I won't
for at least another year), it would be interesting to examine how people do
actually weigh fp/fn/unsures (eg *10 is probably low for fp), and then there
would be a justifiable way to give a single number measure.  I'm sure a nice
little paper could be written on this.  If anyone else reading this is keen
on doing the research, let me know and maybe I do have time <wink>).

> Another option I considered was measuring the percentage of 
> messages removed from the inbox. It seems that ham and spam 
> are somewhat asymmetric with regards to unsures. I suspect 
> that most people are ok with spam being classified as unsure 
> as long as it isn't left in their inbox, but they would 
> prefer not to see a ham message removed from the inbox even 
> if it is only moved to the unsure bin.

I'm not brave enough to predict what people think, but I know that I'm fine
with ham going to unsures.  I don't really care what the mix is there, as
long as it's reasonably small (~2% is ok with me).  It does seem (from the
spambayes at python.org feedback) that unsure boxes do tend to be mostly spam,
though.

> Any thoughts/suggestions/preferences?

I'm fine with the stats that we have now (what I would like, and might get
to at some point, is to centralise the stats code somewhat so that we don't
have to keep updating both the web interface code and Outlook separately).

What do you think about the stats that are requested in the tracker?

Another thought I had was that we could fit a "Reset Statistics" button on
the Statistics panel (all it would have to do is delete the pickle and reset
the session stats).  People might want to collect (eg) monthly stats, or
stats after an initial training period, and that would make it easier for
them.  I hate mucking about with the dialogs - you want to do this? ;)
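
What the button's handler would need to do is roughly the following.  This
is a sketch under assumptions: the SessionStats class, its counter names and
the pickle path argument are placeholders, not the actual plug-in objects.

```python
import os


class SessionStats:
    """Hypothetical stand-in for the in-memory session statistics."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Counter names are illustrative only.
        self.counts = {"ham": 0, "spam": 0, "unsure": 0}


def reset_statistics(pickle_path, session):
    """Delete the persisted stats pickle and zero the session counters."""
    try:
        os.remove(pickle_path)
    except FileNotFoundError:
        # Nothing has been saved yet, so there is nothing to delete.
        pass
    session.reset()
```

Ignoring a missing pickle means the button is safe to press before any
stats have been saved.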

=Tony.Meyer