[spambayes-bugs] [ spambayes-Feature Requests-765924 ] Spam / ham statistics

SourceForge.net noreply at sourceforge.net
Thu Nov 20 09:21:23 EST 2003


Feature Requests item #765924, was opened at 2003-07-04 08:51
Message generated for change (Comment added) made by kpitt
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498106&aid=765924&group_id=61702

Category: Outlook
Group: None
Status: Open
Priority: 5
Submitted By: Magnus Aycox (mbip)
Assigned to: Mark Hammond (mhammond)
Summary: Spam / ham statistics

Initial Comment:
Possibility to get statistics on how many mails were 
received per hour / day and how many of these were 
spam messages.
It would be great if it could be presented both as 
numbers and graphically (impresses CEOs...). The 
means to print it as a hard copy would be just 
swell... ;o)


----------------------------------------------------------------------

Comment By: Kenny Pitt (kpitt)
Date: 2003-11-20 09:21

Message:
Logged In: YES 
user_id=859086

The latest CVS plugin actually does include false positives and 
negatives in the statistics.  The definitions are as follows.  If a 
message was classified as Ham and then reclassified by the 
user as Spam, it is a false negative.  If a message was 
classified as Spam and then reclassified by the user as Ham, 
it is a false positive.  A message that was originally classified 
as Unsure is never counted as a false positive or negative.
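
As a rough illustration of that rule (a sketch only, not the 
plug-in's actual code; the function name and the string labels 
are made up for this example):

```python
def classify_correction(original, corrected):
    """Return 'fp', 'fn', or None for a user reclassification.

    original  -- what the filter first decided: 'ham', 'spam' or 'unsure'
    corrected -- what the user reclassified the message as: 'ham' or 'spam'
    (Hypothetical labels; the plug-in's real representation may differ.)
    """
    if original == "ham" and corrected == "spam":
        return "fn"   # spam that was let through as ham
    if original == "spam" and corrected == "ham":
        return "fp"   # good mail that was caught as spam
    return None       # 'unsure' originals (or no change) never count
```

So classify_correction("unsure", "spam") returns None, while 
classify_correction("ham", "spam") returns "fn".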

----------------------------------------------------------------------

Comment By: Erik Sargent (esargent)
Date: 2003-11-20 04:34

Message:
Logged In: YES 
user_id=586922

Quick note on the "impossibility" of tracking false pos/neg. 

Since an incorrectly classified message already has a header 
inserted, you would only flag a "false" if that header existed 
and was changed. This means you'd have to check for the header 
before processing the Delete/Recover buttons, but it can be done.
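
Something along these lines, perhaps. This is only a sketch: the 
header name is an assumption for the example (the plug-in may 
well store the classification some other way), and 
is_false_classification is a made-up helper, not existing code.

```python
import email

# Assumed header name for this sketch only.
HEADER = "X-Spambayes-Classification"

def is_false_classification(raw_message, user_says):
    """True if the header exists and the user's choice changes it.

    raw_message -- full message text, headers included
    user_says   -- 'spam' (Delete As Spam) or 'ham' (Recover)
    """
    msg = email.message_from_string(raw_message)
    previous = msg.get(HEADER)
    if previous is None:
        return False  # never classified, so nothing was contradicted
    # Note: an 'unsure' value also counts as "changed" here; filter
    # it out separately if those should not be flagged.
    return previous.strip().lower() != user_says
```

The check would run before the button handler retrains the 
message, while the old value is still there to compare against.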

----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2003-09-30 00:19

Message:
Logged In: YES 
user_id=552329

Note that the plug-in has basic stats information (cvs 
version) now, although it's still only on a per session basis 
(this will no doubt improve at some point).

The web interface (for non-plugin users) also now (cvs head) 
has basic stats, which are persisted between sessions.

Any opinions on which statistics would be best to add?

----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2003-07-11 00:25

Message:
Logged In: YES 
user_id=552329

It's not exactly what you have asked for, but as a start, are 
you aware that in the logs, each time you shut Outlook down 
it prints a message like:
"SpamBayes processed 555 messages, finding 34 spam and 11 
unsure"

(So you could shut Outlook down each hour/day, to generate 
this message).  It's unlikely that a graphical version would 
ever be made, but it would be easy enough to throw numbers 
like this into Excel and get pretty graphs.
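
For example, a small script could pull those lines out of the log 
and write CSV that Excel will open. This is a sketch: it assumes 
the message appears on one line in exactly the form quoted above, 
and that ham is simply whatever is left after spam and unsure. 
The names SUMMARY and log_to_csv are invented for the example.

```python
import csv
import io
import re

# Matches the shutdown summary line as quoted above (assumed format).
SUMMARY = re.compile(
    r"SpamBayes processed (\d+) messages, finding (\d+) spam and (\d+) unsure"
)

def log_to_csv(log_text):
    """Return CSV text with one row per summary line found in log_text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["processed", "ham", "spam", "unsure"])
    for match in SUMMARY.finditer(log_text):
        total, spam, unsure = (int(g) for g in match.groups())
        # Assumption: everything not counted as spam or unsure is ham.
        writer.writerow([total, total - spam - unsure, spam, unsure])
    return out.getvalue()
```

Feeding it the sample line above gives a header row followed by 
"555,510,34,11".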

The number of false positives/negatives is more difficult 
because SpamBayes doesn't really have any way to know 
that a mail is a fp/fn.  It could print the number of times 
the "delete as spam" and "recover from spam" buttons are 
used, I guess, but this would include all unsure mail, which 
isn't exactly a fp/fn.

----------------------------------------------------------------------

Comment By: Mark Jeays (dze27)
Date: 2003-07-11 00:16

Message:
Logged In: YES 
user_id=302748

I'm just another user but I think this would be a great
addition. I'd also be interested in: number of false
positives (along with percentage of total), number of false
negatives (along with percentage of total) and percentage of
mail received that is spam.

----------------------------------------------------------------------
