[Spambayes] Run filter and only return a report???

Gregory Gulik greg at gulik.org
Thu Feb 10 22:55:20 CET 2005



Tony Meyer wrote:
> What do you mean by detailed report?  Would this be all the tokens in the
> message, or all the tokens that would be used in scoring?  For each token,
> would you want just the token, or also the ham/spam counts or the scores, or
> all of these?  Would they have to be in a particular format?

I'd probably only be interested in the tokens that were used in scoring 
and the output just needs to be in an easily parseable format.

> What do you mean by without any other processing?  sb_filter (when
> filtering) doesn't really do much apart from generate the tokens & scores.

Right, just give me a score, don't make any changes to the database or 
attempt to deliver the message.

> Depending on the answers to those, you might be able to get sb_filter to do
> what you want.  If you can't, then it would be trivial to create a small
> script that did what you wanted (I could throw one together for you if you
> like).

Thanks.  I need to add Python to the list of programming languages I know.

Basically, a friend whose company uses SpamBayes with the Outlook 
plug-in sent me a report he saw.  Here is a summary:

Combined Score: 100% (0.999998)
Internal ham score (*H*): 4.79832e-006
Internal spam score (*S*): 1

# ham trained on: 89
# spam trained on: 1733
28 Significant Tokens

token                               spamprob         #ham  #spam
'x-mailer:microsoft office outlook, build 11.0.6353' 0.168914    2      7
'url:org'                           0.254701           13     86
'url:rec-html40'                    0.277582            3     22
'skip:r 10'                         0.284156           28    216
'skip:p 10'                         0.321735           31    286
'url:tr'                            0.372452            4     46
'url:www'                           0.384768           63    767
'virus:src="cid:'                   0.72041             3    151
'from:addr:level3.net'              0.844828            0      1
'subject:\xe4'                      0.844828            0      1
	.
	.

That's basically the kind of report I would like to see.
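
To illustrate what "easily parseable" could mean for rows laid out like 
the ones above, here is a minimal Python sketch.  It is not part of 
SpamBayes: the names ROW, parse_clue_row and to_tsv are hypothetical, 
and it assumes each row is a single-quoted token followed by the 
spamprob, #ham and #spam columns separated by whitespace.

```python
import ast
import re

# Assumed row layout (not a SpamBayes-defined format):
#   'quoted token'   spamprob   #ham   #spam
ROW = re.compile(
    r"^\s*(?P<token>'(?:[^'\\]|\\.)*')"   # single-quoted token, escapes allowed
    r"\s+(?P<prob>[0-9.eE+-]+)"           # spam probability
    r"\s+(?P<ham>\d+)"                    # number of ham messages with the token
    r"\s+(?P<spam>\d+)\s*$"               # number of spam messages with the token
)


def parse_clue_row(line):
    """Return (token, spamprob, ham_count, spam_count), or None if no match."""
    match = ROW.match(line)
    if match is None:
        return None
    # The token is printed as a quoted string literal, so decode it safely.
    token = ast.literal_eval(match.group("token"))
    return (token,
            float(match.group("prob")),
            int(match.group("ham")),
            int(match.group("spam")))


def to_tsv(row):
    """Format a parsed row as one tab-separated line."""
    token, prob, ham, spam = row
    return "\t".join([token, repr(prob), str(ham), str(spam)])


if __name__ == "__main__":
    sample = "'url:org'                           0.254701           13     86"
    print(to_tsv(parse_clue_row(sample)))
```

Header lines and the "." continuation lines do not match the pattern, 
so parse_clue_row returns None for them and they can simply be skipped.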

-- 
Greg Gulik                                 http://www.gulik.org/greg/
greg @ gulik.org


