[Spambayes] X-Hammie-Disposition split suggestion

Tim Peters tim.one@comcast.net
Thu Oct 31 02:33:37 2002


[Skip Montanaro]
> The X-Hammie-Disposition header contains multiple bits of
> information.  I'm not sure what the *H* and *S* chunks are for
> (overall hammieness?),

chi-combining computes two scores internally, one for ham-ness (H) and the
other for spam-ness (S).  That's what *H* and *S* tell you.  The final score
is (S-H+1)/2.
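
A minimal sketch of that last step, for anyone who wants to check the
arithmetic.  The function name is made up for illustration and is not taken
from the spambayes source; it assumes H and S have already been computed and
both lie in 0.0 .. 1.0.

```python
def final_score(h, s):
    """Combine the ham-ness score h and spam-ness score s into one score.

    Hypothetical helper illustrating the (S-H+1)/2 formula; h and s are
    assumed to be floats in 0.0 .. 1.0.
    """
    return (s - h + 1.0) / 2.0


# The values from the example header: '*H*': 0.00, '*S*': 1.00
print(final_score(0.0, 1.0))   # strongly spam
print(final_score(1.0, 0.0))   # strongly ham
print(final_score(0.5, 0.5))   # the two indicators cancel out
```

When H and S agree that the evidence points both ways, the result lands in
the middle of the range instead of at either end.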

> but I think it would be worthwhile to put the individual word
> probabilities in a separate header.

Or drop them altogether.  Geeks may find this stuff morbidly interesting,
and spambayes developers need to see this stuff when a msg gets a surprising
score, but I doubt anyone else has any earthly use for it.  It's also a bit
like giving away pieces of your private key in a public-key cryptosystem:
"well, Mister Spammer, you can't guess what's spam and ham to me without
breaking into my database, but here are the 150 best & worst guesses you
made, along with exactly how good they were".

> That way, I could tell my mailer to display the much smaller
> X-Hammie-Disposition header and suppress display of the (for
> example) X-Hammie-Word-Probabilities header by default, e.g.:
>
>     X-Hammie-Disposition: Yes; 1.00; '*H*': 0.00; '*S*': 1.00

I suggest dropping the *H* and *S* here too.  In the Outlook client, we've
also switched to feeding the end user int(round(score * 100.0)), i.e. an
integer in 0 .. 100 inclusive.  There's really no need to bother pretty
users' heads with the mysteries of floating point <wink>.
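
Spelled out as a sketch (the function name is invented here and is not the
Outlook client's actual code; score is assumed to be a float in 0.0 .. 1.0):

```python
def to_percent(score):
    """Map a float score in 0.0 .. 1.0 to an integer in 0 .. 100 inclusive.

    Hypothetical wrapper around the int(round(score * 100.0)) expression.
    """
    return int(round(score * 100.0))


for score in (0.0, 0.074, 0.996, 1.0):
    print(score, "->", to_percent(score))
```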

>     X-Hammie-Word-Probabilities:'rbl':0.07; 'script':0.07; 'to:2**1':0.09;
>         'osirusoft':0.10; 'url:org':0.15; 'subject:; ':0.15; 'cgi':0.20;
>         'sorry':0.22; 'mailing':0.23; 'list:':0.24; 'skip:" 10':0.27;
>         'skip:r 20':0.28; 'subject:SPAM':0.30; 'called':0.31; 'body':0.33;
>         'rcvd_in_dsbl':0.34; 'open':0.35; 'being':0.35; 'version':0.36;
>         'from:':0.36; 'skip:u 10':0.37; ...
>
> If something in the X-Hammie-Disposition header jumps out at you, you can
> display all the message's headers.
>
> Make sense?  If so, I'll be happy to modify hammie.py.
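
For concreteness, here is a rough sketch of the split being proposed.  It is
not hammie.py's code: the function name and the (word, probability) list
format are assumptions, and X-Hammie-Word-Probabilities is only the example
header name from the suggestion above.

```python
def split_headers(is_spam, score, clues):
    """Build the two proposed header values as plain strings.

    Hypothetical helper: `clues` is assumed to be a list of
    (word, probability) pairs, already sorted however the caller likes.
    """
    disposition = f"{'Yes' if is_spam else 'No'}; {score:.2f}"
    word_probs = "; ".join(f"{word!r}:{prob:.2f}" for word, prob in clues)
    return {
        "X-Hammie-Disposition": disposition,
        "X-Hammie-Word-Probabilities": word_probs,
    }


headers = split_headers(True, 1.0, [("rbl", 0.07), ("script", 0.07)])
for name, value in headers.items():
    print(f"{name}: {value}")
```

A mailer can then show the short header and hide the long one by default.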

I'm not a hammie user, but I know my sisters.  That leaves me more neutral
than I may sound, as one of my sisters doubtless has no idea "headers"
exist.  She pays to download them, though!