[Spambayes] How to tell SpamBayes to check more headers (using MSOplugin)

Sun Nov 23 21:07:39 EST 2003

Tony Meyer wrote:

> However, if these headers appear for both ham and spam, and it's just the
> contents of the headers that varies, then this isn't going to do you much
> good.  In that case, you'd have to add code to tokenizer.py (in the
> tokenize_headers function) to add specific tokens for SpamAssassin and
> SpamCop. (If you do, please submit a patch).

I'm still playing with it, but here's what I have so far. It seems to work
quite nicely - after a full retrain, I'm seeing a lot of spamassassin: and
spamcop: lines near the spammy end of evidence listings.

*** WARNING *** PYTHON NEWBIE *** WARNING ***

spamassassin_re = re.compile(r'tests=([A-Z0-9,_]+)')
...
        # X-Spam-Status:
        # Added by SpamAssassin (http://www.spamassassin.org)
        line = msg.get('x-spam-status')
        if line is not None:
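            # Strip all whitespace so a rule list folded across
            # continuation lines becomes one comma-separated string.
            line = ''.join(line.split())
            for rules in spamassassin_re.findall(line):
                # One token per SpamAssassin rule name.
                for rule in rules.split(','):
                    yield 'spamassassin:' + rule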

        # X-SpamCop-Disposition:
        # Added by SpamCop Mail service (http://www.spamcop.net)
        line = msg.get('x-spamcop-disposition')
        if line is not None:
            for token in line.lower().split():
                yield 'spamcop:' + token
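
For anyone who wants to try the fragment above outside of tokenizer.py, here
is a rough standalone sketch. The function name antispam_header_tokens and
the sample headers are made up for illustration (they are not part of
SpamBayes, and the header values are not copied from real mail); the logic
is meant to be the same as in the fragment.

```python
import re
from email import message_from_string

# Same pattern as in the fragment: captures the comma-separated rule list.
spamassassin_re = re.compile(r'tests=([A-Z0-9,_]+)')

def antispam_header_tokens(msg):
    """Hypothetical helper: yields the tokens the fragment would add."""
    # X-Spam-Status: one token per SpamAssassin rule that fired.
    line = msg.get('x-spam-status')
    if line is not None:
        # Remove whitespace so a folded rule list is rejoined.
        line = ''.join(line.split())
        for rules in spamassassin_re.findall(line):
            for rule in rules.split(','):
                yield 'spamassassin:' + rule

    # X-SpamCop-Disposition: one lowercased token per word.
    line = msg.get('x-spamcop-disposition')
    if line is not None:
        for token in line.lower().split():
            yield 'spamcop:' + token

# Illustrative message only; the header values are assumptions.
sample = (
    "X-Spam-Status: Yes, hits=7.2 required=5.0\n"
    "\ttests=BIG_FONT,HTML_MESSAGE,\n"
    "\tMIME_HTML_ONLY version=2.60\n"
    "X-SpamCop-Disposition: Blocked bl.spamcop.net\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

tokens = list(antispam_header_tokens(message_from_string(sample)))
for tok in tokens:
    print(tok)
```

Stripping the whitespace first matters because the tests= list is often
folded over several header lines; without it the regex would stop at the
first line break and miss the remaining rules.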

-- Mat.