[Spambayes] There Can Be Only One
Anthony Baxter
anthony@interlink.com.au
Fri, 27 Sep 2002 00:05:56 +1000
>>> Greg Ward wrote
> Anyways, that little exploration has me wondering just how valid my data
> is. I should probably rerun everything without looking at "Received"
> headers at all (except to count them -- for the most part, they stop at
> either mail.python.org or starship.python.net, which are the front-line
> servers for these two collections).
The data's fine; you'll just need to be careful about which headers you
look at. It's distressing _how_ good this stuff is at spotting patterns.
Delivery-date is another header to watch out for if the ham and spam come
from different places or times. It's not clear to me what puts that
header in - it might be an MH thing.
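As a rough illustration of "being careful about the headers", here is a
minimal sketch that drops the suspect headers from a message before it goes
anywhere near a classifier, keeping only the count of Received lines as Greg
suggests. It uses the standard library email package, not the Spambayes
tokenizer; the function name strip_leaky_headers and the SKIP_HEADERS tuple
are made up for this example.

```python
from email import message_from_string

# Headers assumed (for illustration only) to leak clues about which
# collection a message came from rather than about its content.
SKIP_HEADERS = ("Received", "Delivery-date")


def strip_leaky_headers(raw_text, skip=SKIP_HEADERS):
    """Return (message, received_count) with the named headers removed."""
    msg = message_from_string(raw_text)
    # Keep the number of Received lines, but not their contents.
    received_count = len(msg.get_all("Received", []))
    for name in skip:
        # Deletes every occurrence of the header; no error if it is absent.
        del msg[name]
    return msg, received_count


# Tiny made-up message to show the effect.
raw = (
    "Received: from a by mail.python.org\n"
    "Received: from b by a\n"
    "Delivery-date: Fri, 27 Sep 2002 00:05:56 +1000\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
msg, n = strip_leaky_headers(raw)
```

Whether to strip a header outright or just stop generating tokens from it is
a judgment call; stripping is simply the easiest thing to show in a few lines.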
I tend to do multiple runs by having multiple ini files: say, a
common.ini with the options that stay constant, then test1.ini, test2.ini
or whatever with the options that vary. I can then put
  BAYESCUSTOMIZE="common.ini test1.ini" python2.3 timcv.py ...... > test1.txt
  BAYESCUSTOMIZE="common.ini test2.ini" python2.3 timcv.py ...... > test2.txt
  BAYESCUSTOMIZE="common.ini test3.ini" python2.3 timcv.py ...... > test3.txt
in a shell script, run it, and go get lunch (or coffee, or sleep, or
whatever).
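If the list of varying ini files gets long, the same batch could be driven
from a small Python script instead of hand-written shell lines. This is only
a sketch under a few assumptions: build_runs and run_all are hypothetical
helper names, the driver itself needs a recent Python 3 (subprocess.run),
and timcv_args stands for whatever arguments you normally pass to timcv.py,
which are left elided above and are not guessed here.

```python
import os
import subprocess
import sys


def build_runs(common, variants):
    """Pair each varying ini file with its BAYESCUSTOMIZE value and output file."""
    runs = []
    for ini in variants:
        # test1.ini -> test1.txt, mirroring the redirects in the shell lines.
        stem = os.path.splitext(os.path.basename(ini))[0]
        runs.append(("%s %s" % (common, ini), stem + ".txt"))
    return runs


def run_all(common, variants, timcv_args, python=sys.executable):
    """Run timcv.py once per varying ini file, saving stdout to <stem>.txt.

    `python` is the interpreter used to run timcv.py; it defaults to the one
    running this driver, which is an assumption and may need overriding.
    """
    for customize, outfile in build_runs(common, variants):
        # Same effect as prefixing the command with BAYESCUSTOMIZE="..." in sh.
        env = dict(os.environ, BAYESCUSTOMIZE=customize)
        with open(outfile, "w") as out:
            subprocess.run([python, "timcv.py"] + list(timcv_args),
                           env=env, stdout=out, check=True)
```

Calling run_all("common.ini", ["test1.ini", "test2.ini", "test3.ini"],
timcv_args) from the directory containing timcv.py would then reproduce the
three runs above, stopping at the first one that exits with an error.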
(Aside: I tend to name the varying ini files more like x01.ini, x02.ini,
mindisc15.ini, which makes life more sane...)
Anthony