[Spambayes] spam bayes db not changing

skip at pobox.com skip at pobox.com
Thu Aug 3 16:34:19 CEST 2006


    Dhaval> I have been keeping an eye on the db file and notice that after
    Dhaval> a training, the timestamp is updated but the filesize is the
    Dhaval> same. Is this normal?

Yes.  The database file isn't a simple text file.  It contains lots of
"holes" where new tokens can be tucked in, and for an existing token all
that usually happens is that its count increases.  Either way the bytes are
rewritten in place, so the timestamp changes but the file size often doesn't.
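
To make the idea concrete, here is a toy sketch in Python.  It is not the
real on-disk format SpamBayes uses; the fixed 32-byte slots, the 24-byte
token limit and the names create(), train() and lookup() are all assumptions
made up for illustration.  It only shows how a file with preallocated holes
can absorb new tokens and count updates without growing:

```python
import os
import struct
import tempfile

# Toy layout (an assumption, not the real database format):
# each slot holds a NUL-padded token of up to 24 bytes plus two counts.
SLOT = struct.Struct("<24sII")   # token, ham count, spam count
NUM_SLOTS = 8                    # preallocated "holes"


def create(path):
    """Write a file made entirely of empty slots."""
    with open(path, "wb") as f:
        f.write(b"\0" * (SLOT.size * NUM_SLOTS))


def _find(f, key):
    """Return (slot, ham, spam) for key, or the first empty slot with zero counts."""
    free = None
    for i in range(NUM_SLOTS):
        f.seek(i * SLOT.size)
        raw, ham, spam = SLOT.unpack(f.read(SLOT.size))
        name = raw.rstrip(b"\0")
        if name == key:
            return i, ham, spam
        if not name and free is None:
            free = i
    if free is None:
        raise RuntimeError("no free slot left; a real database would grow here")
    return free, 0, 0


def train(path, token, is_spam):
    """Bump the ham or spam count for token, overwriting its slot in place."""
    key = token.encode("utf-8")
    with open(path, "r+b") as f:
        i, ham, spam = _find(f, key)
        if is_spam:
            spam += 1
        else:
            ham += 1
        f.seek(i * SLOT.size)
        f.write(SLOT.pack(key, ham, spam))


def lookup(path, token):
    """Return (ham, spam) for token; (0, 0) if it has never been trained."""
    with open(path, "rb") as f:
        _, ham, spam = _find(f, token.encode("utf-8"))
    return ham, spam


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "toy.db")
    create(path)
    before = os.path.getsize(path)
    train(path, "viagra", True)    # new token fills a hole
    train(path, "viagra", True)    # existing token: only the count changes
    print(before, os.path.getsize(path), lookup(path, "viagra"))
```

The size only has to change once the holes run out, which this toy simply
reports as an error.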

    Dhaval> Does anybody have any advice on what to look for?

Try running sb_dbexpimp.py before and after training, then compare the
output:

    sb_dbexpimp.py -e -f bayes1.csv
    sb_mboxtrain.py ...
    sb_dbexpimp.py -e -f bayes2.csv
    diff -u bayes1.csv bayes2.csv | more
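
If the diff is too noisy to read, the two exports can also be compared with
a short script.  This is only a sketch: it assumes each export starts with a
line holding the total ham and spam message counts and that every following
line is token,hamcount,spamcount, so check that against your own files
first.  The function names are made up for the example.

```python
import csv


def load_export(path):
    """Read an export; return ((nham, nspam), {token: (ham, spam)}).

    Assumed layout: first row is nham,nspam; each later row is
    token,hamcount,spamcount.
    """
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        rows = [row for row in csv.reader(f) if row]
    totals = (int(rows[0][0]), int(rows[0][1]))
    counts = {}
    for row in rows[1:]:
        if len(row) < 3:
            continue  # skip anything that does not look like a token row
        counts[row[0]] = (int(row[1]), int(row[2]))
    return totals, counts


def changed_tokens(before, after):
    """Return {token: (old, new)} for tokens added, removed or recounted.

    old or new is None when the token is missing from that export.
    """
    changes = {}
    for token in before.keys() | after.keys():
        old, new = before.get(token), after.get(token)
        if old != new:
            changes[token] = (old, new)
    return changes


if __name__ == "__main__":
    totals1, counts1 = load_export("bayes1.csv")
    totals2, counts2 = load_export("bayes2.csv")
    print("messages trained:", totals1, "->", totals2)
    for token, (old, new) in sorted(changes.items()) if (changes := changed_tokens(counts1, counts2)) else []:
        print(token, old, "->", new)
```

If training worked, the message totals should go up and at least a few
tokens should show new counts, even when the database file is still the
same size.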

Skip

