[spambayes-dev] A bit confused about ZODB changes

Sun Apr 23 13:29:55 CEST 2006

I finally cvs up'd last night, installed ZODB and tried things out.  My
tte.py script (in contrib) is still selecting BerkDB instead of ZODB.
Looking at things, I see that it uses storage.database_type to determine the
database type and name.  My storage options are

    [Storage]
    persistent_storage_file: ~/hammie.db

I run my tte.py script like so:

    .../tte.py -d ~/hammie.db ...

so storage.database_type is called like so:

    storage.database_type([('-d', '/Users/skip/hammie.db')],
                          default_type="ZODB", default_name="~hammie.db")

The _storage_options dictionary still says that -d means "dbm".  Shouldn't
it say "zodb", since that's the new default?  (After making that change
locally, it now dumps a ZODB database.)  Alternatively, should I even be
using storage.database_type?  I need to use the -d flag because I write the
database into a different spot then mv it into place so as to avoid problems
with simultaneous reads and writes during database generation.  If I'm using
ZODB do I need to mv more than just one file into place?  I see that the
process generated .index, .lock and .tmp files as well.
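
A minimal sketch of the move step, assuming the .index file should travel
with the main database file and that the .lock and .tmp files are transient
and can be left behind.  Neither assumption has been checked against the
ZODB documentation, and move_with_companions is a made-up helper name, not
part of SpamBayes or ZODB.

```python
import os
import shutil


def move_with_companions(src, dst, suffixes=(".index",)):
    """Move src to dst, plus any src+suffix files that exist.

    Hypothetical helper: the default suffix list is an assumption,
    not something taken from the ZODB documentation.
    """
    moved = []
    # Move companion files first so the main database file lands last.
    for suffix in suffixes:
        companion = src + suffix
        if os.path.exists(companion):
            shutil.move(companion, dst + suffix)
            moved.append(dst + suffix)
    shutil.move(src, dst)
    moved.append(dst)
    return moved
```

Called as move_with_companions("/tmp/new.db", os.path.expanduser("~/hammie.db")),
it returns the list of destination paths it actually wrote.  Note that each
file is moved separately, so the set as a whole is not replaced atomically.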

Finally, I don't understand how I'm supposed to get the spam and ham counts
from a ZODB database.  My spamcounts.py script (see contrib dir) was making
assumptions about the structure of the database, assuming it could directly
access the keys of a dbm or dict (pickle).  Any thoughts about how to clean
that up?  I think I should be calling db.spamprob(word), but I still don't
know how to get the raw spam/ham counts that script wants to print.
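
One possible shape for a storage-independent lookup, as a sketch only.  It
assumes the classifier object offers a _wordinfoget(word) method returning
either None or a record with spamcount and hamcount attributes; that should
be verified against the current classifier code before relying on it.  The
raw_counts helper and the Fake* stand-ins are invented for illustration so
the example runs without SpamBayes installed.

```python
def raw_counts(classifier, word):
    """Return (spamcount, hamcount) for word, or (0, 0) if unseen.

    Assumes classifier._wordinfoget(word) returns None or an object
    with spamcount and hamcount attributes.
    """
    info = classifier._wordinfoget(word)
    if info is None:
        return (0, 0)
    return (info.spamcount, info.hamcount)


class FakeWordInfo:
    """Stand-in for a per-word record (hypothetical)."""

    def __init__(self, spamcount, hamcount):
        self.spamcount = spamcount
        self.hamcount = hamcount


class FakeClassifier:
    """Stand-in classifier backed by a plain dict (hypothetical)."""

    def __init__(self, table):
        self._table = table

    def _wordinfoget(self, word):
        return self._table.get(word)
```

Because the lookup goes through a method on the classifier rather than
through the keys of the underlying store, the same call would work whether
the database is a dbm file, a pickle or ZODB, provided the assumed method
exists.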

Thx,

Skip