[spambayes-dev] Re: Pickle vs DB inconsistencies

Greg Ward gward at python.net
Mon Jul 7 22:59:33 EDT 2003


On 07 July 2003, Meyer, Tony said:
> Which scripts don't use the "-[d|D] filename" format?  They should all
> work with setting [Storage]persistent_storage_file in a config file, as
> well.

Well, they all seem to use -d/-D, but differently.  E.g.,

  hammie.py -d -p spambayes.db

means open a DBM store (what does "-p" have to do with "name of database
file?" -- I know, I know, hysterical raisons, but it's still annoying),
whereas

  hammie.py -D -p spambayes.pkl

opens a pickle.  But with dbExpImp.py, -d and -D are reversed in
meaning, and there's no -p option:

  dbExpImp -D spambayes.db
  dbExpImp -d spambayes.pkl

A third variation:

  hammiefilter.py -d spambayes.db
  hammiefilter.py -D spambayes.pkl

This is all intensely annoying.  One clear, obvious, sensible way to do
it:

  <script> -d spambayes.db    # load a DBM store
  <script> -p spambayes.pkl   # load a pickle store
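
To make that concrete, here is a minimal sketch of the convention using
Optik/optparse.  The function names, the script name in the usage note,
and the "can't give both" check are my assumptions for illustration, not
code that exists in spambayes today.

```python
from optparse import OptionParser


def build_parser(prog):
    """Build the storage options shared by every (hypothetical) script."""
    parser = OptionParser(prog=prog, usage="%prog [options]")
    parser.add_option("-d", dest="dbm_file", metavar="FILE",
                      help="load a DBM store from FILE")
    parser.add_option("-p", dest="pickle_file", metavar="FILE",
                      help="load a pickle store from FILE")
    return parser


def storage_from_args(prog, argv):
    """Return (kind, filename), or (None, None) if no store was named."""
    parser = build_parser(prog)
    options, _args = parser.parse_args(argv)
    # Assumption: naming both kinds of store at once is a usage error.
    if options.dbm_file and options.pickle_file:
        parser.error("-d and -p are mutually exclusive")
    if options.dbm_file:
        return ("dbm", options.dbm_file)
    if options.pickle_file:
        return ("pickle", options.pickle_file)
    return (None, None)
```

Every script would then call something like
storage_from_args("sb-filter", sys.argv[1:]) and get the same options and
the same help output for free.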

Also, the help for those three scripts (the only ones I've looked at for
this rant) is formatted inconsistently.  You might not like Optik's
help formatting (I'm not always sure I do), but at least it's the same
for every script!  And thanks to David Goodger, you can override it.

The script names are also terrible.  hammie.py might have made a cute
sort of sense nine months ago when this was Tim's play project, but now
it's sitting in /usr/local/bin and what the *hell* does "hammie" mean?
Nothing!  Likewise, dbExpImp could apply to any application that has a
database.

> > IMHO there should be one script for each 
> > of the following tasks:
> [...]
> >   * export a database
> >   * import a database
> 
> Any particular reason that these should be separated into two scripts?
> (Rather than having one script to "convert" a database?)

Because they're two separate tasks.  Merging them into one command means
you need some way to say whether you want to export or import -- that's
a modal option, and I'm very leery of modal options, because generally
they're required options, and "required option" is an oxymoron if I ever
heard one.  See http://www.gerg.ca/software/optik/tao.html for more
details.

(Note that I said "one command", not "one script".  From a Unix
perspective, it would make perfect sense to have one script, and make
sb-export and sb-import both symlinks to it.  See gzip/gunzip for the
canonical example.  [Hmmm, actually they're hardlinked on my system, but
whatever.]  Dunno how Windows people deal with that sort of thing.)
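
A rough sketch of the one-script, two-commands idea, dispatching on the
name the script was invoked under.  The export_db/import_db bodies are
placeholders, and stripping a ".py" extension (so that plain copies named
sb-export.py and sb-import.py would also work where symlinks are awkward)
is my assumption, not something I've tried.

```python
import os


def export_db(args):
    # Placeholder: a real command would dump the store to a flat file.
    return "export %s" % " ".join(args)


def import_db(args):
    # Placeholder: a real command would load a flat file into the store.
    return "import %s" % " ".join(args)


# Map each command name (symlink or copy) to the function that runs it.
COMMANDS = {"sb-export": export_db, "sb-import": import_db}


def dispatch(argv):
    """Pick the command from argv[0], the way gzip/gunzip do."""
    name = os.path.splitext(os.path.basename(argv[0]))[0]
    try:
        command = COMMANDS[name]
    except KeyError:
        raise SystemExit("%s: must be invoked as one of: %s"
                         % (name, ", ".join(sorted(COMMANDS))))
    return command(argv[1:])
```

The script itself would end with a call to dispatch(sys.argv).  There is
no modal option anywhere: the command name says what you want.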

> They do all live in one directory (apart from the testing tools) - the
> root, and they are all installed into the scripts directory by setup.py,
> if you run that.  What would be the advantage of moving *.py in the root
> directory into a scripts directory?

Hmmm, I guess I got confused by hammie.py and spambayes/hammie.py.  One
file with a content-free name is bad enough -- why repeat it?!  ;-( Come
to think of it, hammiesrv.py, hammiefilter.py, and hammiecli.py are
almost as bad.

> Don't they all use getopt?  If they used optparse, then the required
> Python version would lift to 2.3, wouldn't it?  (Or users would be
> required to separately install Optik).

Or spambayes could include its own optparse.py (the preferred name
for forwards compatibility with Python 2.3: setup.py should just neglect
to install optparse.py if run by Python 2.3).  Docutils and SCons
already do just that.
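
Something along these lines in setup.py would do it.  This is only a
sketch: BASE_MODULES and modules_to_install are made-up names, and the
module list is a stand-in for whatever setup.py really installs.

```python
import sys

# Hypothetical stand-in for the modules setup.py already installs.
BASE_MODULES = ["example_module"]


def modules_to_install(version_info=sys.version_info):
    """Add the bundled optparse.py only when the stdlib lacks one."""
    modules = list(BASE_MODULES)
    # optparse joined the standard library in Python 2.3.
    if tuple(version_info[:2]) < (2, 3):
        modules.append("optparse")
    return modules

# In setup.py one would then pass py_modules=modules_to_install()
# to distutils' setup().
```

The scripts just say "import optparse" and never need to know which copy
they got.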

Please take this as constructive criticism rather than just bitching.
One of these days I'm just going to sit down and write a bunch of sb-*
scripts, so I can put the brain cells currently occupied with trying to
remember what -D means for which script to some better use.  And I would
very much like to check them into spambayes' CVS, unless someone beats
me to it.

        Greg
-- 
Greg Ward <gward at python.net>                         http://www.gerg.ca/
Yield to temptation; it may not pass your way again.


