[spambayes-dev] Code changes

Tim Peters tim.one at comcast.net
Sat Dec 27 22:07:56 EST 2003


The meaning of Outlook2000/export.py's -n option has changed.  Here's the
checkin comment:

    INCOMPATIBLE CHANGE:  the -n option now gives the number of Set
    subdirectories desired, instead of a number of msgs per Set subdir
    "to shoot for".  If you want to run, e.g., 10-fold cross-validation,
    you have to have exactly 10 Set folders, and the # of msgs per folder
    is of much less importance.  Also added a note recommending to run
    rebal.py afterwards.  rebal is the expert in setting up randomized
    Set subdirectories, and the export.py script probably should have
    stuck to just extracting msgs from Outlook.
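
To make the new -n meaning concrete, here's a minimal sketch of dealing
msgs into a fixed number of Set groups.  This is not the code in export.py
or rebal.py; the name split_into_sets, the seed parameter, and the use of
plain integers as message ids are all assumptions made for illustration.

```python
import random


def split_into_sets(msg_ids, nsets, seed=None):
    """Deal msg_ids into exactly nsets groups of near-equal size.

    Hypothetical helper: it only illustrates "fix the number of Sets,
    let the per-Set count fall where it may".
    """
    if nsets < 1:
        raise ValueError("nsets must be at least 1")
    ids = list(msg_ids)
    # Shuffle first so each Set gets a random sample of the msgs.
    random.Random(seed).shuffle(ids)
    # Deal round-robin: group sizes then differ by at most one.
    return [ids[i::nsets] for i in range(nsets)]


# 103 msgs, 10-fold cross-validation: always exactly 10 groups.
sets = split_into_sets(range(103), 10, seed=0)
sizes = [len(s) for s in sets]
```

The point is that the number of groups is the input, and the number of
msgs per group is whatever comes out of the division.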

utilities/rebal.py has grown a -t option, which makes it (once again) easy
to use with a standard test setup.  It was originally easy to use that way,
but then grew -r and -s options, presumably added by someone with a
non-standard test setup.  Unfortunately, people with a standard setup had to
use them too, and for that case they're both clumsy and error-prone.  -t
can't be used in the same run as -r or -s.  With a standard test setup you
now need only -t and can forget about -r and -s; with a non-standard setup,
keep using -r and -s and ignore -t.
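
The "-t or else -r/-s, never both" rule can be sketched with the stdlib
getopt module.  This is only an illustration, not rebal.py's actual option
handling: the function name parse_rebal_style_args is made up, and it
assumes that each of the three options takes a single argument.

```python
import getopt


def parse_rebal_style_args(argv):
    """Return (mode, values) for a -t / -r / -s style command line.

    Hypothetical sketch: assumes -t, -r and -s each take one argument.
    """
    opts, _ = getopt.getopt(argv, "t:r:s:")
    given = dict(opts)
    # -t is the "standard setup" shortcut; mixing it with the
    # explicit options is rejected outright.
    if "-t" in given and ("-r" in given or "-s" in given):
        raise SystemExit("-t can't be used together with -r or -s")
    if "-t" in given:
        return "standard", {"-t": given["-t"]}
    return "non-standard", {"-r": given.get("-r"), "-s": given.get("-s")}
```

Rejecting the mix up front means a standard-setup user never has to
think about what -r or -s would do if combined with -t.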

The changes to testtools/sort+group.py discussed here have been checked in,
after fiddling to play nice with Python 2.2.3 too.



