[spambayes-dev] RE: [Spambayes] spambayes-1.0a6 bug:
sb_mboxtrain.py fails to mark mail data as X-Spambayes-Trained
Alan W. Irwin
airwin at users.sourceforge.net
Mon Oct 13 11:45:30 EDT 2003
On 2003-10-13 17:22+1300 Tony Meyer wrote:
> > I have chosen a two-message mbox folder called libtool as an
> > example, but I get the same result with larger folders as well.
> [...]
> > and no extra mail header line referring to X-Spambayes-Trained
>
> I believe if you add
> """
> if is_spam:
>     spamtxt = options["Headers", "header_spam_string"]
> else:
>     spamtxt = options["Headers", "header_ham_string"]
> msg.add_header(options["Headers", "trained_header_name"],
>                spamtxt)
> """
> At line 160 of sb_mboxtrain.py, this will have the desired effect.
>
> Spambayes-dev people - is this a bug? The maildir_train() function adds
> this header, but the mbox_train() function doesn't, although it looks like
> it is meant to.
>
> =Tony Meyer
>
Actually, that fix won't work in the 1.0a6 version of the code since
the add_header part is done by msg_train which is called by _both_
maildir_train and mbox_train. However, in researching this further I found
the actual source of the problem, and here is a simple patch to fix it.
--- sb_mboxtrain.py_original Fri Oct 10 19:55:06 2003
+++ sb_mboxtrain.py Mon Oct 13 08:18:48 2003
@@ -157,11 +157,11 @@
             sys.stdout.flush()
         if msg_train(h, msg, is_spam, force):
             trained += 1
-        if not options["Headers", "include_trained"]:
+        if options["Headers", "include_trained"]:
             # Write it out with the Unix "From " line
             outf.write(msg.as_string(True))

-    if not options["Headers", "include_trained"]:
+    if options["Headers", "include_trained"]:
         outf.seek(0)
         try:
             os.ftruncate(f.fileno(), 0)
The problem is that the original mbox_train code tests the include_trained
flag with the wrong sense: both tests are negated when they should not be.
(The equivalent code in maildir_train does a "continue", i.e. skips the
write part of the loop, which is the correct sense of the include_trained
flag, so no changes are needed in that case.)
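To illustrate the sense of the flag, here is a minimal, self-contained sketch in modern Python 3. It is not the SpamBayes code: toy_msg_train and train_pass are hypothetical stand-ins for msg_train and the mbox_train loop, and a list of raw message strings stands in for the mbox file. Only the header name is taken from this thread.

```python
from email import message_from_string

TRAINED_HEADER = "X-Spambayes-Trained"  # header name quoted in this thread


def toy_msg_train(msg, label):
    """Hypothetical stand-in for msg_train: tag the message, report if new."""
    if msg.get(TRAINED_HEADER) == label:
        return False             # already trained with this label
    del msg[TRAINED_HEADER]      # drop any stale label (no-op if absent)
    msg[TRAINED_HEADER] = label
    return True


def train_pass(stored, include_trained, label="ham"):
    """Run one training pass over `stored`, a list of raw message strings.

    Returns (trained_count, stored_after_pass).
    """
    trained = 0
    rewritten = []
    for raw in stored:
        msg = message_from_string(raw)
        if toy_msg_train(msg, label):
            trained += 1
        if include_trained:      # corrected sense: write back only when set
            rewritten.append(msg.as_string())
    # The stored copy is replaced only when the flag asks for it.
    return trained, (rewritten if include_trained else stored)


folder = ["Subject: one\n\nbody one\n", "Subject: two\n\nbody two\n"]
first, folder = train_pass(folder, include_trained=True)
second, folder = train_pass(folder, include_trained=True)
print(first, second)  # 2 0
```

With the flag set, the tagged messages are written back, so the second pass finds nothing new to train. With the test negated, as in the original code, the tagged copies would be discarded and every pass would train both messages again.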
I have tested the new code for the same simple case:
irwin@starling> python2.3 /usr/local/bin/sb_mboxtrain.py -d $HOME/.spambayes/hammie.dbm -g /home/irwin/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
Reading as Unix mbox
Trained 2 out of 2 messages
irwin@starling> python2.3 /usr/local/bin/sb_mboxtrain.py -d $HOME/.spambayes/hammie.dbm -g /home/irwin/cdburn0/Mail/libtool
Training ham (/home/irwin/cdburn0/Mail/libtool):
Reading as Unix mbox
Trained 0 out of 2 messages
Note that on the second run the number of trained messages is zero, as it
should be with the revised code; with the original code it is erroneously 2.
Inspection of the libtool mbox also shows that the header line
X-Spambayes-Trained: ham
has now been written (the original code did not change libtool in the
slightest).
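For anyone who would rather check the mbox programmatically than by eye, something along these lines should work. It is a sketch in modern Python 3 using the standard-library mailbox module, not part of SpamBayes; the helper name trained_labels is made up for illustration.

```python
import mailbox


def trained_labels(path, header="X-Spambayes-Trained"):
    """Return the value of `header` for each message in the mbox at `path`.

    Messages that do not carry the header yield None.
    """
    box = mailbox.mbox(path, create=False)  # do not create a missing file
    try:
        return [msg.get(header) for msg in box]
    finally:
        box.close()
```

Calling trained_labels("/home/irwin/cdburn0/Mail/libtool") after a successful training run should then give one "ham" entry per message.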
To the developers here:
Please accept the above patch as a fix for bug #821808, and close the bug.
Tony, will you please make sure this gets posted to spambayes-dev? I
have not subscribed to that list, so it will presumably need moderator
approval.
Alan
__________________________
Alan W. Irwin
email: irwin at beluga.phys.uvic.ca
phone: 250-727-2902
Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).
Programming affiliations with the PLplot scientific plotting software
package (plplot.org), the Yorick front-end to PLplot (yplot.sf.net), the
Loads of Linux Links project (loll.sf.net), and the Linux Brochure Project
(lbproject.sf.net).
__________________________
Linux-powered Science
__________________________