[spambayes-bugs] [ spambayes-Bugs-741250 ] mboxtrain can truncate your mailbox

SourceForge.net noreply at sourceforge.net
Wed Jan 19 19:10:30 CET 2005


Bugs item #741250, was opened at 2003-05-21 10:04
Message generated for change (Comment added) made by olneytj
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=741250&group_id=61702

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Skip Montanaro (montanaro)
Summary: mboxtrain can truncate your mailbox

Initial Comment:
There is code in mboxtrain to rewrite the mailbox which looks like

    try:
        os.ftruncate(f.fileno(), 0)
        f.seek(0)
    except:
        # If anything goes wrong, don't try to write
        print "Problem truncating mbox--nothing written"
        raise

If the ftruncate() call succeeds but the seek() call fails, the user would
be left with an empty mailbox.  I think the code should write a 
temporary mailbox then rename it only if the complete write
operation is successful.  Furthermore, bare except clauses should
be avoided unless you really can't anticipate all the exceptions 
which might be raised.

No patch yet.  I'll try to come up with something.
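
A minimal sketch of the write-then-rename idea, not the eventual patch. It is written for modern Python (os.replace needs 3.3 or later), and the function name and the "iterable of bytes" interface are assumptions made for illustration; mboxtrain's real internals are not shown here.

```python
import os
import tempfile


def rewrite_mailbox_atomically(path, messages):
    """Write `messages` (an iterable of bytes) to `path` via a temp file.

    Hypothetical helper: the original file is replaced only after the
    complete write has succeeded.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".mbox-tmp-")
    replaced = False
    try:
        with os.fdopen(fd, "wb") as tmp:
            for msg in messages:
                tmp.write(msg)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The original mailbox is untouched until this point.
        os.replace(tmp_path, path)
        replaced = True
    finally:
        # try/finally instead of a bare except: any failure propagates
        # unchanged, and the partial temp file is cleaned up.
        if not replaced and os.path.exists(tmp_path):
            os.unlink(tmp_path)
```

The sketch ignores mailbox locking and does not copy the original file's permissions or ownership (mkstemp creates the file readable by the owner only), both of which a real fix would need to handle.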


----------------------------------------------------------------------

Comment By: K7TJO (olneytj)
Date: 2005-01-19 10:10

Message:
Logged In: YES 
user_id=1005844

This is still a problem.  If you come up with a way to patch it,
it would be wonderful.   As it is now, I have to run it from
a script that does the copy first to a new file then trains
on that. If it fails, at least I still have the original.  

It seems to truncate on some attribute of messages that used
to occur but hasn't occurred in the last 4 years.  If I move
out the oldest messages into a different file, and train,
all is fine.

TJ Olney 
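
The copy-then-train workaround could look roughly like the sketch below. The function name and the command_prefix parameter are hypothetical; the trainer is run as an external command with the path of the copy appended as its last argument.

```python
import os
import shutil
import subprocess
import tempfile


def train_on_copy(mbox_path, command_prefix):
    """Copy `mbox_path` to a scratch file and run the trainer on the copy.

    `command_prefix` is the trainer's argument list without the mailbox
    path (an assumption about how the trainer is invoked). Returns the
    trainer's exit status.
    """
    scratch_dir = tempfile.mkdtemp(prefix="mboxtrain-")
    try:
        copy_path = os.path.join(scratch_dir, os.path.basename(mbox_path))
        shutil.copy2(mbox_path, copy_path)
        # Only the copy is handed to the trainer, so a truncation
        # cannot reach the original file.
        return subprocess.call(list(command_prefix) + [copy_path])
    finally:
        shutil.rmtree(scratch_dir, ignore_errors=True)
```

Anything the trainer writes back into the mailbox lands in the scratch copy and is thrown away with it, so this protects the original at the cost of losing whatever state the rewrite was meant to record.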

----------------------------------------------------------------------

Comment By: Jean-Marc Valin (jmvalin)
Date: 2004-12-15 22:45

Message:
Logged In: YES 
user_id=1494

I tried to pin down the problem. I came up with this small file:
http://people.xiph.org/~jm/spam.mbox
When doing: 
sb_mboxtrain.py -f -d spam.db -s spam.mbox
I get:
Training spam (spam.mbox):
  Reading as Unix mbox
  Trained 0 out of 0 messages
and the file gets truncated to (in this case) zero bytes.
There seem to be several emails in my spam folder that
trigger that bug. In the end, spambayes becomes completely
unusable for me since I have no way to train a database
without spending hours to remove each email that triggers
the bug. In case it matters, I'm running debian unstable and
spambayes 1.0.1.
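
As an independent sanity check, the messages in an mbox file can be counted with Python's standard mailbox module before and after training. This is only a sketch: the helper name is made up, and the standard library's parser is not necessarily the one sb_mboxtrain.py uses, so the two counts may disagree for reasons unrelated to this bug.

```python
import mailbox


def count_mbox_messages(path):
    """Return how many messages the standard library's mbox parser finds."""
    # create=False: fail instead of silently creating a missing file.
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()
```

Comparing the count (and the file size) before and after a training run would show whether messages were lost, without having to inspect the mailbox by hand.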

----------------------------------------------------------------------

Comment By: Jean-Marc Valin (jmvalin)
Date: 2004-12-15 22:06

Message:
Logged In: YES 
user_id=1494

I'm not sure whether my problem is related to that bug, but
it seems close. When I train with mboxtrain, my spam folder
gets truncated too. From the more than 10,000 emails in the
folder, only 433 are trained and at the end, the mbox file
is shortened from 147 MB to about 3 MB. Only the spam folder
seems to be affected. I tried using -g instead of -s (to see
whether the problem is with the file or with the flag) and the
problem is still there.

----------------------------------------------------------------------
