[spambayes-bugs] [ spambayes-Bugs-780612 ] Outlook incorrectly trains on moved messages

SourceForge.net noreply at sourceforge.net
Fri Aug 8 11:11:32 EDT 2003


Bugs item #780612, was opened at 2003-07-31 10:40
Message generated for change (Comment added) made by mhammond
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=780612&group_id=61702

Category: Outlook
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Mark Hammond (mhammond)
Assigned to: Mark Hammond (mhammond)
>Summary: Outlook incorrectly trains on moved messages

Initial Comment:
Consider:

* SpamBayes set to filter "Inbox" and "My Folder".
* Item arrives in "inbox".  SpamBayes scores as 'ham'
* Outlook rule kicks in - message moved to "My Folder"
* SpamBayes sees this message, notices we have scored
it, and assumes it is a "recover" operation - trains as
Ham.

Not sure how to tackle this!

----------------------------------------------------------------------

>Comment By: Mark Hammond (mhammond)
Date: 2003-08-09 03:11

Message:
Logged In: YES 
user_id=14198

Checking in addin.py;
new revision: 1.89; previous revision: 1.88


----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-07-31 18:38

Message:
Logged In: YES 
user_id=14198

Except that "classification" does not imply "trained".  Eg,
a spam comes in, so we filter it.  User drags it back.  We
notice spam classification, so we train as good.  If we look
for "was previously trained", it will fail.

But yeah, you are basically correct - we should only train
as good on a drag if the existing score did not put it in
the "good" range already.  In fact, we do already do that
for Spam - "Delete as Spam" only trains if it would not
already fall in the spam range.

In fact, we
"If I am being moved into a ham folder, and I was trained as 
spam, then I am being reclassified."
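
A minimal sketch of the "only train if the score does not already
put it in the right range" check described above.  The function
names and cutoff values are hypothetical, chosen for illustration;
they are not taken from addin.py.

```python
# Illustrative cutoffs on a 0.0-1.0 spam probability scale (assumed values).
HAM_CUTOFF = 0.20   # scores below this count as "good"
SPAM_CUTOFF = 0.90  # scores at or above this count as "spam"


def should_train_as_ham(score: float, ham_cutoff: float = HAM_CUTOFF) -> bool:
    """Train as good on a drag only if the score is not already in the good range."""
    return score >= ham_cutoff


def should_train_as_spam(score: float, spam_cutoff: float = SPAM_CUTOFF) -> bool:
    """Mirror of the "Delete as Spam" behaviour: skip messages already in the spam range."""
    return score < spam_cutoff
```

With this check, a message scored as ham and then moved by an Outlook
rule would not be trained, because its score is already in the good
range.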


----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2003-07-31 13:41

Message:
Logged In: YES 
user_id=552329

If you remember the classification, can't the rule simply be:

If I am being moved into a ham folder, and I was trained as 
spam, then I am being reclassified.
If I am being moved into a spam folder, and I was not trained 
as spam, then I am being reclassified.
Otherwise, I am just being moved.

(Where 'reclassified' checks to see if it already is the correct 
classification).
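
The three rules above could be sketched roughly as follows.  The
function name, its arguments and the returned strings are
hypothetical; "trained_as_spam" is None when the message was never
trained.

```python
from typing import Optional


def classify_move(dest_is_spam_folder: bool,
                  trained_as_spam: Optional[bool]) -> str:
    """Decide what a move means, following the three rules in the comment above."""
    # Moved into a ham folder and previously trained as spam.
    if not dest_is_spam_folder and trained_as_spam is True:
        return "reclassify_as_ham"
    # Moved into a spam folder and not previously trained as spam
    # (either trained as ham or never trained).
    if dest_is_spam_folder and trained_as_spam is not True:
        return "reclassify_as_spam"
    # Otherwise the message is just being moved.
    return "moved"
```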

----------------------------------------------------------------------

Comment By: Mark Hammond (mhammond)
Date: 2003-07-31 13:33

Message:
Logged In: YES 
user_id=14198

We do remember the classification - that is how the training
is triggered.  It is *because* we remember that we do the train.

What we could do is also remember the folder we originally
filtered it in.  Then, when it is moved we would be able to
determine if it is being "moved back" (ie, should be
trained) or simply moved by the user (and should be ignored
- other than probably updating the name of the folder we
"trained" it in)

I should get the test case failing first.

The message database is simply a dictionary of msgid:
trained_as_spam.  No entry means "never trained", otherwise
we know how it was trained.
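
A rough sketch of the idea above, assuming a second, hypothetical
dictionary that remembers the folder each message was filtered in.
The names "filtered_in", "record_filtered" and "on_message_moved" are
invented for illustration and do not come from the add-in.

```python
from typing import Dict

# msgid -> trained_as_spam; a missing key means "never trained".
trained: Dict[str, bool] = {}

# Hypothetical extra mapping: msgid -> folder the message was filtered in.
filtered_in: Dict[str, str] = {}


def record_filtered(msgid: str, folder: str) -> None:
    """Remember where a message was when we filtered it."""
    filtered_in[msgid] = folder


def on_message_moved(msgid: str, new_folder: str) -> str:
    """Return "train" for a move back, "ignore" for a plain move."""
    origin = filtered_in.get(msgid)
    if origin is None:
        return "unknown"  # we never filtered this message
    if new_folder == origin:
        return "train"    # moved back to where we filtered it
    # A plain move by the user or a rule: just update the remembered folder.
    filtered_in[msgid] = new_folder
    return "ignore"
```

In the scenario from the initial comment, the message is filtered in
"Inbox" and then moved to "My Folder" by a rule, so this sketch would
return "ignore" rather than train.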

----------------------------------------------------------------------

Comment By: Tony Meyer (anadelonbrin)
Date: 2003-07-31 13:01

Message:
Logged In: YES 
user_id=552329

Couldn't you remember the classification for each message?  

This is what pop3proxy et al do - the message class provides 
the necessary information.  (This could be your chance to 
switch to using that and finish the rather incomplete "master 
db" thing in there!) <wink>

What exactly is stored in the default_message_database 
anyway?

----------------------------------------------------------------------



