[spambayes-bugs] [ spambayes-Bugs-797064 ] Problems moving messages
between pst and Exchange
SourceForge.net
noreply at sourceforge.net
Fri Jan 21 04:30:17 CET 2005
Bugs item #797064, was opened at 2003-08-29 13:08
Message generated for change (Settings changed) made by anadelonbrin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=498103&aid=797064&group_id=61702
Category: Outlook
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Tony Meyer (anadelonbrin)
>Assigned to: Tony Meyer (anadelonbrin)
Summary: Problems moving messages between pst and Exchange
Initial Comment:
If a message's clues are viewed when on the Exchange
server, and compared to the same message moved to a
pst file, the clues are not the same. It appears (I
haven't examined closely yet; can do on request) that
on Exchange the html part of the message is used, and
in the pst, it isn't.
Probably related to this is the problem that moving a
message back and forwards between Exchange and a
pst file (showing clues each time) results in an
ever-increasing number of tokens.
It doesn't appear to be the PR_SEARCH_KEY changing:
>>> key1 = "PR_SEARCH_KEY : '\n\x02\xde\xfd7\xf6\xa7A\x93\xfd\xf3\xb1\xfeA\x16\xf9'"
>>> key2 = "PR_SEARCH_KEY : '\n\x02\xde\xfd7\xf6\xa7A\x93\xfd\xf3\xb1\xfeA\x16\xf9'"
>>> key1 == key2
True
Next thing to try? :)
----------------------------------------------------------------------
Comment By: Tony Meyer (anadelonbrin)
Date: 2003-09-06 15:09
Message:
Logged In: YES
user_id=552329
Are we going to be able to get identical token streams?
Attached are two 'show clues' messages, for the same
message, on a pst and on Exchange. 26 clues for one, and
28 for the other. This is a plain text message.
The extra two clues arise because Exchange html'ises the
plain text message and so the words in the subject also
appear in the body.
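A rough sketch of how that could happen, for anyone who wants to
reproduce the effect without Outlook. This is not the SpamBayes
tokenizer: toy_tokens, the sample subject and body, and the layout of
the html wrapper (subject repeated in a title element) are all
assumptions made up for illustration.

```python
import re

def toy_tokens(text):
    """Drop anything that looks like a tag, then split into lowercase words."""
    no_tags = re.sub(r"<[^>]+>", " ", text)
    return set(re.findall(r"[a-z0-9']+", no_tags.lower()))

# Hypothetical message content, not taken from the attached clue reports.
subject = "comparing images"
plain_body = "The two files look the same to me."

# Assumed shape of the html'ised copy: the subject is repeated in the markup.
html_body = (
    "<html><head><title>%s</title></head>"
    "<body><p>%s</p></body></html>" % (subject, plain_body)
)

pst_tokens = toy_tokens(plain_body)
exchange_tokens = toy_tokens(html_body)

# Tokens seen only in the html'ised copy.
extra = sorted(exchange_tokens - pst_tokens)
print(extra)
```

Under these assumptions the only extra tokens are the subject words,
which is the same pattern as the two extra clues in the attachments.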
----------------------------------------------------------------------
Comment By: Mark Hammond (mhammond)
Date: 2003-08-31 17:34
Message:
Logged In: YES
user_id=14198
The underlying bug seems to be
https://sourceforge.net/tracker/index.php?func=detail&aid=798029&group_id=61702&atid=498103
- however, as it looks like we will be almost
"hand-crafting" the HTML of the message, I will leave this
open, as we may still end up with bugs if the html we
generate isn't identical (token-wise) to the MS one.
----------------------------------------------------------------------
Comment By: Tony Meyer (anadelonbrin)
Date: 2003-08-30 21:32
Message:
Logged In: YES
user_id=552329
The dump_props are attached.
If I just move the messages about, doing 'show clues', then
no training takes place. I think my original comment was
wrong - trying now, I get the same number of tokens no
matter how many times I move (although the Exchange count
and pst count are different). Anyway, the log (at verbose=1)
doesn't show anything apart from the "already trained as
ham" message.
If I train on a message, the log shows little more. pst first:
"""
Training on message 'Re: comparing 2 images' - trained as spam
Saving bayes database with 4637 spam and 410 good messages
-> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_bayes_database.db
-> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_message_database.db
Saved databases in 896.138ms
"""
and moving it back to Exchange:
"""
Training on message 'Re: comparing 2 images' - trained as good
Saving bayes database with 4636 spam and 411 good messages
-> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_bayes_database.db
-> C:\Documents and Settings\tameyer.MASSEY\Application Data\SpamBayes\default_message_database.db
Saved databases in 850.026ms
"""
Does this help?
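A quick check of the counts in the two log excerpts above, as a
minimal sketch. The helper name counts and the regular expression are
mine, not part of SpamBayes; only the two "Saving bayes database"
lines come from the logs.

```python
import re

# Matches the counts in a "Saving bayes database with N spam and M good" line.
COUNTS = re.compile(r"with (\d+) spam and (\d+) good")

def counts(line):
    """Return (spam, good) parsed from one log line."""
    match = COUNTS.search(line)
    if match is None:
        raise ValueError("no counts in line: %r" % line)
    return int(match.group(1)), int(match.group(2))

after_pst = counts("Saving bayes database with 4637 spam and 410 good messages")
after_exchange = counts("Saving bayes database with 4636 spam and 411 good messages")

spam_delta = after_exchange[0] - after_pst[0]
good_delta = after_exchange[1] - after_pst[1]
same_total = sum(after_pst) == sum(after_exchange)
print(spam_delta, good_delta, same_total)
```

The spam count drops by one, the good count rises by one, and the
total is unchanged, which looks consistent with the message being
untrained as spam and retrained as good rather than being counted
twice.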
----------------------------------------------------------------------