[Spambayes-checkins]
spambayes/Outlook2000 addin.py,1.31,1.32 msgstore.py,1.28,1.29
Tim Peters
tim_one@users.sourceforge.net
Thu Nov 14 01:16:13 2002
Update of /cvsroot/spambayes/spambayes/Outlook2000
In directory usw-pr-cvs1:/tmp/cvs-serv32322/Outlook2000
Modified Files:
addin.py msgstore.py
Log Message:
GetEmailPackageObject(): Put back code to strip Content-Type: turns out
there was a superb reason to do this after all, just not the one I
thought there was <wink>.
Index: addin.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/addin.py,v
retrieving revision 1.31
retrieving revision 1.32
diff -C2 -d -r1.31 -r1.32
*** addin.py 12 Nov 2002 22:56:24 -0000 1.31
--- addin.py 14 Nov 2002 01:16:11 -0000 1.32
***************
*** 244,248 ****
push("<h2>Message Stream:</h2><br>")
push("<PRE>\n")
! msg = msgstore_message.GetEmailPackageObject()
push(escape(msg.as_string(), True))
push("</PRE>\n")
--- 244,248 ----
push("<h2>Message Stream:</h2><br>")
push("<PRE>\n")
! msg = msgstore_message.GetEmailPackageObject(strip_content_type=False)
push(escape(msg.as_string(), True))
push("</PRE>\n")
Index: msgstore.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/msgstore.py,v
retrieving revision 1.28
retrieving revision 1.29
diff -C2 -d -r1.28 -r1.29
*** msgstore.py 13 Nov 2002 18:30:01 -0000 1.28
--- msgstore.py 14 Nov 2002 01:16:11 -0000 1.29
***************
*** 430,434 ****
self.mapi_object = self.msgstore._OpenEntry(self.id)
! def GetEmailPackageObject(self):
import email
text = self._GetMessageText()
--- 430,451 ----
self.mapi_object = self.msgstore._OpenEntry(self.id)
! def GetEmailPackageObject(self, strip_content_type=True):
! # Return an email.Message object.
! # strip_content_type is a hack, and should be left True unless you're
! # trying to display all the headers for diagnostic purposes. If we
! # figure out something better to do, it should go away entirely.
! # The problem: suppose a msg is multipart/alternative, with
! # text/plain and text/html sections. The latter MIME decorations
! # are plain missing in what _GetMessageText() returns. If we leave
! # the multipart/alternative in the headers anyway, the email
! # package's "lax parsing" won't complain about not finding any
! # sections, but since the type *is* multipart/alternative then
! # anyway, the tokenizer finds no text/* parts at all to tokenize.
! # As a result, only the headers get tokenized. By stripping
! # Content-Type from the headers (if present), the email pkg
! # considers the body to be text/plain (the default), and so it
! # does get tokenized.
! # Short course: we either have to synthesize non-insane MIME
! # structure, or eliminate all evidence of original MIME structure.
import email
text = self._GetMessageText()
***************
*** 438,441 ****
--- 455,463 ----
print "FAILED to create email.message from: ", `text`
raise
+
+ if strip_content_type:
+ if msg.has_key('content-type'):
+ del msg['content-type']
+
return msg
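
The effect described in the comment block above can be reproduced outside Outlook. What follows is a minimal sketch, not the SpamBayes code: it assumes the modern Python 3 email package rather than the version this checkin targeted, and RAW, text_parts() and strip_content_type() are hypothetical names made up for illustration. The headers claim multipart/alternative, the body has no boundary lines, and a walk looking for text/* parts should find nothing until Content-Type is removed.

```python
import email

# Hypothetical raw text: the headers claim multipart/alternative, but the
# body carries no boundary lines, mimicking the situation described above.
RAW = (
    "From: sender@example.com\n"
    "Subject: demo\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "Just some body text with no MIME boundaries.\n"
)


def text_parts(msg):
    """Return the parts a text-only tokenizer would look at (hypothetical helper)."""
    return [part for part in msg.walk() if part.get_content_maintype() == "text"]


def strip_content_type(msg):
    """Drop Content-Type so the message falls back to the text/plain default."""
    if "content-type" in msg:
        del msg["content-type"]
    return msg


msg = email.message_from_string(RAW)
before = len(text_parts(msg))   # counted while the multipart header is present
strip_content_type(msg)
after = len(text_parts(msg))    # counted once the header is gone
print(before, after, msg.get_content_type())
```

The lax parser keeps the unparsed body as a plain string payload, so after the header is stripped msg.get_payload() still returns the body text and a tokenizer has something to work with.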