[Spambayes-checkins] spambayes/scripts sb_pop3dnd.py,1.5,1.6

Tue Dec 30 21:59:54 EST 2003

Update of /cvsroot/spambayes/spambayes/scripts
In directory sc8-pr-cvs1:/tmp/cvs-serv24824/scripts

Modified Files:
	sb_pop3dnd.py 
Log Message:
Update comments.

Fix fetching an envelope (twisted changed the header names it asks for
from uppercase to lowercase).
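
For illustration, here is a minimal sketch of the lowercase header lookup
as a standalone function. It is not the method in sb_pop3dnd.py: the name
get_headers and the sample headers are hypothetical, the 'negate' argument
from the real method is left out, and it assumes the caller passes the
wanted header names in lowercase. It is written for a current Python
rather than the Python 2 of the original script.

```python
def get_headers(items, *names):
    """Return the requested headers as a dict keyed by lowercase name.

    items is an iterable of (header, value) pairs; names are the wanted
    header names, assumed to be lowercase.  With no names given, every
    header is returned.
    """
    headers = {}
    for header, value in items:
        if names == () or header.lower() in names:
            headers[header.lower()] = value
    return headers


# Hypothetical sample headers, for illustration only.
sample = [("Subject", "Hello"), ("From", "someone"), ("X-Mailer", "demo")]
wanted = get_headers(sample, "subject", "from")
```

An uppercase request such as get_headers(sample, "SUBJECT") matches
nothing here, which is the kind of mismatch this change addresses.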

Handle storing no flags (the flags argument might be None).
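
A rough sketch of the 'flags or ()' guard follows. The function name
apply_flags is hypothetical, and the meaning given to mode (1 adds,
-1 removes, 0 replaces) is an assumption for this example, not something
shown in this checkin.

```python
def apply_flags(current, flags, mode):
    """Return a new set of flags after a STORE-style update.

    Assumed modes: 1 adds the flags, -1 removes them, 0 replaces the
    whole set.  flags may be None, meaning no flags were supplied.
    """
    result = set() if mode == 0 else set(current)
    for flag in flags or ():  # flags might be None
        if flag == "(" or flag == ")":
            continue  # skip the list delimiters
        if mode == -1:
            result.discard(flag)
        else:
            result.add(flag)
    return result


unchanged = apply_flags({"\\Seen"}, None, 1)
```

Without the guard, iterating over None raises a TypeError; with it, a
store with no flags simply leaves nothing to add or remove.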

Update the RETR'ing of messages to reflect what sb_server currently does.
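
One part of that update is stripping the terminating '.\r\n' before the
message is parsed and putting it back afterwards. The sketch below pulls
that step out on its own. The helper names split_retr_response and
join_retr_response and the sample response are hypothetical, and the
classification step that sits between them in the real code is omitted.

```python
def split_retr_response(response):
    """Split a RETR response into (status line, message text, had_dot).

    The trailing '.\\r\\n' terminator, if present, is removed so that it
    does not end up in the text handed to the email parser.
    """
    had_dot = response[-4:] == "\n.\r\n"
    if had_dot:
        response = response[:-3]
    # Break off the first line, which will be '+OK'.
    ok, message_text = response.split("\n", 1)
    return ok, message_text, had_dot


def join_retr_response(ok, message_text, had_dot):
    """Rebuild the response, restoring the terminator if it was removed."""
    retval = ok + "\n" + message_text
    if had_dot:
        retval += ".\r\n"
    return retval


# Hypothetical response, for illustration only.
raw = "+OK 12 octets\r\nSubject: x\r\n\r\nbody\r\n.\r\n"
ok, text, had_dot = split_retr_response(raw)
rebuilt = join_retr_response(ok, text, had_dot)
```

Remembering whether the dot was present means an unmodified message
round-trips to exactly the bytes that came in.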

<shock> This appears to work with Outlook Express now.  I think that's the
first time that it's fully worked!

Index: sb_pop3dnd.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/scripts/sb_pop3dnd.py,v
retrieving revision 1.5
retrieving revision 1.6
diff -C2 -d -r1.5 -r1.6
*** sb_pop3dnd.py	31 Dec 2003 01:05:32 -0000	1.5
--- sb_pop3dnd.py	31 Dec 2003 02:59:51 -0000	1.6
***************
*** 18,22 ****
  via the web interface, and you are ready to go.  Good messages will appear
  as per normal, but you will also have two new incoming folders, one for
! spam and one for ham.

  To train SpamBayes, use the spam folder, and the 'train_as_ham' folder.
--- 18,22 ----
  via the web interface, and you are ready to go.  Good messages will appear
  as per normal, but you will also have two new incoming folders, one for
! spam and one for unsure messages.

  To train SpamBayes, use the spam folder, and the 'train_as_ham' folder.
***************
*** 38,52 ****
  This SpamBayes application is designed to work with Outlook Express, and
  provide the same sort of ease of use as the Outlook plugin.  Although the
! majority of development and testing has been done with Outlook Express,
! any mail client that supports both IMAP and POP3 should be able to use this
! application - if the client enables the user to work with an IMAP account
! and POP3 account side-by-side (and move messages between them), then it
! should work equally as well as Outlook Express.

  This module includes the following classes:
   o IMAPFileMessage
   o IMAPFileMessageFactory
   o IMAPMailbox
   o SpambayesMailbox
   o Trainer
   o SpambayesAccount
--- 38,55 ----
  This SpamBayes application is designed to work with Outlook Express, and
  provide the same sort of ease of use as the Outlook plugin.  Although the
! majority of development and testing has been done with Outlook Express and
! Eudora, any mail client that supports both IMAP and POP3 should be able to
! use this application - if the client enables the user to work with an IMAP
! account and POP3 account side-by-side (and move messages between them),
! then it should work equally as well.

  This module includes the following classes:
+  o IMAPMessage
+  o DynamicIMAPMessage
   o IMAPFileMessage
   o IMAPFileMessageFactory
   o IMAPMailbox
   o SpambayesMailbox
+  o SpambayesInbox
   o Trainer
   o SpambayesAccount
***************
*** 61,65 ****
   o Message flags are currently not persisted, but should be.  The
     IMAPFileMessage class should be extended to do this.  The same
!    goes for the 'internaldate' of the message.
   o The RECENT flag should be unset at some point, but when?  The
     RFC says that a message is recent if this is the first session
--- 64,69 ----
   o Message flags are currently not persisted, but should be.  The
     IMAPFileMessage class should be extended to do this.  The same
!    goes for the 'internaldate' of the message.  These could be put
!    in the message info database, no doubt.
   o The RECENT flag should be unset at some point, but when?  The
     RFC says that a message is recent if this is the first session
***************
*** 80,83 ****
--- 84,92 ----
     that we kick off, and if it dies, we should die too.  Need to figure
     out how to do this in twisted.
+  o Apparently, twisted.internet.app is deprecated, and we should
+    use twisted.application instead.  Need to figure out what that means!
+  o We could have a distinction between messages classified as spam
+    and messages to train as spam.  At the moment we force users into
+    the 'incremental training' system available with the Outlook plug-in.
   o Suggestions?
  """
***************
*** 158,163 ****
          headers = {}
          for header, value in self.items():
!             if (header.upper() in names and not negate) or names == ():
!                 headers[header.upper()] = value
          return headers

--- 167,172 ----
          headers = {}
          for header, value in self.items():
!             if (header.lower() in names and not negate) or names == ():
!                 headers[header.lower()] = value
          return headers

***************
*** 547,551 ****
              elif mode == 1:
                  value = True
!             for flag in flags:
                  if flag == '(' or flag == ')':
                      continue
--- 556,560 ----
              elif mode == 1:
                  value = True
!             for flag in flags or (): # flags might be None
                  if flag == '(' or flag == ')':
                      continue
***************
*** 591,594 ****
--- 600,604 ----
          """Create the special messages that live in this mailbox."""
          state.buildStatusStrings()
+         # This about message could have a bit more content!
          about = 'Subject: About SpamBayes\r\n' \
                   'From: "SpamBayes" <no-reply@localhost>\r\n\r\n' \
***************
*** 604,607 ****
--- 614,620 ----
          # XXX information from sb_server homepage about number
          # XXX   of messages classified etc.
+         # XXX one with a link to the configuration page
+         # XXX   (or maybe even the configuration page itself,
+         # XXX    in html!)

      def isWriteable(self):
***************
*** 738,746 ****
      """

!     intercept_message = 'From: "Spambayes" <no-reply@localhost>\n' \
!                         'Subject: Spambayes Intercept\n\nA message ' \
!                         'was intercepted by Spambayes (it scored %s).\n' \
!                         '\nYou may find it in the Spam or Unsure ' \
!                         'folder.\n\n.\n'

      def __init__(self, clientSocket, serverName, serverPort, spam, unsure):
--- 751,763 ----
      """

!     # This message could be a bit more informative - it could at least
!     # say whether it's the spam or unsure folder.  It could give
!     # information about who the message was from, or what the subject
!     # was, if people thought that would be a good idea.
!     intercept_message = 'From: "Spambayes" <no-reply@localhost>\r\n' \
!                         'Subject: Spambayes Intercept\r\n\r\nA message ' \
!                         'was intercepted by Spambayes (it scored %s).\r\n' \
!                         '\r\nYou may find it in the Spam or Unsure ' \
!                         'folder.\r\n\r\n'

      def __init__(self, clientSocket, serverName, serverPort, spam, unsure):
***************
*** 789,817 ****
          pass it through.  If the result is an unsure or spam, move it
          to the appropriate IMAP folder."""
          # Use '\n\r?\n' to detect the end of the headers in case of
          # broken emails that don't use the proper line separators.
          if re.search(r'\n\r?\n', response):
              # Break off the first line, which will be '+OK'.
              ok, messageText = response.split('\n', 1)

!             prob = state.bayes.spamprob(tokenize(messageText))
!             if prob < options["Categorization", "ham_cutoff"]:
!                 # Return the +OK and the message with the header added.
!                 state.numHams += 1
!                 return ok + "\n" + messageText
!             elif prob > options["Categorization", "spam_cutoff"]:
!                 dest_folder = self.spam_folder
!                 state.numSpams += 1
!             else:
!                 dest_folder = self.unsure_folder
!                 state.numUnsure += 1
!             msg = StringIO.StringIO(messageText)
!             date = imaplib.Time2Internaldate(time.time())[1:-1]
!             dest_folder.addMessage(msg, (), date)

!             # We have to return something, because the client is expecting
!             # us to.  We return a short message indicating that a message
!             # was intercepted.
!             return ok + "\n" + self.intercept_message % (prob,)
          else:
              # Must be an error response.
--- 806,880 ----
          pass it through.  If the result is an unsure or spam, move it
          to the appropriate IMAP folder."""
+         # XXX This is all almost from sb_server!  We could just
+         # XXX extract that out into a function and call it here.
+ 
          # Use '\n\r?\n' to detect the end of the headers in case of
          # broken emails that don't use the proper line separators.
          if re.search(r'\n\r?\n', response):
+             # Remove the trailing .\r\n before passing to the email parser.
+             # Thanks to Scott Schlesier for this fix.
+             terminatingDotPresent = (response[-4:] == '\n.\r\n')
+             if terminatingDotPresent:
+                 response = response[:-3]
+ 
              # Break off the first line, which will be '+OK'.
              ok, messageText = response.split('\n', 1)

!             try:
!                 msg = message.SBHeaderMessage()
!                 msg.setPayload(messageText)
!                 # Now find the spam disposition and add the header.
!                 (prob, clues) = state.bayes.spamprob(msg.asTokens(),\
!                                  evidence=True)

!                 msg.addSBHeaders(prob, clues)
! 
!                 # Check for "RETR" or "TOP N 99999999" - fetchmail without
!                 # the 'fetchall' option uses the latter to retrieve messages.
!                 if (command == 'RETR' or
!                     (command == 'TOP' and
!                      len(args) == 2 and args[1] == '99999999')):
!                     cls = msg.GetClassification()
!                     dest_folder = None
!                     if cls == options["Headers", "header_ham_string"]:
!                         state.numHams += 1
!                         headers = []
!                         for name, value in msg.items():
!                             header = "%s: %s" % (name, value)
!                             headers.append(re.sub(r'\r?\n', '\r\n', header))
!                         body = re.split(r'\n\r?\n', messageText, 1)[1]
!                         messageText = "\r\n".join(headers) + "\r\n\r\n" + body
!                     elif prob > options["Categorization", "spam_cutoff"]:
!                         dest_folder = self.spam_folder
!                         state.numSpams += 1
!                     else:
!                         dest_folder = self.unsure_folder
!                         state.numUnsure += 1
!                     if dest_folder:
!                         msg = StringIO.StringIO(msg.as_string())
!                         date = imaplib.Time2Internaldate(time.time())[1:-1]
!                         dest_folder.addMessage(msg, (), date)
! 
!                         # We have to return something, because the client
!                         # is expecting us to.  We return a short message
!                         # indicating that a message was intercepted.
!                         messageText = self.intercept_message % (prob,)
!             except:
!                 stream = cStringIO.StringIO()
!                 traceback.print_exc(None, stream)
!                 details = stream.getvalue()
!                 detailLines = details.strip().split('\n')
!                 dottedDetails = '\n.'.join(detailLines)
!                 headerName = 'X-Spambayes-Exception'
!                 header = Header(dottedDetails, header_name=headerName)
!                 headers, body = re.split(r'\n\r?\n', messageText, 1)
!                 header = re.sub(r'\r?\n', '\r\n', str(header))
!                 headers += "\n%s: %s\r\n\r\n" % (headerName, header)
!                 messageText = headers + body
!                 print >>sys.stderr, details
!             retval = ok + "\n" + messageText
!             if terminatingDotPresent:
!                 retval += '.\r\n'
!             return retval
          else:
              # Must be an error response.