From anadelonbrin at users.sourceforge.net Fri Oct 1 02:03:22 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 1 02:03:26 2004 Subject: [Spambayes-checkins] spambayes/spambayes message.py,1.54,1.55 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv30887/spambayes Modified Files: message.py Log Message: As per the discussion on spambayes-dev. Rather than treating the notate_to option just like notate_subject, we instead convert the classification into an email address in the form (eg) spam@spambayes.invalid and adds that as a separate address to the message. This has the added benefit of being a legitimate treatment of the header, and also should be easier to use in a message rule, because there should be no other mail at all that would match a rule looking for @spambayes.invalid. If the message is replied to (say it was unsure ham) then the user can either delete the invalid address, or leave it and get a bounce back (but the legit address will still get the reply). Index: message.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/message.py,v retrieving revision 1.54 retrieving revision 1.55 diff -C2 -d -r1.54 -r1.55 *** message.py 6 Aug 2004 05:24:58 -0000 1.54 --- message.py 1 Oct 2004 00:03:19 -0000 1.55 *************** *** 433,441 **** notate_to = options["Headers", "notate_to"] if disposition in notate_to: try: ! self.replace_header("To", "%s,%s" % (disposition, ! self["To"])) except KeyError: ! self["To"] = disposition if isinstance(options["Headers", "notate_subject"], types.StringTypes): --- 433,446 ---- notate_to = options["Headers", "notate_to"] if disposition in notate_to: + # Once, we treated the To: header just like the Subject: one, + # but that doesn't really make sense - and OE stripped the + # comma that we added, treating it as a separator, so it + # wasn't much use anyway. 
So we now convert the classification + # to an invalid address, and add that. + address = "%s@spambayes.invalid" % (disposition, ) try: ! self.replace_header("To", "%s;%s" % (address, self["To"])) except KeyError: ! self["To"] = address if isinstance(options["Headers", "notate_subject"], types.StringTypes): From kpitt at users.sourceforge.net Fri Oct 1 16:31:38 2004 From: kpitt at users.sourceforge.net (Kenny Pitt) Date: Fri Oct 1 16:31:43 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/dialogs/resources dialogs.rc, 1.45, 1.46 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/dialogs/resources In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17124/dialogs/resources Modified Files: dialogs.rc Log Message: Change "Delete as Spam" button to "Spam" and "Recover from Spam" button to "Not Spam". Index: dialogs.rc =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/dialogs/resources/dialogs.rc,v retrieving revision 1.45 retrieving revision 1.46 diff -C2 -d -r1.45 -r1.46 *** dialogs.rc 28 Apr 2004 22:30:12 -0000 1.45 --- dialogs.rc 1 Oct 2004 14:31:35 -0000 1.46 *************** *** 184,188 **** LTEXT "SpamBayes is now configured and ready to start learning about your Spam", IDC_STATIC,20,22,247,16 ! LTEXT "As SpamBayes has not been trained, all new mail will arrive in your Unsure folder. As each message arrives, you should use the 'Delete as Spam' or 'Recover from Spam' toolbar buttons as appropriate.", IDC_STATIC,20,42,247,27 LTEXT "If you wish to speed up the training process, you can move all the existing Spam from your Inbox to the new Spam folder, then select 'Training' from the SpamBayes manager.", --- 184,188 ---- LTEXT "SpamBayes is now configured and ready to start learning about your Spam", IDC_STATIC,20,22,247,16 ! LTEXT "As SpamBayes has not been trained, all new mail will arrive in your Unsure folder. 
As each message arrives, you should use the 'Spam' or 'Not Spam' toolbar buttons as appropriate.", IDC_STATIC,20,42,247,27 LTEXT "If you wish to speed up the training process, you can move all the existing Spam from your Inbox to the new Spam folder, then select 'Training' from the SpamBayes manager.", *************** *** 289,293 **** LTEXT "SpamBayes has been successfully trained and configured. You should find the system is immediately effective at filtering spam.", IDC_TRAINING_STATUS,20,35,247,26 ! LTEXT "Even though SpamBayes has been trained, it does continue to learn - please ensure you regularly check your Unsure folder, and use the 'Delete as Spam' or 'Recover from Spam' buttons as appropriate.", IDC_STATIC,20,68,249,30 LTEXT "Click Finish to close the wizard.",IDC_STATIC,20,104, --- 289,293 ---- LTEXT "SpamBayes has been successfully trained and configured. You should find the system is immediately effective at filtering spam.", IDC_TRAINING_STATUS,20,35,247,26 ! LTEXT "Even though SpamBayes has been trained, it does continue to learn - please ensure you regularly check your Unsure folder, and use the 'Spam' or 'Not Spam' buttons as appropriate.", IDC_STATIC,20,68,249,30 LTEXT "Click Finish to close the wizard.",IDC_STATIC,20,104, *************** *** 305,309 **** LTEXT "SpamBayes is a system that learns about good and bad mail based on examples you provide. It comes with no built-in rules, so must have some training information before it will be effective.", IDC_STATIC,11,21,263,30 ! LTEXT "In this case, SpamBayes will begin by filtering all mail to an 'Unsure' folder. You can then use the 'Delete as Spam' and 'Recover from Spam' buttons to train each message as it arrives. Slowly SpamBayes will learn about your mail.", IDC_STATIC,22,61,252,29 LTEXT "This option will close the wizard, and provide instructions how to sort your mail. 
You will then be able to configure SpamBayes and have it be immediately effective at filtering your mail", --- 305,309 ---- LTEXT "SpamBayes is a system that learns about good and bad mail based on examples you provide. It comes with no built-in rules, so must have some training information before it will be effective.", IDC_STATIC,11,21,263,30 ! LTEXT "In this case, SpamBayes will begin by filtering all mail to an 'Unsure' folder. You can then use the 'Spam' and 'Not Spam' buttons to train each message as it arrives. Slowly SpamBayes will learn about your mail.", IDC_STATIC,22,61,252,29 LTEXT "This option will close the wizard, and provide instructions how to sort your mail. You will then be able to configure SpamBayes and have it be immediately effective at filtering your mail", *************** *** 539,543 **** IDC_BUT_TRAIN_FROM_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX | BS_MULTILINE | WS_TABSTOP,11,127,204,18 ! LTEXT "Clicking Recover From Spam should",IDC_STATIC,10,148, 115,10 COMBOBOX IDC_RECOVER_RS,127,145,114,54,CBS_DROPDOWNLIST | --- 539,543 ---- IDC_BUT_TRAIN_FROM_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX | BS_MULTILINE | WS_TABSTOP,11,127,204,18 ! LTEXT "Clicking 'Not Spam' should",IDC_STATIC,10,148, 115,10 COMBOBOX IDC_RECOVER_RS,127,145,114,54,CBS_DROPDOWNLIST | *************** *** 546,550 **** IDC_BUT_TRAIN_TO_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX | BS_MULTILINE | WS_TABSTOP,11,163,204,16 ! LTEXT "Clicking Delete as Spam should",IDC_STATIC,10,183,104, 10 COMBOBOX IDC_DEL_SPAM_RS,127,180,114,54,CBS_DROPDOWNLIST | --- 546,550 ---- IDC_BUT_TRAIN_TO_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX | BS_MULTILINE | WS_TABSTOP,11,163,204,16 ! 
LTEXT "Clicking 'Spam' should",IDC_STATIC,10,183,104, 10 COMBOBOX IDC_DEL_SPAM_RS,127,180,114,54,CBS_DROPDOWNLIST | From kpitt at users.sourceforge.net Fri Oct 1 16:31:39 2004 From: kpitt at users.sourceforge.net (Kenny Pitt) Date: Fri Oct 1 16:31:43 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 addin.py, 1.130, 1.131 config.py, 1.30, 1.31 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17124 Modified Files: addin.py config.py Log Message: Change "Delete as Spam" button to "Spam" and "Recover from Spam" button to "Not Spam". Index: addin.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/addin.py,v retrieving revision 1.130 retrieving revision 1.131 diff -C2 -d -r1.130 -r1.131 *** addin.py 16 Jul 2004 15:23:10 -0000 1.130 --- addin.py 1 Oct 2004 14:31:34 -0000 1.131 *************** *** 613,617 **** print "Please delete any test messages from your Spam, Unsure or Inbox/Watch folders first." ! # The "Delete As Spam" and "Recover Spam" button # The event from Outlook's explorer that our folder has changed. class ButtonDeleteAsEventBase: --- 613,617 ---- print "Please delete any test messages from your Spam, Unsure or Inbox/Watch folders first." ! # The "Spam" and "Not Spam" buttons # The event from Outlook's explorer that our folder has changed. class ButtonDeleteAsEventBase: *************** *** 636,640 **** self.manager.ReportError( "You must configure and enable SpamBayes before you can " \ ! "delete as spam") return SetWaitCursor(1) --- 636,640 ---- self.manager.ReportError( "You must configure and enable SpamBayes before you can " \ ! "mark messages as spam") return SetWaitCursor(1) *************** *** 695,699 **** self.manager.ReportError( "You must configure and enable SpamBayes before you can " \ ! 
"recover spam") return SetWaitCursor(1) --- 695,699 ---- self.manager.ReportError( "You must configure and enable SpamBayes before you can " \ ! "mark messages as not spam") return SetWaitCursor(1) *************** *** 742,746 **** except msgstore.NotFoundException: # Message moved under us - ignore. ! self.manager.LogDebug(1, "Recover from spam had message moved from underneath us - ignored") # Note the move will possibly also trigger a re-train # but we are smart enough to know we have already done it. --- 742,746 ---- except msgstore.NotFoundException: # Message moved under us - ignore. ! self.manager.LogDebug(1, "'Not Spam' had message moved from underneath us - ignored") # Note the move will possibly also trigger a re-train # but we are smart enough to know we have already done it. *************** *** 792,796 **** assert self.toolbar is None, "Should not yet have a toolbar" ! # Add our "Delete as ..." and "Recover from" buttons tt_text = "Move the selected message to the Spam folder,\n" \ "and train the system that this is Spam." --- 792,796 ---- assert self.toolbar is None, "Should not yet have a toolbar" ! # Add our "Spam" and "Not Spam" buttons tt_text = "Move the selected message to the Spam folder,\n" \ "and train the system that this is Spam." *************** *** 799,808 **** constants.msoControlButton, ButtonDeleteAsSpamEvent, (self.manager, self), ! Caption="Delete As Spam", TooltipText = tt_text, BeginGroup = False, Tag = "SpamBayesCommand.DeleteAsSpam", image = "delete_as_spam.bmp") ! # And again for "Recover from" tt_text = \ "Recovers the selected item back to the folder\n" \ --- 799,808 ---- constants.msoControlButton, ButtonDeleteAsSpamEvent, (self.manager, self), ! Caption="Spam", TooltipText = tt_text, BeginGroup = False, Tag = "SpamBayesCommand.DeleteAsSpam", image = "delete_as_spam.bmp") ! 
# And again for "Not Spam" tt_text = \ "Recovers the selected item back to the folder\n" \ *************** *** 814,818 **** constants.msoControlButton, ButtonRecoverFromSpamEvent, (self.manager, self), ! Caption="Recover from Spam", TooltipText = tt_text, Tag = "SpamBayesCommand.RecoverFromSpam", --- 814,818 ---- constants.msoControlButton, ButtonRecoverFromSpamEvent, (self.manager, self), ! Caption="Not Spam", TooltipText = tt_text, Tag = "SpamBayesCommand.RecoverFromSpam", Index: config.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/config.py,v retrieving revision 1.30 retrieving revision 1.31 diff -C2 -d -r1.30 -r1.31 *** config.py 4 May 2004 01:56:34 -0000 1.30 --- config.py 1 Oct 2004 14:31:34 -0000 1.31 *************** *** 97,105 **** PATH, DO_NOT_RESTORE), ("delete_as_spam_message_state", "How the 'read' flag on a message is modified", "None", ! """When the 'Delete as Spam' function is used, the message 'read' flag can also be set.""", MSG_READ_STATE, RESTORE), ("recover_from_spam_message_state", "How the 'read' flag on a message is modified", "None", ! """When the 'Recover from Spam' function is used, the message 'read' flag can also be set.""", MSG_READ_STATE, RESTORE), --- 97,105 ---- PATH, DO_NOT_RESTORE), ("delete_as_spam_message_state", "How the 'read' flag on a message is modified", "None", ! """When the 'Spam' function is used, the message 'read' flag can also be set.""", MSG_READ_STATE, RESTORE), ("recover_from_spam_message_state", "How the 'read' flag on a message is modified", "None", ! 
"""When the 'Not Spam' function is used, the message 'read' flag can also be set.""", MSG_READ_STATE, RESTORE), From kpitt at users.sourceforge.net Fri Oct 1 16:31:39 2004 From: kpitt at users.sourceforge.net (Kenny Pitt) Date: Fri Oct 1 16:31:43 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/docs configuration.html, 1.9, 1.10 welcome.html, 1.9, 1.10 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17124/docs Modified Files: configuration.html welcome.html Log Message: Change "Delete as Spam" button to "Spam" and "Recover from Spam" button to "Not Spam". Index: configuration.html =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/docs/configuration.html,v retrieving revision 1.9 retrieving revision 1.10 diff -C2 -d -r1.9 -r1.10 *** configuration.html 5 Sep 2003 11:49:47 -0000 1.9 --- configuration.html 1 Oct 2004 14:31:35 -0000 1.10 *************** *** 120,124 **** each message as it is filtered or scored.  Note that if this option is disabled, then ! the Recover From Spam function may recover messages back to the Inbox rather than the folder it was filtered on (as the originating folder is part of the spam --- 120,124 ---- each message as it is filtered or scored.  Note that if this option is disabled, then ! the Not Spam function may recover messages back to the Inbox rather than the folder it was filtered on (as the originating folder is part of the spam *************** *** 150,154 **** Determines how to set the "Read" ! state of a message as they are manually managed by the "Delete as Spam" button.  By default, the 'Read' state of the message is not changed, but this allows you to explicitly change it to either 'read' --- 150,154 ---- Determines how to set the "Read" ! state of a message as they are manually managed by the "Spam" button.  
By default, the 'Read' state of the message is not changed, but this allows you to explicitly change it to either 'read' *************** *** 169,173 **** Determines how to set the "Read" state of a message as they are ! manually managed by the "Recover from Spam" button.  By default, the 'Read' state of the message is not changed, but this allows you to --- 169,173 ---- Determines how to set the "Read" state of a message as they are ! manually managed by the "Not Spam" button.  By default, the 'Read' state of the message is not changed, but this allows you to Index: welcome.html =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/docs/welcome.html,v retrieving revision 1.9 retrieving revision 1.10 diff -C2 -d -r1.9 -r1.10 *** welcome.html 27 Jul 2004 14:44:43 -0000 1.9 --- welcome.html 1 Oct 2004 14:31:35 -0000 1.10 *************** *** 106,112 ****
  • A spam stays in your inbox. This is known as a false negative. In this case ! you can either drag the message to the Spam folder or click on the Delete as Spam button on the ! Outlook toolbar. In both cases, the message will be trained as spam and will be moved to the spam folder.
  • --- 106,111 ----
  • A spam stays in your inbox. This is known as a false negative. In this case ! you can either drag the message to the Spam folder or click on the ! Spam button on the Outlook toolbar. In both cases, the message will be trained as spam and will be moved to the spam folder.
  • *************** *** 115,123 **** possible spam folder for human review. All unsure messages should be manually classified; good messages can either be dragged back to the ! inbox, or have the Recover from Spam toolbar button clicked, while spam messages can either be dragged to the ! Spam folder or have the Delete as ! Spam toolbar button (shown above) clicked. In all cases, the system will --- 114,121 ---- possible spam folder for human review. All unsure messages should be manually classified; good messages can either be dragged back to the ! inbox, or have the Not Spam toolbar button clicked, while spam messages can either be dragged to the ! Spam folder or have the Spam toolbar button (shown above) clicked. In all cases, the system will *************** *** 127,132 **** as a false positive. In this case you can either drag the message back to the folder from which ! it came (generally the inbox), or click on the Recover from Spam button. In both cases the message will be trained as good, and moved back to the original folder. --- 125,130 ---- as a false positive. In this case you can either drag the message back to the folder from which ! it came (generally the inbox), or click on the Not Spam ! button. In both cases the message will be trained as good, and moved back to the original folder. From kpitt at users.sourceforge.net Fri Oct 1 16:37:40 2004 From: kpitt at users.sourceforge.net (Kenny Pitt) Date: Fri Oct 1 16:37:43 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/dialogs/resources dialogs.rc, 1.46, 1.47 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/dialogs/resources In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18241 Modified Files: dialogs.rc Log Message: Add the word "button" to make the "Clicking 'Spam' button should" and "Clicking 'Not Spam' button should" labels a little clearer. 
Index: dialogs.rc
===================================================================
RCS file: /cvsroot/spambayes/spambayes/Outlook2000/dialogs/resources/dialogs.rc,v
retrieving revision 1.46
retrieving revision 1.47
diff -C2 -d -r1.46 -r1.47
*** dialogs.rc 1 Oct 2004 14:31:35 -0000 1.46
--- dialogs.rc 1 Oct 2004 14:37:37 -0000 1.47
***************
*** 539,543 ****
                      IDC_BUT_TRAIN_FROM_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX |
                      BS_MULTILINE | WS_TABSTOP,11,127,204,18
!     LTEXT           "Clicking 'Not Spam' should",IDC_STATIC,10,148,
                      115,10
      COMBOBOX        IDC_RECOVER_RS,127,145,114,54,CBS_DROPDOWNLIST |
--- 539,543 ----
                      IDC_BUT_TRAIN_FROM_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX |
                      BS_MULTILINE | WS_TABSTOP,11,127,204,18
!     LTEXT           "Clicking 'Not Spam' button should",IDC_STATIC,10,148,
                      115,10
      COMBOBOX        IDC_RECOVER_RS,127,145,114,54,CBS_DROPDOWNLIST |
***************
*** 546,550 ****
                      IDC_BUT_TRAIN_TO_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX |
                      BS_MULTILINE | WS_TABSTOP,11,163,204,16
!     LTEXT           "Clicking 'Spam' should",IDC_STATIC,10,183,104,
                      10
      COMBOBOX        IDC_DEL_SPAM_RS,127,180,114,54,CBS_DROPDOWNLIST |
--- 546,550 ----
                      IDC_BUT_TRAIN_TO_SPAM_FOLDER,"Button",BS_AUTOCHECKBOX |
                      BS_MULTILINE | WS_TABSTOP,11,163,204,16
!     LTEXT           "Clicking 'Spam' button should",IDC_STATIC,10,183,104,
                      10
      COMBOBOX        IDC_DEL_SPAM_RS,127,180,114,54,CBS_DROPDOWNLIST |

From sjoerd at users.sourceforge.net Fri Oct 1 22:30:55 2004
From: sjoerd at users.sourceforge.net (Sjoerd Mullender)
Date: Fri Oct 1 22:30:59 2004
Subject: [Spambayes-checkins] spambayes/scripts sb_imapfilter.py,1.39,1.40
Message-ID:

Update of /cvsroot/spambayes/spambayes/scripts
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv25329

Modified Files:
        sb_imapfilter.py
Log Message:
Quote the search string that tries to find the message again that was
just saved. Some spam mails have illegal values (such as space) in the
Message-Id, and this caused a crash in this script.
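
The quoting step described in the log message above can be sketched in isolation. This is only an illustration, written in Python 3 rather than the Python 2 of the original script. The helper names `quote_imap_string` and `build_search_string` and the `Message-ID` header name are hypothetical and do not appear in sb_imapfilter.py. The sketch assumes the IMAP quoted-string convention, in which a backslash and a double quote are each escaped with a preceding backslash.

```python
def quote_imap_string(value):
    """Wrap value in double quotes, escaping backslashes and quotes."""
    # Escape backslashes first, so that the backslashes added for the
    # double quotes are not themselves doubled afterwards.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def build_search_string(header_name, message_id):
    """Build a SEARCH criterion that tolerates spaces in message_id."""
    # header_name is a placeholder for whatever header carries the id.
    return "(UNDELETED HEADER %s %s)" % (header_name,
                                         quote_imap_string(message_id))


if __name__ == "__main__":
    # A Message-Id containing a space, as some spam mails have.
    print(build_search_string("Message-ID", "<bad id@example.invalid>"))
```

Without the surrounding quotes, the space inside the id would split it into two search arguments, which is presumably the kind of malformed command that caused the crash mentioned in the log message.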
Index: sb_imapfilter.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/scripts/sb_imapfilter.py,v
retrieving revision 1.39
retrieving revision 1.40
diff -C2 -d -r1.39 -r1.40
*** sb_imapfilter.py 30 Sep 2004 05:16:30 -0000 1.39
--- sb_imapfilter.py 1 Oct 2004 20:30:52 -0000 1.40
***************
*** 559,564 ****
          # have to use it for IMAP operations.
          self.imap_server.SelectFolder(self.folder.name)
!         search_string = "(UNDELETED HEADER %s %s)" % \
!                         (options["Headers", "mailid_header_name"], self.id)
          response = self.imap_server.uid("SEARCH", search_string)
          data = self.imap_server.check_response("search " + search_string,
--- 559,565 ----
          # have to use it for IMAP operations.
          self.imap_server.SelectFolder(self.folder.name)
!         search_string = "(UNDELETED HEADER %s \"%s\")" % \
!                         (options["Headers", "mailid_header_name"],
!                          self.id.replace('\\',r'\\').replace('"',r'\"'))
          response = self.imap_server.uid("SEARCH", search_string)
          data = self.imap_server.check_response("search " + search_string,

From anadelonbrin at users.sourceforge.net Thu Oct 7 08:10:00 2004
From: anadelonbrin at users.sourceforge.net (Tony Meyer)
Date: Thu Oct 7 08:10:03 2004
Subject: [Spambayes-checkins] spambayes/scripts sb_upload.py,1.3,1.4
Message-ID:

Update of /cvsroot/spambayes/spambayes/scripts
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv21788/scripts

Modified Files:
        sb_upload.py
Log Message:
Add patch from Graham Ashton to allow users of sb_upload.py to not just
upload, but also train.
Index: sb_upload.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_upload.py,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** sb_upload.py 15 Sep 2004 06:53:58 -0000 1.3 --- sb_upload.py 7 Oct 2004 06:09:57 -0000 1.4 *************** *** 12,15 **** --- 12,16 ---- usage: %(progname)s [-h] [-n] [-s server] [-p port] [-r N] [-o section:option:value] + [-t (ham|spam)] [-o section:option:value] Options: *************** *** 19,22 **** --- 20,24 ---- -p, --port= - provide alternate server port (default %(port)s) -r, --prob= - feed the message to the trainer w/ prob N [0.0...1.0] + -t, --train= - train the message (pass either 'ham' or 'spam') -o, --option= - set [section, option] in the options database to value """ *************** *** 101,109 **** port = options["html_ui", "port"] prob = 1.0 try: ! opts, args = getopt.getopt(argv, "hns:p:r:o:", ["help", "null", "server=", "port=", ! "prob=", "option="]) except getopt.error: usage(globals(), locals()) --- 103,112 ---- port = options["html_ui", "port"] prob = 1.0 + train_as = None try: ! opts, args = getopt.getopt(argv, "hns:p:r:t:o:", ["help", "null", "server=", "port=", ! "prob=", "train=", "option="]) except getopt.error: usage(globals(), locals()) *************** *** 126,129 **** --- 129,138 ---- sys.exit(1) prob = n + elif opt in ("-t", "--train"): + arg = arg.capitalize() + if arg not in ("Ham", "Spam"): + usage(globals(), locals()) + sys.exit(1) + train_as = arg elif opt in ('-o', '--option'): options.set_from_cmdline(arg, sys.stderr) *************** *** 138,143 **** if random.random() < prob: try: ! post_multipart("%s:%d"%(server,port), "/upload", [], ! [('file', 'message.dat', data)]) except: # not an error if the server isn't responding --- 147,159 ---- if random.random() < prob: try: ! if train_as is not None: ! which_text = "Train as %s" % (train_as,) ! post_multipart("%s:%d" % (server, port), "/train", ! 
[("which", which_text), ! ("text", "")], ! [("file", "message.dat", data)]) ! else: ! post_multipart("%s:%d" % (server,port), "/upload", [], ! [('file', 'message.dat', data)]) except: # not an error if the server isn't responding From anadelonbrin at users.sourceforge.net Wed Oct 13 01:27:31 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 01:27:35 2004 Subject: [Spambayes-checkins] spambayes/spambayes storage.py,1.41,1.42 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4406/spambayes Modified Files: storage.py Log Message: If the mySQL server doesn't support rollbacks, we weren't able to create the table, so using mySQL for storage wouldn't work. The rollback isn't essential, so if the server doesn't support it, then just create the table anyway. Index: storage.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/storage.py,v retrieving revision 1.41 retrieving revision 1.42 diff -C2 -d -r1.41 -r1.42 *** storage.py 2 Apr 2004 18:10:52 -0000 1.41 --- storage.py 12 Oct 2004 23:27:29 -0000 1.42 *************** *** 535,539 **** c.execute("select count(*) from bayes") except MySQLdb.ProgrammingError: ! self.db.rollback() self.create_bayes() --- 535,545 ---- c.execute("select count(*) from bayes") except MySQLdb.ProgrammingError: ! try: ! self.db.rollback() ! except MySQLdb.NotSupportedError: ! # Server doesn't support rollback, so just assume that ! # we can keep going and create the db. This should only ! # happen once, anyway. ! 
pass self.create_bayes() From anadelonbrin at users.sourceforge.net Wed Oct 13 01:44:03 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 01:44:05 2004 Subject: [Spambayes-checkins] spambayes/spambayes ImapUI.py, 1.37, 1.38 ProxyUI.py, 1.49, 1.50 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv7334/spambayes Modified Files: ImapUI.py ProxyUI.py Log Message: Now that it's not an experimental option, the use_bigrams option isn't offered anywhere on the user interface. We do want to let users enable this, so put it on the Advanced options page for sb_imapfilter and sb_server. Index: ImapUI.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/ImapUI.py,v retrieving revision 1.37 retrieving revision 1.38 diff -C2 -d -r1.37 -r1.38 *** ImapUI.py 30 Sep 2004 02:02:58 -0000 1.37 --- ImapUI.py 12 Oct 2004 23:44:00 -0000 1.38 *************** *** 84,87 **** --- 84,88 ---- ('Classifier', 'unknown_word_prob'), ('Classifier', 'unknown_word_strength'), + ('Classifier', 'use_bigrams'), ('Header Options', None), ('Headers', 'include_score'), Index: ProxyUI.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/ProxyUI.py,v retrieving revision 1.49 retrieving revision 1.50 diff -C2 -d -r1.49 -r1.50 *** ProxyUI.py 19 Jul 2004 09:55:21 -0000 1.49 --- ProxyUI.py 12 Oct 2004 23:44:00 -0000 1.50 *************** *** 109,112 **** --- 109,113 ---- ('Classifier', 'unknown_word_prob'), ('Classifier', 'unknown_word_strength'), + ('Classifier', 'use_bigrams'), ('Header Options', None), ('Headers', 'include_score'), From anadelonbrin at users.sourceforge.net Wed Oct 13 01:53:27 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 01:53:31 2004 Subject: [Spambayes-checkins] spambayes CHANGELOG.txt,1.46,1.47 Message-ID: Update of 
/cvsroot/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8773 Modified Files: CHANGELOG.txt Log Message: Bring up-to-date. Index: CHANGELOG.txt =================================================================== RCS file: /cvsroot/spambayes/spambayes/CHANGELOG.txt,v retrieving revision 1.46 retrieving revision 1.47 diff -C2 -d -r1.46 -r1.47 *** CHANGELOG.txt 19 Jul 2004 03:39:24 -0000 1.46 --- CHANGELOG.txt 12 Oct 2004 23:53:24 -0000 1.47 *************** *** 1,6 **** [Note that all dates are in English, not American format - i.e. day/month/year] ! Future Release ! ============== Tony Meyer 19/07/2004 Fix [ 990700 ] Changes to asyncore in Python 2.4 break ServerLineReader Kenny Pitt 17/07/2004 Add an "Empty Spam Folder" option to the plug-in dropdown menu. (Patch [831941]) --- 1,40 ---- [Note that all dates are in English, not American format - i.e. day/month/year] ! Release 1.1a1 ! ============= ! Tony Meyer 13/10/2004 Add Classifier.use_bigrams option to the Advanced options page for sb_server and imapfilter. ! Tony Meyer 13/10/2004 Fix mySQL storage option for the case where the server does not support rollbacks. ! Tony Meyer 07/10/2004 Add patch from Graham Ashton to allow users of sb_upload.py to not just upload, but also train. ! Sjoerd Mullender 02/10/2004 imapfilter: Quote the search string that tries to find the message again that was just saved. ! Kenny Pitt 02/10/2004 Outlook: Change "Delete as Spam" button to "Spam" and "Recover from Spam" button to "Not Spam". ! Tony Meyer 01/10/2004 Instead of treating notate_to just like notate_subject, we convert the classification into an email address. ! Tony Meyer 30/09/2004 Implement [ 940643 ] Add ham_folder option ! Tony Meyer 30/09/2004 Fix [ 903905 ] IMAP Configuration Error ! Tony Meyer 29/09/2004 Fix [ 1036601 ] typo on advanced config web page ! Tony Meyer 15/09/2004 sb_upload: Clarify docstring so that it's mroe clear what this script does. 
The -n / --null command line option didn't actually do anything; change it so that it does. ! Sjoerd Mullender 20/08/2004 imapfilter: Fix the regular expression to match the Message-ID header by stopping on newline. ! Skip Montanaro 18/08/2004 tte.py: Seems better to try and alternate ham/spam scoring instead of scoring all the hams in a batch and all the spams. ! Kenny Pitt 11/08/2004 First pass at moving help text out of the Python source and into the ui.html file. ! Tony Meyer 10/08/2004 Warn people using spam_cutoff/ham_cutoff values of 0.5 or lower/higher. Also warn them if the ham_cutoff is higher than the spam_cutoff. ! Tony Meyer 09/08/2004 Change [Classifier] x-use_bigrams to a normal, not experimental option. ! Tony Meyer 06/08/2004 imapfilter: isinstance check is wrong, so will never be true, so literals in the folder list will never be handled correctly. ! Tony Meyer 05/08/2004 Remove all traces of the experimental imbalance option. ! Tony Meyer 05/08/2004 Remove support code for two deprecated options: [Tokenizer] x-extract_dow and [Tokenizer] x-generate_time_buckets. ! Tony Meyer 04/08/2004 Add basic unit tests for imapfilter. ! Tony Meyer 04/08/2004 Factor out X-Spambayes-Exception header code to message.py, and get imapfilter to use this. ! Tony Meyer 04/08/2004 imapfilter: Keep going if just one folder is bad (training/filtering). ! Tony Meyer 04/08/2004 imapfilter: Switch to using the Message-ID header id as our id, unless one can't be found, in which case we use our one. ! Tony Meyer 04/08/2004 imapfilter: General sb_imapfilter tidy-up. ! Tony Meyer 04/08/2004 imapfilter: Remove the layers of attempting to fetch. ! Tony Meyer 04/08/2004 imapfilter: Change the way the get_substance method works (renaming it in the process). ! Tony Meyer 04/08/2004 imapfilter: Be less restrictive about the error returned when logging in fails. ! Sjoerd Mullender 03/08/2004 message.py: Don't round-trip the message being tokenized to a string. ! 
Tony Meyer 03/08/2004 Implement [ 909088 ] remove STLS pop3 capability ! Skip Montanaro 26/07/2004 tte.py: Generalize the spam:ham ratio flag to include the ham value instead of having it be implicitly 1. ! Skip Montanaro 25/07/2004 tte.py: Add --ratio=N flag to allow the user to adjust the ratio of spam to ham. ! Tony Meyer 23/07/2004 For proxy handler for version checking, the proxy port needs to be an integer, not a string. ! Tony Meyer 22/07/2004 pop3proxy_tray: Do what the Outlook plug-in does and give the user the "string" version of the version number. ! Tony Meyer 22/07/2004 Fix proxy handler for checking for latest version when username/password isn't given. Fix for Python 2.4 ! Kenny Pitt 22/07/2004 Improve display names for "allow_remote_connections" options to be less confusing. Tony Meyer 19/07/2004 Fix [ 990700 ] Changes to asyncore in Python 2.4 break ServerLineReader Kenny Pitt 17/07/2004 Add an "Empty Spam Folder" option to the plug-in dropdown menu. (Patch [831941]) From anadelonbrin at users.sourceforge.net Wed Oct 13 04:42:06 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 04:42:10 2004 Subject: [Spambayes-checkins] spambayes/spambayes ImapUI.py,1.38,1.39 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9650/spambayes Modified Files: ImapUI.py Log Message: Implement [ 1039057 ] Diffs for IMAP login problems... But in a different way. This changes imapfilter so that if there is a problem connecting to the server the script doesn't just terminate, it prints out a message and keeps going. This means that if it's a temporary problem and imapfilter is running continuously, it should work the next time. If it's only running once, then it'll end as usual. 
Also corresponding change to ImapUI in case there are problems connecting when using the web interface to set folder parameters (this is better than the old way, which would terminate the script, rather than present an error in the web interface). Index: ImapUI.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/ImapUI.py,v retrieving revision 1.38 retrieving revision 1.39 diff -C2 -d -r1.38 -r1.39 *** ImapUI.py 12 Oct 2004 23:44:00 -0000 1.38 --- ImapUI.py 13 Oct 2004 02:42:04 -0000 1.39 *************** *** 204,207 **** --- 204,214 ---- port = 143 self.imap = self.imap_session_class(server, port) + if not self.imap.connected: + # Failed to connect. + content = self._buildBox("Error", None, + "Please check server/port details.") + self.write(content) + self._writePostamble() + return if self.imap is None: content = self._buildBox("Error", None, From anadelonbrin at users.sourceforge.net Wed Oct 13 04:42:06 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 04:42:10 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_imapfilter.py,1.40,1.41 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv9650/scripts Modified Files: sb_imapfilter.py Log Message: Implement [ 1039057 ] Diffs for IMAP login problems... But in a different way. This changes imapfilter so that if there is a problem connecting to the server the script doesn't just terminate, it prints out a message and keeps going. This means that if it's a temporary problem and imapfilter is running continuously, it should work the next time. If it's only running once, then it'll end as usual. Also corresponding change to ImapUI in case there are problems connecting when using the web interface to set folder parameters (this is better than the old way, which would terminate the script, rather than present an error in the web interface). 
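The behaviour described in the log message above can be sketched as follows. This is a minimal illustration in Python 3, not the Python 2 code of the checkin; `FlakyBase`, `Session` and `run_once` are hypothetical names standing in for imaplib's class, IMAPSession and the main loop, and the "unreachable.invalid" host is only a trigger for the simulated failure.

```python
import socket


class FlakyBase:
    """Hypothetical stand-in for the IMAP base class: connecting may raise."""

    def __init__(self, server, port):
        if server == "unreachable.invalid":
            raise socket.error("cannot connect")
        self.server, self.port = server, port


class Session(FlakyBase):
    """Record a connection failure in a flag rather than exiting."""

    def __init__(self, server, port):
        try:
            FlakyBase.__init__(self, server, port)
        except (socket.gaierror, socket.error):
            print("Cannot connect to server %s on port %s" % (server, port))
            self.connected = False
        else:
            self.connected = True


def run_once(server, port, work):
    """One pass of the main loop: only do the work if we connected."""
    session = Session(server, port)
    if session.connected:
        work(session)
        return True
    # Possibly a temporary problem; a caller that loops can sleep and retry.
    return False
```

A caller that runs continuously would invoke `run_once` again after sleeping; a caller that runs only once simply finishes, which matches the "end as usual" case in the log message.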
Index: sb_imapfilter.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_imapfilter.py,v retrieving revision 1.40 retrieving revision 1.41 diff -C2 -d -r1.40 -r1.41 *** sb_imapfilter.py 1 Oct 2004 20:30:52 -0000 1.40 --- sb_imapfilter.py 13 Oct 2004 02:42:04 -0000 1.41 *************** *** 141,147 **** BaseIMAP.__init__(self, server, port) except (BaseIMAP.error, socket.gaierror, socket.error): ! print "Cannot connect to server, please check your server " \ ! "and port settings." ! sys.exit() self.debug = debug self.do_expunge = do_expunge --- 141,148 ---- BaseIMAP.__init__(self, server, port) except (BaseIMAP.error, socket.gaierror, socket.error): ! print "Cannot connect to server %s on port %s" % (server, port) ! self.connected = False ! else: ! self.connected = True self.debug = debug self.do_expunge = do_expunge *************** *** 156,164 **** def login(self, username, pwd): """Log in to the IMAP server, catching invalid username/password.""" try: BaseIMAP.login(self, username, pwd) # superclass login except BaseIMAP.error, e: ! print "There was an error logging in to the IMAP server." ! print "The username and/or password may be incorrect." sys.exit() self.logged_in = True --- 157,168 ---- def login(self, username, pwd): """Log in to the IMAP server, catching invalid username/password.""" + assert self.connected, "Must be connected before logging in." try: BaseIMAP.login(self, username, pwd) # superclass login except BaseIMAP.error, e: ! msg = "There was an error logging in to the IMAP server." \ ! " The username (%s) and/or password may " \ ! "be incorrect." % (username,) ! print msg sys.exit() self.logged_in = True *************** *** 998,1014 **** while True: imap = IMAPSession(server, port, imapDebug, doExpunge) ! imap.login(username, pwd) ! imap_filter.imap_server = imap ! if doTrain: ! if options["globals", "verbose"]: ! print "Training" ! imap_filter.Train() ! if doClassify: ! 
if options["globals", "verbose"]: ! print "Classifying" ! imap_filter.Filter() ! imap.logout() if sleepTime: --- 1002,1025 ---- while True: imap = IMAPSession(server, port, imapDebug, doExpunge) ! if imap.connected: ! imap.login(username, pwd) ! imap_filter.imap_server = imap ! if doTrain: ! if options["globals", "verbose"]: ! print "Training" ! imap_filter.Train() ! if doClassify: ! if options["globals", "verbose"]: ! print "Classifying" ! imap_filter.Filter() ! imap.logout() ! else: ! # Failed to connect. This may be a temporary problem, ! # so just continue on and try again. If we are only ! # running once we will end, otherwise we'll try again ! # in sleepTime seconds. ! pass if sleepTime: From anadelonbrin at users.sourceforge.net Wed Oct 13 07:57:43 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 13 07:57:48 2004 Subject: [Spambayes-checkins] spambayes/spambayes/test test_sb_imapfilter.py, 1.2, 1.3 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes/test In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10571/spambayes/test Modified Files: test_sb_imapfilter.py Log Message: Add more tests, including shells for the minimum set of tests that I want done. Add ability for fake IMAP server to fail on any given command (to test failure). Add malformed message to test choking on a message. Note that this will fail with Python 2.4, because I can't find a message that generates a defect. Contributions would be appreciated! Add some print statements to make the output cleaner and clearer. Use assertEquals rather than assert_(a == b) where appropriate. 
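The "fail on any given command" idea from the log message above can be sketched like this. It is a hedged Python 3 illustration with hypothetical names (`FakeServer`, `handle`, `fail_next`); the test file itself uses a module-level flag inside a Dibbler-based server. The sketch assumes the intent is to answer NO and then skip normal handling for that one command.

```python
class FakeServer:
    """Hypothetical command dispatcher with a one-shot failure switch."""

    def __init__(self):
        self.fail_next = False
        self.handlers = {"NOOP": self.on_noop}

    def on_noop(self, tag, args):
        return "%s OK NOOP completed\r\n" % (tag,)

    def on_unknown(self, tag, args):
        return "%s BAD Command unrecognised\r\n" % (tag,)

    def handle(self, request):
        """Dispatch one request line of the form '<tag> <command> [args]'."""
        tag, command = request.split(None, 1)
        if self.fail_next:
            # Fail exactly one command, then behave normally again.
            self.fail_next = False
            return "%s NO Was told to fail.\r\n" % (tag,)
        command, _, args = command.partition(" ")
        handler = self.handlers.get(command.upper(), self.on_unknown)
        return handler(tag, args)
```

A test sets `fail_next = True` immediately before the call it wants to see fail, and the switch clears itself so that later commands are unaffected.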
Index: test_sb_imapfilter.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/test/test_sb_imapfilter.py,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** test_sb_imapfilter.py 9 Aug 2004 07:45:02 -0000 1.2 --- test_sb_imapfilter.py 13 Oct 2004 05:57:41 -0000 1.3 *************** *** 1,8 **** # Test sb_imapfilter script. - # At the moment, the script needs to be provided with an IMAP server to - # use for the testing. It would be nice if we provided a dummy server - # like test_sb-server.py does for POP, but this will do for the moment. - import sys import time --- 1,4 ---- *************** *** 28,34 **** # Key is UID. IMAP_MESSAGES = {101 : """Subject: Test\r\n\r\nBody test.""", ! 102 : """Subject: Test2\r\n\r\nAnother body test."""} # Map of ID -> UID ! IMAP_UIDS = {1 : 101, 2: 102} class TestListener(Dibbler.Listener): --- 24,63 ---- # Key is UID. IMAP_MESSAGES = {101 : """Subject: Test\r\n\r\nBody test.""", ! 102 : """Subject: Test2\r\n\r\nAnother body test.""", ! # 103 is taken from Anthony's email torture test ! # (the test_zero-length-boundary file). ! 103 : """Received: from noisy-2-82-67-182-141.fbx.proxad.net(82.67.182.141) ! via SMTP by mx1.example.com, id smtpdAAAzMayUR; Tue Apr 27 18:56:48 2004 ! Return-Path: " Freeman" ! Received: from rly-xn05.mx.aol.com (rly-xn05.mail.aol.com [172.20.83.138]) by air-xn02.mail.aol.com (v98.10) with ESMTP id MAILINXN22-6504043449c151; Tue, 27 Apr 2004 16:57:46 -0300 ! Received: from 132.16.224.107 by 82.67.182.141; Tue, 27 Apr 2004 14:54:46 -0500 ! From: " Gilliam" <.@doramail.com> ! To: To: user@example.com ! Subject: Your Source For Online Prescriptions....Soma-Watson..VALIUM-Roche . ! Date: Wed, 28 Apr 2004 00:52:46 +0500 ! Mime-Version: 1.0 ! Content-Type: multipart/alternative; ! boundary="" ! X-Mailer: AOL 7.0 for Windows US sub 118 ! X-AOL-IP: 114.204.176.98 ! X-AOL-SCOLL-SCORE: 1:XXX:XX ! X-AOL-SCOLL-URL_COUNT: 2 ! 
Message-ID: <@XLUPSYGSHLBAPN@runbox.com> ! ! -- ! Content-Type: text/html; ! charset="iso-8859-1" ! Content-Transfer-Encoding: quoted-printable ! ! ENTER HERE to ! ORDER MEDS Online, such as XANAX..VALIUM..SOMA..Much MORE SHIPPED ! OVERNIGHT,to US and INTERNATIONAL ! ! --- ! ! """, ! } # Map of ID -> UID ! IMAP_UIDS = {1 : 101, 2: 102, 3:103} class TestListener(Dibbler.Listener): *************** *** 40,43 **** --- 69,74 ---- + # If true, the next command will fail, whatever it is. + FAIL_NEXT = False class TestIMAP4Server(Dibbler.BrighterAsyncChat): """Minimal IMAP4 server, for testing purposes. Accepts a limited *************** *** 69,73 **** --- 100,110 ---- def found_terminator(self): """Asynchat override.""" + global FAIL_NEXT id, command = self.request.split(None, 1) + + if FAIL_NEXT: + FAIL_NEXT = False + self.push("%s NO Was told to fail.\r\n" % (id,)) + if ' ' in command: command, args = command.split(None, 1) *************** *** 182,185 **** --- 219,227 ---- class IMAPSessionTest(BaseIMAPFilterTest): + def testConnection(self): + # Connection is made in setup, just need to check + # that it worked. + self.assert_(self.imap.connected) + def testGoodLogin(self): self.imap.login(IMAP_USERNAME, IMAP_PASSWORD) *************** *** 187,190 **** --- 229,233 ---- def testBadLogin(self): + print "\nYou should see a message indicating that login failed." self.assertRaises(SystemExit, self.imap.login, IMAP_USERNAME, "wrong password") *************** *** 209,213 **** self.imap.SelectFolder("Inbox") response = self.imap.response('OK') ! self.assert_(response[0] == "OK") self.assert_(response[1] != [None]) --- 252,256 ---- self.imap.SelectFolder("Inbox") response = self.imap.response('OK') ! self.assertEquals(response[0], "OK") self.assert_(response[1] != [None]) *************** *** 215,230 **** self.imap.SelectFolder("Inbox") response = self.imap.response('OK') ! self.assert_(response[0] == "OK") ! 
self.assert_(response[1] == [None]) def test_folder_list(self): # This test will fail if testGoodLogin fails. self.imap.login(IMAP_USERNAME, IMAP_PASSWORD) - - # If we had more control over what the IMAP server returned - # (say we had our own one, as suggested above), then we could - # test returning literals, getting an error, and a bad literal, - # but since we don't, just do a simple test for now. folders = self.imap.folder_list() correct = IMAP_FOLDER_LIST[:] --- 258,271 ---- self.imap.SelectFolder("Inbox") response = self.imap.response('OK') ! self.assertEquals(response[0], "OK") ! self.assertEquals(response[1], [None]) def test_folder_list(self): + global FAIL_NEXT + # This test will fail if testGoodLogin fails. self.imap.login(IMAP_USERNAME, IMAP_PASSWORD) + # Everything working. folders = self.imap.folder_list() correct = IMAP_FOLDER_LIST[:] *************** *** 232,235 **** --- 273,285 ---- self.assertEqual(folders, correct) + # Bad command. + print "\nYou should see a message indicating that getting the " \ + "folder list failed." + FAIL_NEXT = True + self.assertEqual(self.imap.folder_list(), []) + + # Literals in response. + # XXX TO DO! + def test_extract_fetch_data(self): response = "bad response" *************** *** 325,331 **** self.msg.folder = IMAPFolder("Inbox", self.msg.imap_server) - # When we have a dummy server, check for MemoryError here. - # And also an unparseable message (for Python < 2.4). 
- new_msg = self.msg.get_full_message() self.assertEqual(new_msg.folder, self.msg.folder) --- 375,378 ---- *************** *** 343,346 **** --- 390,473 ---- self.assert_(new_msg is new_msg2) + def test_get_bad_message(self): + self.msg.id = "unittest" + self.msg.imap_server.login(IMAP_USERNAME, IMAP_PASSWORD) + self.msg.imap_server.select() + self.msg.uid = 103 # id of malformed message in dummy server + self.msg.folder = IMAPFolder("Inbox", self.msg.imap_server) + print "\nWith email package versions less than 3.0, you should " \ + "see an error parsing the message." + new_msg = self.msg.get_full_message() + # With Python < 2.4 (i.e. email < 3.0) we get an exception + # header. With more recent versions, we get a defects attribute. + # XXX I can't find a message that generates a defect! Until + # message 103 is replaced with one that does, this will fail with + # Python 2.4/email 3.0. + has_header = "X-Spambayes-Exception: " in new_msg.as_string() + has_defect = hasattr(new_msg, "defects") and len(new_msg.defects) > 0 + self.assert_(has_header or has_defect) + + def test_get_memory_error_message(self): + # XXX Figure out a way to trigger a memory error - but not in + # the fake IMAP server, in imaplib, or our IMAP class. + pass + + def test_Save(self): + # XXX To-do + pass + + + class IMAPFolderTest(BaseIMAPFilterTest): + def setUp(self): + BaseIMAPFilterTest.setUp(self) + self.folder = IMAPFolder("testfolder", self.imap) + + def test_cmp(self): + folder2 = IMAPFolder("testfolder", self.imap) + folder3 = IMAPFolder("testfolder2", self.imap) + self.assertEqual(self.folder, folder2) + self.assertNotEqual(self.folder, folder3) + + def test_iter(self): + # XXX To-do + pass + def test_keys(self): + # XXX To-do + pass + def test_getitem(self): + # XXX To-do + pass + + def test_generate_id(self): + print "\nThis test takes slightly over a second." 
+ id1 = self.folder._generate_id() + id2 = self.folder._generate_id() + id3 = self.folder._generate_id() + # Need to wait at least one clock tick. + time.sleep(1) + id4 = self.folder._generate_id() + self.assertEqual(id2, id1 + "-2") + self.assertEqual(id3, id1 + "-3") + self.assertNotEqual(id1, id4) + self.assertNotEqual(id2, id4) + self.assertNotEqual(id3, id4) + self.assert_('-' not in id4) + + def test_Train(self): + # XXX To-do + pass + def test_Filter(self): + # XXX To-do + pass + + + class IMAPFilterTest(BaseIMAPFilterTest): + def test_Train(self): + # XXX To-do + pass + def test_Filter(self): + # XXX To-do + pass + def suite(): *************** *** 348,351 **** --- 475,480 ---- for cls in (IMAPSessionTest, IMAPMessageTest, + IMAPFolderTest, + IMAPFilterTest, ): suite.addTest(unittest.makeSuite(cls)) From anadelonbrin at users.sourceforge.net Thu Oct 14 06:01:20 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 14 06:01:24 2004 Subject: [Spambayes-checkins] spambayes/spambayes/test test_sb_imapfilter.py, 1.3, 1.4 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes/test In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv7854/spambayes/test Modified Files: test_sb_imapfilter.py Log Message: Add some more limited capability to the fake server: store, append, search, fetch rfc822.header and fetch flags internaldate. Finish off most of the IMAPFolder tests (test_iter, test_keys, test_getitem), and do the setup for IMAPFilterTest. 
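For the append support mentioned above, the fake server has to notice when the client announces a literal (a trailing `{N}` giving the size in octets) and then wait for that much data. A small Python 3 sketch of just the size check follows; `literal_size` is a hypothetical helper, not a function in the test file, which slices the argument string directly.

```python
import re

# An IMAP literal is announced as "{<octet count>}" at the end of the line.
LITERAL_RE = re.compile(r"\{(\d+)\}$")


def literal_size(args):
    """Return the announced literal size in octets, or None if absent."""
    match = LITERAL_RE.search(args.strip())
    if match is None:
        return None
    return int(match.group(1))
```

When a size is returned, the server would reply with a continuation line ("+ ...") and collect data until that many octets have arrived; otherwise the message is already inline in the arguments.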
Index: test_sb_imapfilter.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/test/test_sb_imapfilter.py,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** test_sb_imapfilter.py 13 Oct 2004 05:57:41 -0000 1.3 --- test_sb_imapfilter.py 14 Oct 2004 04:01:10 -0000 1.4 *************** *** 15,31 **** from spambayes import Dibbler from spambayes.Options import options from sb_imapfilter import BadIMAPResponseError ! from sb_imapfilter import IMAPSession, IMAPMessage, IMAPFolder IMAP_PORT = 8143 IMAP_USERNAME = "testu" IMAP_PASSWORD = "testp" ! IMAP_FOLDER_LIST = ["INBOX", "unsure", "ham_to_train", "spam"] # Key is UID. ! IMAP_MESSAGES = {101 : """Subject: Test\r\n\r\nBody test.""", ! 102 : """Subject: Test2\r\n\r\nAnother body test.""", ! # 103 is taken from Anthony's email torture test ! # (the test_zero-length-boundary file). ! 103 : """Received: from noisy-2-82-67-182-141.fbx.proxad.net(82.67.182.141) via SMTP by mx1.example.com, id smtpdAAAzMayUR; Tue Apr 27 18:56:48 2004 Return-Path: " Freeman" --- 15,50 ---- from spambayes import Dibbler from spambayes.Options import options + from spambayes.classifier import Classifier from sb_imapfilter import BadIMAPResponseError ! from spambayes.message import message_from_string ! from sb_imapfilter import IMAPSession, IMAPMessage, IMAPFolder, IMAPFilter IMAP_PORT = 8143 IMAP_USERNAME = "testu" IMAP_PASSWORD = "testp" ! IMAP_FOLDER_LIST = ["INBOX", "unsure", "ham_to_train", "spam", ! "spam_to_train"] ! # Must be different. ! SB_ID_1 = "test@spambayes.invalid" ! SB_ID_2 = "14102004" # Key is UID. ! IMAP_MESSAGES = { ! # 101 should be valid and have a MessageID header, but no ! # X-Spambayes-MessageID header. ! 101 : """Subject: Test\r ! Message-ID: <%s>\r ! \r ! Body test.""" % (SB_ID_1,), ! # 102 should be valid and have both a MessageID header and a ! # X-Spambayes-MessageID header. ! 102 : """Subject: Test2\r ! Message-ID: <%s>\r ! 
%s: %s\r ! \r ! Another body test.""" % (SB_ID_1, options["Headers", "mailid_header_name"], ! SB_ID_2), ! # 103 is taken from Anthony's email torture test (the ! # test_zero-length-boundary file). ! 103 : """Received: from noisy-2-82-67-182-141.fbx.proxad.net(82.67.182.141) via SMTP by mx1.example.com, id smtpdAAAzMayUR; Tue Apr 27 18:56:48 2004 Return-Path: " Freeman" *************** *** 57,63 **** """, ! } # Map of ID -> UID ! IMAP_UIDS = {1 : 101, 2: 102, 3:103} class TestListener(Dibbler.Listener): --- 76,90 ---- """, ! # 104 should be valid and have neither a MessageID header nor a ! # X-Spambayes-MessageID header. ! 104 : """Subject: Test2\r ! \r ! Yet another body test.""", ! } # Map of ID -> UID ! IMAP_UIDS = {1 : 101, 2: 102, 3:103, 4:104} ! ! # Messages that are UNDELETED ! UNDELETED_IDS = (1,2) class TestListener(Dibbler.Listener): *************** *** 88,104 **** 'SELECT' : self.onSelect, 'FETCH' : self.onFetch, 'UID' : self.onUID, } self.push("* OK [CAPABILITY IMAP4REV1 AUTH=LOGIN] " \ "localhost IMAP4rev1\r\n") self.request = '' def collect_incoming_data(self, data): """Asynchat override.""" ! self.request = self.request + data def found_terminator(self): """Asynchat override.""" global FAIL_NEXT id, command = self.request.split(None, 1) --- 115,149 ---- 'SELECT' : self.onSelect, 'FETCH' : self.onFetch, + 'SEARCH' : self.onSearch, 'UID' : self.onUID, + 'APPEND' : self.onAppend, + 'STORE' : self.onStore, } self.push("* OK [CAPABILITY IMAP4REV1 AUTH=LOGIN] " \ "localhost IMAP4rev1\r\n") self.request = '' + self.next_id = 0 + self.in_literal = (0, None) def collect_incoming_data(self, data): """Asynchat override.""" ! if self.in_literal[0] > 0: ! # Also add the line breaks. ! self.request = "%s\r\n%s" % (self.request, data) ! else: ! 
self.request = self.request + data def found_terminator(self): """Asynchat override.""" global FAIL_NEXT + + if self.in_literal[0] > 0: + if len(self.request) >= self.in_literal[0]: + self.push(self.in_literal[1](self.request, + *self.in_literal[2])) + self.in_literal = (0, None) + self.request = '' + return + id, command = self.request.split(None, 1) *************** *** 147,150 **** --- 192,199 ---- (base[2:], base.join(IMAP_FOLDER_LIST), id) + def onStore(self, id, command, args, uid=False): + # We ignore flags. + return "%s OK STORE completed\r\n" % (id,) + def onSelect(self, id, command, args, uid=False): exists = "* %d EXISTS" % (len(IMAP_MESSAGES),) *************** *** 159,162 **** --- 208,256 ---- flags, perm_flags, complete]),) + def onAppend(self, id, command, args, uid=False): + # Only stores for this session. + folder, args = args.split(None, 1) + # We ignore the folder. + if ')' in args: + flags, args = args.split(')', 1) + flags = flags[1:] + # We ignore the flags. + unused, date, args = args.split('"', 2) + # We ignore the date. + if '{' in args: + # A literal. + size = int(args[2:-1]) + self.in_literal = (size, self.appendLiteral, (id,)) + return "+ Ready for argument\r\n" + # Strip off the space at the front. 
+ return self.appendLiteral(args[1:], id) + + def appendLiteral(self, message, command_id): + while True: + id = self.next_id + self.next_id += 1 + if id not in IMAP_MESSAGES: + break + IMAP_MESSAGES[id] = message + return "* APPEND %s\r\n%s OK APPEND succeeded\r\n" % \ + (id, command_id) + + def onSearch(self, id, command, args, uid=False): + args = args.upper() + results = () + if "UNDELETED" in args: + for msg_id in UNDELETED_IDS: + if uid: + results += (IMAP_UIDS[msg_id],) + else: + results += (msg_id,) + if uid: + command_string = "UID " + command + else: + command_string = command + return "%s\r\n%s OK %s completed\r\n" % \ + ("* SEARCH " + ' '.join([str(r) for r in results]), id, + command_string) + def onFetch(self, id, command, args, uid=False): msg_nums, msg_parts = args.split(None, 1) *************** *** 182,185 **** --- 276,295 ---- (len(IMAP_MESSAGES[msg_uid])), IMAP_MESSAGES[msg_uid])) + if "RFC822.HEADER" in msg_parts: + for msg in msg_nums: + if uid: + msg_uid = int(msg) + else: + msg_uid = IMAP_UIDS[int(msg)] + msg_text = IMAP_MESSAGES[msg_uid] + headers, unused = msg_text.split('\r\n\r\n', 1) + response[msg].append(("FETCH (RFC822.HEADER {%s}" % + (len(headers),), headers)) + if "FLAGS INTERNALDATE" in msg_parts: + # We make up flags & dates. + for msg in msg_nums: + response[msg].append('FETCH (FLAGS (\Seen \Deleted) ' + 'INTERNALDATE "27-Jul-2004 13:1' + '1:56 +1200') for msg in msg_nums: try: *************** *** 200,204 **** actual_command, args = args.split(None, 1) handler = self.handlers.get(actual_command, self.onUnknown) ! return handler(id, command, args, uid=True) def onUnknown(self, id, command, args, uid=False): --- 310,314 ---- actual_command, args = args.split(None, 1) handler = self.handlers.get(actual_command, self.onUnknown) ! 
return handler(id, actual_command, args, uid=True) def onUnknown(self, id, command, args, uid=False): *************** *** 421,424 **** --- 531,535 ---- def setUp(self): BaseIMAPFilterTest.setUp(self) + self.imap.login(IMAP_USERNAME, IMAP_PASSWORD) self.folder = IMAPFolder("testfolder", self.imap) *************** *** 430,441 **** def test_iter(self): ! # XXX To-do ! pass def test_keys(self): ! # XXX To-do ! pass ! def test_getitem(self): ! # XXX To-do ! pass def test_generate_id(self): --- 541,591 ---- def test_iter(self): ! keys = self.folder.keys() ! for msg in self.folder: ! msg = msg.get_full_message() ! msg_correct = message_from_string(IMAP_MESSAGES[int(keys[0])]) ! id_header_name = options["Headers", "mailid_header_name"] ! if msg_correct[id_header_name] is None: ! msg_correct[id_header_name] = msg.id ! self.assertEqual(msg.as_string(), msg_correct.as_string()) ! keys = keys[1:] ! def test_keys(self): ! keys = self.folder.keys() ! # We get back UIDs, not IDs, so convert to check. ! correct_keys = [str(IMAP_UIDS[id]) for id in UNDELETED_IDS] ! self.assertEqual(keys, correct_keys) ! ! def test_getitem_new_style(self): ! # 101 already has a suitable (new style) id, so it should ! # not be recreated. ! id_header_name = options["Headers", "mailid_header_name"] ! msg1 = self.folder[101] ! self.assertEqual(msg1.id, SB_ID_1) ! msg1 = msg1.get_full_message() ! msg1_correct = message_from_string(IMAP_MESSAGES[101]) ! self.assertNotEqual(msg1[id_header_name], None) ! msg1_correct[id_header_name] = SB_ID_1 ! self.assertEqual(msg1.as_string(), msg1_correct.as_string()) ! ! def test_getitem_old_style(self): ! # 102 already has a suitable (old style) id, so it should ! # not be recreated. We should be sure to use the old id, ! # rather than the new one, too, for backwards compatibility. ! id_header_name = options["Headers", "mailid_header_name"] ! msg2 = self.folder[102] ! self.assertEqual(msg2.id, SB_ID_2) ! msg2 = msg2.get_full_message() ! 
self.assertNotEqual(msg2[id_header_name], None) ! self.assertEqual(msg2.as_string(), IMAP_MESSAGES[102]) ! ! def test_getitem_new_id(self): ! # 104 doesn't have an id, so should be recreated with one. ! id_header_name = options["Headers", "mailid_header_name"] ! msg3 = self.folder[104] ! self.assertNotEqual(msg3[id_header_name], None) ! msg_correct = message_from_string(IMAP_MESSAGES[104]) ! msg_correct[id_header_name] = msg3.id ! self.assertEqual(msg3.as_string(), msg_correct.as_string()) def test_generate_id(self): *************** *** 463,466 **** --- 613,624 ---- class IMAPFilterTest(BaseIMAPFilterTest): + def setUp(self): + BaseIMAPFilterTest.setUp(self) + self.imap.login(IMAP_USERNAME, IMAP_PASSWORD) + classifier = Classifier() + self.filter = IMAPFilter(classifier) + options["imap", "ham_train_folders"] = ("ham_to_train",) + options["imap", "spam_train_folders"] = ("spam_to_train",) + def test_Train(self): # XXX To-do From anadelonbrin at users.sourceforge.net Fri Oct 15 01:36:15 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 01:36:19 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 addin.py, 1.131, 1.132 manager.py, 1.96, 1.97 oastats.py, 1.3, 1.4 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv12187/Outlook2000 Modified Files: addin.py manager.py oastats.py Log Message: Log the folder's name rather than id for OnItemAdd events. Print out a nicer version of the date/time the log was created. Add persistent statistics. These are saved in a (very) little pickle in the data directory. In the Advanced tab the persistent stats are shown; in the log both session only and total stats are shown. 
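The persistent statistics described in the log message above can be sketched as follows. This is a cut-down Python 3 illustration, not the oastats.py code: it tracks only three counters, and the `stats_sketch.pik` filename and the `counts`, `store` and `reset` method names are assumptions made for the example.

```python
import os
import pickle

STAT_NAMES = ("num_ham", "num_spam", "num_unsure")


class Stats:
    """Session counters plus lifetime totals kept in a small pickle."""

    def __init__(self, data_directory, filename="stats_sketch.pik"):
        self.path = os.path.join(data_directory, filename)
        if os.path.exists(self.path):
            with open(self.path, "rb") as store:
                self.totals = pickle.load(store)
        else:
            self.totals = dict.fromkeys(STAT_NAMES, 0)
        self.reset()

    def reset(self):
        """Zero the session counters; the totals are left alone."""
        for name in STAT_NAMES:
            setattr(self, name, 0)

    def store(self):
        """Fold the session counters into the totals and save them."""
        for name in STAT_NAMES:
            self.totals[name] += getattr(self, name)
        with open(self.path, "wb") as store:
            pickle.dump(self.totals, store)
        # Reset, or anything counted so far would be reported twice.
        self.reset()

    def counts(self, session_only=False):
        """Return session counts, or session plus stored totals."""
        if session_only:
            return {name: getattr(self, name) for name in STAT_NAMES}
        return {name: getattr(self, name) + self.totals[name]
                for name in STAT_NAMES}
```

The reset after storing matters: lifetime figures are computed as session plus totals, so leaving the session counters untouched would double count them for the rest of the session.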
Index: addin.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/addin.py,v retrieving revision 1.131 retrieving revision 1.132 diff -C2 -d -r1.131 -r1.132 *** addin.py 1 Oct 2004 14:31:34 -0000 1.131 --- addin.py 14 Oct 2004 23:36:12 -0000 1.132 *************** *** 380,384 **** # Callback from Outlook - locale may have changed. locale.setlocale(locale.LC_NUMERIC, "C") # see locale comments above ! self.manager.LogDebug(2, "OnItemAdd event for folder", self, "with item", item.Subject.encode("mbcs", "ignore")) # Due to the way our "missed message" indicator works, we do --- 380,384 ---- # Callback from Outlook - locale may have changed. locale.setlocale(locale.LC_NUMERIC, "C") # see locale comments above ! self.manager.LogDebug(2, "OnItemAdd event for folder", self.name, "with item", item.Subject.encode("mbcs", "ignore")) # Due to the way our "missed message" indicator works, we do *************** *** 1242,1249 **** (major, minor, spack, ver_str) print "using Python", sys.version ! from time import localtime ! ltime = localtime() ! print "Log created %s-%s-%s" % \ ! (ltime[0], ltime[1], ltime[2]) self.explorers_events = None # create at OnStartupComplete --- 1242,1247 ---- (major, minor, spack, ver_str) print "using Python", sys.version ! from time import asctime, localtime ! print "Log created", asctime(localtime()) self.explorers_events = None # create at OnStartupComplete *************** *** 1458,1463 **** # it (ie, the dialog) self.manager.Save() ! # Report some simple stats. print "\r\n".join(self.manager.stats.GetStats()) self.manager.Close() self.manager = None --- 1456,1466 ---- # it (ie, the dialog) self.manager.Save() ! # Report some simple stats, for session, and for total. ! print "Session:" ! print "\r\n".join(self.manager.stats.GetStats(True)) ! print "Total:" print "\r\n".join(self.manager.stats.GetStats()) + # Save stats. 
+ self.manager.stats.Store() self.manager.Close() self.manager = None Index: manager.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/manager.py,v retrieving revision 1.96 retrieving revision 1.97 diff -C2 -d -r1.96 -r1.97 *** manager.py 8 Feb 2004 22:29:45 -0000 1.96 --- manager.py 14 Oct 2004 23:36:12 -0000 1.97 *************** *** 404,408 **** self.classifier_data = ClassifierData(db_manager, self) self.LoadBayes() ! self.stats = oastats.Stats(self.config) # "old" bayes functions - new code should use "classifier_data" directly --- 404,408 ---- self.classifier_data = ClassifierData(db_manager, self) self.LoadBayes() ! self.stats = oastats.Stats(self.config, self.data_directory) # "old" bayes functions - new code should use "classifier_data" directly Index: oastats.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/oastats.py,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** oastats.py 16 Dec 2003 05:06:33 -0000 1.3 --- oastats.py 14 Oct 2004 23:36:12 -0000 1.4 *************** *** 1,7 **** # oastats.py - Outlook Addin Stats class Stats: ! def __init__(self, config): self.config = config self.Reset() def Reset(self): --- 1,23 ---- # oastats.py - Outlook Addin Stats + import os + import pickle + + STATS_FILENAME = "performance_statistics_database.pik" + class Stats: ! 
def __init__(self, config, data_directory): self.config = config + self.stored_statistics_fn = os.path.join(data_directory, + STATS_FILENAME) + if os.path.exists(self.stored_statistics_fn): + self.Load() + else: + # Reset totals + self.totals = {} + for stat in ["num_ham", "num_spam", "num_unsure", + "num_deleted_spam", "num_deleted_spam_fn", + "num_recovered_good", "num_recovered_good_fp",]: + self.totals[stat] = 0 self.Reset() def Reset(self): *************** *** 9,12 **** --- 25,44 ---- self.num_deleted_spam = self.num_deleted_spam_fn = 0 self.num_recovered_good = self.num_recovered_good_fp = 0 + def Load(self): + store = open(self.stored_statistics_fn, 'rb') + self.totals = pickle.load(store) + store.close() + def Store(self): + # Update totals, and save that. + for stat in ["num_ham", "num_spam", "num_unsure", + "num_deleted_spam", "num_deleted_spam_fn", + "num_recovered_good", "num_recovered_good_fp",]: + self.totals[stat] += getattr(self, stat) + store = open(self.stored_statistics_fn, 'wb') + pickle.dump(self.totals, store) + store.close() + # Reset, or the reporting for the remainder of this session will be + # incorrect. + self.Reset() def RecordClassification(self, score): score *= 100 # same units as our config values. *************** *** 31,51 **** if score < self.config.filter.unsure_threshold: self.num_deleted_spam_fn += 1 ! def GetStats(self): num_seen = self.num_ham + self.num_spam + self.num_unsure if num_seen==0: return ["SpamBayes has processed zero messages"] chunks = [] push = chunks.append ! perc_ham = 100.0 * self.num_ham / num_seen ! perc_spam = 100.0 * self.num_spam / num_seen perc_unsure = 100.0 * self.num_unsure / num_seen ! format_dict = dict(perc_spam=perc_spam, perc_ham=perc_ham, ! perc_unsure=perc_unsure, num_seen = num_seen) ! 
format_dict.update(self.__dict__) push("SpamBayes has processed %(num_seen)d messages - " \ "%(num_ham)d (%(perc_ham).0f%%) good, " \ "%(num_spam)d (%(perc_spam).0f%%) spam " \ "and %(num_unsure)d (%(perc_unsure).0f%%) unsure" % format_dict) ! if self.num_recovered_good: push("%(num_recovered_good)d message(s) were manually " \ "classified as good (with %(num_recovered_good_fp)d " \ --- 63,120 ---- if score < self.config.filter.unsure_threshold: self.num_deleted_spam_fn += 1 ! def GetStats(self, session_only=False): ! """Return a description of the statistics. ! ! If session_only is True, then only a description of the statistics ! since we were last reset. Otherwise, lifetime statistics (i.e. ! those including the ones loaded). ! ! Users probably care most about persistent statistics, so present ! those by default. If session-only stats are desired, then a ! special call to here can be made. ! """ num_seen = self.num_ham + self.num_spam + self.num_unsure + if not session_only: + totals = self.totals + num_seen += (totals["num_ham"] + totals["num_spam"] + + totals["num_unsure"]) if num_seen==0: return ["SpamBayes has processed zero messages"] chunks = [] push = chunks.append ! if session_only: ! num_ham = self.num_ham ! num_spam = self.num_spam ! num_unsure = self.num_unsure ! num_recovered_good = self.num_recovered_good ! num_recovered_good_fp = self.num_recovered_good_fp ! num_deleted_spam = self.num_deleted_spam ! num_deleted_spam_fn = self.num_deleted_spam_fn ! else: ! num_ham = self.num_ham + self.totals["num_ham"] ! num_spam = self.num_spam + self.totals["num_spam"] ! num_unsure = self.num_unsure + self.totals["num_unsure"] ! num_recovered_good = self.num_recovered_good + \ ! self.totals["num_recovered_good"] ! num_recovered_good_fp = self.num_recovered_good_fp + \ ! self.totals["num_recovered_good_fp"] ! num_deleted_spam = self.num_deleted_spam + \ ! self.totals["num_deleted_spam"] ! num_deleted_spam_fn = self.num_deleted_spam_fn + \ ! 
self.totals["num_deleted_spam_fn"] ! perc_ham = 100.0 * num_ham / num_seen ! perc_spam = 100.0 * num_spam / num_seen perc_unsure = 100.0 * self.num_unsure / num_seen ! format_dict = locals().copy() ! del format_dict["self"] ! del format_dict["push"] ! del format_dict["chunks"] ! format_dict.update(dict(perc_spam=perc_spam, perc_ham=perc_ham, ! perc_unsure=perc_unsure, num_seen=num_seen)) push("SpamBayes has processed %(num_seen)d messages - " \ "%(num_ham)d (%(perc_ham).0f%%) good, " \ "%(num_spam)d (%(perc_spam).0f%%) spam " \ "and %(num_unsure)d (%(perc_unsure).0f%%) unsure" % format_dict) ! if num_recovered_good: push("%(num_recovered_good)d message(s) were manually " \ "classified as good (with %(num_recovered_good_fp)d " \ *************** *** 53,57 **** else: push("No messages were manually classified as good") ! if self.num_deleted_spam: push("%(num_deleted_spam)d message(s) were manually " \ "classified as spam (with %(num_deleted_spam_fn)d " \ --- 122,126 ---- else: push("No messages were manually classified as good") ! if num_deleted_spam: push("%(num_deleted_spam)d message(s) were manually " \ "classified as spam (with %(num_deleted_spam_fn)d " \ *************** *** 67,79 **** class Config: filter = FilterConfig() # processed zero ! s = Stats(Config()) print "\n".join(s.GetStats()) # No recovery ! s = Stats(Config()) s.RecordClassification(.2) print "\n".join(s.GetStats()) ! s = Stats(Config()) s.RecordClassification(.2) s.RecordClassification(.1) --- 136,149 ---- class Config: filter = FilterConfig() + data_directory = os.getcwd() # processed zero ! s = Stats(Config(), data_directory) print "\n".join(s.GetStats()) # No recovery ! s = Stats(Config(), data_directory) s.RecordClassification(.2) print "\n".join(s.GetStats()) ! 
s = Stats(Config(), data_directory) s.RecordClassification(.2) s.RecordClassification(.1) *************** *** 85,86 **** --- 155,164 ---- s.RecordManualClassification(False, 0.9) print "\n".join(s.GetStats()) + + # Store + # (this will leave an artifact in the cwd) + s.Store() + # Load + s = Stats(Config(), data_directory) + print "\n".join(s.GetStats()) + print "\n".join(s.GetStats(True)) From anadelonbrin at users.sourceforge.net Fri Oct 15 01:44:40 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 01:44:43 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 oastats.py, 1.4, 1.5 train.py, 1.36, 1.37 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv14100/Outlook2000 Modified Files: oastats.py train.py Log Message: If we rebuild the database from scratch, that would be a good time to reset the statistics, so do so. Index: oastats.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/oastats.py,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** oastats.py 14 Oct 2004 23:36:12 -0000 1.4 --- oastats.py 14 Oct 2004 23:44:36 -0000 1.5 *************** *** 14,23 **** self.Load() else: ! # Reset totals ! self.totals = {} ! for stat in ["num_ham", "num_spam", "num_unsure", ! "num_deleted_spam", "num_deleted_spam_fn", ! "num_recovered_good", "num_recovered_good_fp",]: ! self.totals[stat] = 0 self.Reset() def Reset(self): --- 14,18 ---- self.Load() else: ! 
self.ResetTotal() self.Reset() def Reset(self): *************** *** 25,28 **** --- 20,36 ---- self.num_deleted_spam = self.num_deleted_spam_fn = 0 self.num_recovered_good = self.num_recovered_good_fp = 0 + def ResetTotal(self, permanently=False): + self.totals = {} + for stat in ["num_ham", "num_spam", "num_unsure", + "num_deleted_spam", "num_deleted_spam_fn", + "num_recovered_good", "num_recovered_good_fp",]: + self.totals[stat] = 0 + if permanently: + # Also remove the file. + try: + os.remove(self.stored_statistics_fn) + except OSError: + # Maybe we had never saved it. + pass def Load(self): store = open(self.stored_statistics_fn, 'rb') Index: train.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/train.py,v retrieving revision 1.36 retrieving revision 1.37 diff -C2 -d -r1.36 -r1.37 *** train.py 25 Jul 2004 23:43:33 -0000 1.36 --- train.py 14 Oct 2004 23:44:36 -0000 1.37 *************** *** 169,172 **** --- 169,178 ---- mgr.classifier_data.Adopt(classifier_data) classifier_data = mgr.classifier_data + # If we are rebuilding, then we reset the statistics, too. + # (But output them to the log for reference). + mgr.LogDebug(1, "Session:" + "\r\n".join(mgr.stats.GetStats(False)) + mgr.LogDebug(1, "Total:" + "\r\n".join(mgr.stats.GetStats()) + mgr.stats.Reset() + mgr.stats.ResetTotal(True) progress.tick() From anadelonbrin at users.sourceforge.net Fri Oct 15 04:04:57 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 04:05:01 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 train.py,1.37,1.38 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10329/Outlook2000 Modified Files: train.py Log Message: Sorry - I missed the ending parentheses when copying this over, so checked it in with a syntax error. This should now work. 
Index: train.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/train.py,v retrieving revision 1.37 retrieving revision 1.38 diff -C2 -d -r1.37 -r1.38 *** train.py 14 Oct 2004 23:44:36 -0000 1.37 --- train.py 15 Oct 2004 02:04:55 -0000 1.38 *************** *** 171,176 **** # If we are rebuilding, then we reset the statistics, too. # (But output them to the log for reference). ! mgr.LogDebug(1, "Session:" + "\r\n".join(mgr.stats.GetStats(False)) ! mgr.LogDebug(1, "Total:" + "\r\n".join(mgr.stats.GetStats()) mgr.stats.Reset() mgr.stats.ResetTotal(True) --- 171,176 ---- # If we are rebuilding, then we reset the statistics, too. # (But output them to the log for reference). ! mgr.LogDebug(1, "Session:" + "\r\n".join(mgr.stats.GetStats(False))) ! mgr.LogDebug(1, "Total:" + "\r\n".join(mgr.stats.GetStats())) mgr.stats.Reset() mgr.stats.ResetTotal(True) From anadelonbrin at users.sourceforge.net Fri Oct 15 07:33:56 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:33:58 2004 Subject: [Spambayes-checkins] spambayes/contrib tte.py,1.9,1.9.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/contrib In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv15715/contrib Modified Files: Tag: release_1_0-branch tte.py Log Message: Backport Skip's fix to make tte.py compatible with Python < 2.4 (adding a reversed() function) Index: tte.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/contrib/tte.py,v retrieving revision 1.9 retrieving revision 1.9.4.1 diff -C2 -d -r1.9 -r1.9.4.1 *** tte.py 28 Apr 2004 03:29:46 -0000 1.9 --- tte.py 15 Oct 2004 05:33:53 -0000 1.9.4.1 *************** *** 83,86 **** --- 83,94 ---- print >> sys.stderr, __doc__.strip() % globals() + try: + reversed + except NameError: + def reversed(seq): + seq = seq[:] + seq.reverse() + return iter(seq) + def train(store, ham, spam, maxmsgs, maxrounds, 
tdict, reverse, verbose): smisses = hmisses = round = 0 From anadelonbrin at users.sourceforge.net Fri Oct 15 07:38:23 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:38:26 2004 Subject: [Spambayes-checkins] spambayes/windows pop3proxy_tray.py, 1.20, 1.20.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/windows In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv16393/windows Modified Files: Tag: release_1_0-branch pop3proxy_tray.py Log Message: Backport spelling mistake fix & line wrap. Index: pop3proxy_tray.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/windows/pop3proxy_tray.py,v retrieving revision 1.20 retrieving revision 1.20.4.1 diff -C2 -d -r1.20 -r1.20.4.1 *** pop3proxy_tray.py 23 Dec 2003 03:12:40 -0000 1.20 --- pop3proxy_tray.py 15 Oct 2004 05:38:04 -0000 1.20.4.1 *************** *** 190,194 **** self.tip = None if self.use_service and not self.IsServiceAvailable(): ! print "Service not availible. Using thread." self.use_service = False --- 190,194 ---- self.tip = None if self.use_service and not self.IsServiceAvailable(): ! print "Service not available. Using thread." self.use_service = False *************** *** 538,542 **** return ! self.ShowMessage("Current version is %s, latest is %s." % (cur_ver_num, latest_ver_num)) if latest_ver_num > cur_ver_num: url = get_version_string(app_name, "Download Page", version_dict=latest) --- 538,543 ---- return ! self.ShowMessage("Current version is %s, latest is %s." % \ ! 
(cur_ver_string, latest_ver_string)) if latest_ver_num > cur_ver_num: url = get_version_string(app_name, "Download Page", version_dict=latest) From anadelonbrin at users.sourceforge.net Fri Oct 15 07:43:56 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:43:58 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_upload.py,1.2,1.2.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17317/scripts Modified Files: Tag: release_1_0-branch sb_upload.py Log Message: Backport docstring fix and --null fix. Index: sb_upload.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_upload.py,v retrieving revision 1.2 retrieving revision 1.2.4.1 diff -C2 -d -r1.2 -r1.2.4.1 *** sb_upload.py 15 Jan 2004 03:39:11 -0000 1.2 --- sb_upload.py 15 Oct 2004 05:43:53 -0000 1.2.4.1 *************** *** 3,7 **** """ Read a message or a mailbox file on standard input, upload it to a ! web browser and write it to standard output. usage: %(progname)s [-h] [-n] [-s server] [-p port] [-r N] --- 3,12 ---- """ Read a message or a mailbox file on standard input, upload it to a ! web server and write it to standard output. ! ! By default, this sends the message to the SpamBayes sb_server web ! interface, which will save the message in the 'unknown' cache, ready ! for you to classify it. It does not do any training, just saves it ! ready for you to classify. usage: %(progname)s [-h] [-n] [-s server] [-p port] [-r N] *************** *** 129,137 **** data = sys.stdin.read() ! sys.stdout.write(data) if random.random() < prob: try: post_multipart("%s:%d"%(server,port), "/upload", [], ! [('file', 'message.dat', data)]) except: # not an error if the server isn't responding --- 134,143 ---- data = sys.stdin.read() ! if not null: ! sys.stdout.write(data) if random.random() < prob: try: post_multipart("%s:%d"%(server,port), "/upload", [], ! 
[('file', 'message.dat', data)]) except: # not an error if the server isn't responding From anadelonbrin at users.sourceforge.net Fri Oct 15 07:44:59 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:45:00 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_mailsort.py,1.1,1.1.6.1 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17507/scripts Modified Files: Tag: release_1_0-branch sb_mailsort.py Log Message: Backport fix for broken lambda. Index: sb_mailsort.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_mailsort.py,v retrieving revision 1.1 retrieving revision 1.1.6.1 diff -C2 -d -r1.1 -r1.1.6.1 *** sb_mailsort.py 5 Sep 2003 01:16:45 -0000 1.1 --- sb_mailsort.py 15 Oct 2004 05:44:56 -0000 1.1.6.1 *************** *** 100,104 **** def filter_message(hamdir, spamdir): ! signal.signal(signal.SIGALRM, lambda s: sys.exit(1)) signal.alarm(24 * 60 * 60) --- 100,104 ---- def filter_message(hamdir, spamdir): ! signal.signal(signal.SIGALRM, lambda s, f: sys.exit(1)) signal.alarm(24 * 60 * 60) From anadelonbrin at users.sourceforge.net Fri Oct 15 07:45:43 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:45:46 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_mboxtrain.py, 1.11.4.1, 1.11.4.2 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17641/scripts Modified Files: Tag: release_1_0-branch sb_mboxtrain.py Log Message: Backport typo in docstring. 
Index: sb_mboxtrain.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_mboxtrain.py,v retrieving revision 1.11.4.1 retrieving revision 1.11.4.2 diff -C2 -d -r1.11.4.1 -r1.11.4.2 *** sb_mboxtrain.py 10 Jun 2004 05:21:16 -0000 1.11.4.1 --- sb_mboxtrain.py 15 Oct 2004 05:45:41 -0000 1.11.4.2 *************** *** 60,64 **** """Return an email Message object. ! This works like mboxutis.get_message, except it doesn't junk the headers if there's an error. Doing so would cause a headerless message to be written back out! --- 60,64 ---- """Return an email Message object. ! This works like mboxutils.get_message, except it doesn't junk the headers if there's an error. Doing so would cause a headerless message to be written back out! From anadelonbrin at users.sourceforge.net Fri Oct 15 07:48:20 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:48:22 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_server.py,1.21,1.21.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv18091/scripts Modified Files: Tag: release_1_0-branch sb_server.py Log Message: Backport fix for sb_server command line options not working correctly. 
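Editorial sketch, not part of the original check-in: the sb_server.py diff that follows changes the getopt string from 'hbpsd:p:l:u:o:' to 'hbd:p:l:u:o:'. In the old string 'p' appears twice, once as a bare flag and once as an option taking an argument. Python's getopt uses the first match, so the value passed to -p is left behind as a positional argument. This is presumably one reason the options did not work; the file name used below is a made-up example.

```python
import getopt

OLD_SHORTOPTS = 'hbpsd:p:l:u:o:'   # option string before the fix
NEW_SHORTOPTS = 'hbd:p:l:u:o:'     # option string after the fix


def parse(shortopts, argv):
    """Return (opts, leftover_args) exactly as getopt reports them."""
    return getopt.getopt(argv, shortopts)


# 'example.pck' is a hypothetical file name used only for illustration.
argv = ['-p', 'example.pck']

old_opts, old_args = parse(OLD_SHORTOPTS, argv)
new_opts, new_args = parse(NEW_SHORTOPTS, argv)

# Old string: -p is treated as a bare flag and its value is orphaned.
print("old: opts=%r args=%r" % (old_opts, old_args))
# New string: -p receives its value as intended.
print("new: opts=%r args=%r" % (new_opts, new_args))
```

With the old string the value ends up in the leftover argument list, so any code that reads the -p option sees an empty string.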
Index: sb_server.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_server.py,v retrieving revision 1.21 retrieving revision 1.21.4.1 diff -C2 -d -r1.21 -r1.21.4.1 *** sb_server.py 16 Mar 2004 05:08:31 -0000 1.21 --- sb_server.py 15 Oct 2004 05:48:17 -0000 1.21.4.1 *************** *** 640,643 **** --- 640,651 ---- self.init() + # Load up the other settings from Option.py / bayescustomize.ini + self.uiPort = options["html_ui", "port"] + self.launchUI = options["html_ui", "launch_browser"] + self.gzipCache = options["Storage", "cache_use_gzip"] + self.cacheExpiryDays = options["Storage", "cache_expiry_days"] + self.runTestServer = False + self.isTest = False + def init(self): assert not self.prepared, "init after prepare, but before close" *************** *** 664,674 **** sys.exit() ! # Load up the other settings from Option.py / bayescustomize.ini ! self.uiPort = options["html_ui", "port"] ! self.launchUI = options["html_ui", "launch_browser"] ! self.gzipCache = options["Storage", "cache_use_gzip"] ! self.cacheExpiryDays = options["Storage", "cache_expiry_days"] ! self.runTestServer = False ! self.isTest = False # Set up the statistics. --- 672,682 ---- sys.exit() ! ! ! ! ! ! ! # Set up the statistics. *************** *** 910,914 **** # Read the arguments. try: ! opts, args = getopt.getopt(sys.argv[1:], 'hbpsd:p:l:u:o:') except getopt.error, msg: print >>sys.stderr, str(msg) + '\n\n' + __doc__ --- 918,922 ---- # Read the arguments. try: ! 
opts, args = getopt.getopt(sys.argv[1:], 'hbd:p:l:u:o:') except getopt.error, msg: print >>sys.stderr, str(msg) + '\n\n' + __doc__ From anadelonbrin at users.sourceforge.net Fri Oct 15 07:54:56 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:55:00 2004 Subject: [Spambayes-checkins] spambayes/spambayes Options.py, 1.107, 1.107.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19106/spambayes Modified Files: Tag: release_1_0-branch Options.py Log Message: Backport docstring option fixes. Backport fix for notate_to/notate_subject with different classification strings. Index: Options.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/Options.py,v retrieving revision 1.107 retrieving revision 1.107.4.1 diff -C2 -d -r1.107 -r1.107.4.1 *** Options.py 12 Apr 2004 01:59:26 -0000 1.107 --- Options.py 15 Oct 2004 05:54:54 -0000 1.107.4.1 *************** *** 560,565 **** better at classifying your email. This option specifies the name of the database file. If you don't give a full pathname, ! the name will be taken to be relative to the current working ! directory.""", FILE_WITH_PATH, DO_NOT_RESTORE), --- 560,565 ---- better at classifying your email. This option specifies the name of the database file. If you don't give a full pathname, ! the name will be taken to be relative to the location of the ! most recent configuration file loaded.""", FILE_WITH_PATH, DO_NOT_RESTORE), *************** *** 570,575 **** or reclassified (unless specifically requested to). This option specifies the name of the database file. If you don't give a ! full pathname, the name will be taken to be relative to the current ! working directory.""", FILE_WITH_PATH, DO_NOT_RESTORE), --- 570,575 ---- or reclassified (unless specifically requested to). This option specifies the name of the database file. If you don't give a ! 
full pathname, the name will be taken to be relative to the location ! of the most recent configuration file loaded.""", FILE_WITH_PATH, DO_NOT_RESTORE), *************** *** 581,585 **** """Messages will be expired from the cache after this many days. After this time, you will no longer be able to train on these messages ! (note this does not effect the copy of the message that you have in your mail client).""", INTEGER, RESTORE), --- 581,585 ---- """Messages will be expired from the cache after this many days. After this time, you will no longer be able to train on these messages ! (note this does not affect the copy of the message that you have in your mail client).""", INTEGER, RESTORE), *************** *** 798,802 **** SERVER, DO_NOT_RESTORE), ! ("allow_remote_connections", "Allowed remote connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote POP connections from any of them will be allowed. You can trust any --- 798,802 ---- SERVER, DO_NOT_RESTORE), ! ("allow_remote_connections", "Allowed remote POP3 connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote POP connections from any of them will be allowed. You can trust any *************** *** 836,840 **** SERVER, DO_NOT_RESTORE), ! ("allow_remote_connections", "Allowed remote connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote SMTP connections from any of them will be allowed. You can trust any --- 836,840 ---- SERVER, DO_NOT_RESTORE), ! ("allow_remote_connections", "Allowed remote SMTP connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote SMTP connections from any of them will be allowed. You can trust any *************** *** 893,897 **** BOOLEAN, RESTORE), ! ("allow_remote_connections", "Allowed remote connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote connections from any of them will be allowed. 
You can trust any --- 893,897 ---- BOOLEAN, RESTORE), ! ("allow_remote_connections", "Allowed remote UI connections", "localhost", """Enter a list of trusted IPs, separated by commas. Remote connections from any of them will be allowed. You can trust any *************** *** 1229,1232 **** --- 1229,1247 ---- options.merge_file(optionsPathname) + # Annoyingly, we have a special case. The notate_to and notate_subject + # allowed values have to be set to the same values as the header_x_ + # options, but this can't be done (AFAIK) dynmaically. If this isn't + # the case, then if the header_x_string values are changed, the + # notate_ options don't work. Outlook Express users like both of + # these options...so we fix it here. See also sf #944109. + header_strings = (options["Headers", "header_ham_string"], + options["Headers", "header_spam_string"], + options["Headers", "header_unsure_string"]) + notate_to = options.get_option("Headers", "notate_to") + notate_subject = options.get_option("Headers", "notate_subject") + notate_to.allowed_values = header_strings + notate_subject.allowed_values = header_strings + + def get_pathname_option(section, option): """Return the option relative to the path specified in the From anadelonbrin at users.sourceforge.net Fri Oct 15 07:56:13 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 07:56:16 2004 Subject: [Spambayes-checkins] spambayes/scripts sb_server.py, 1.21.4.1, 1.21.4.2 Message-ID: Update of /cvsroot/spambayes/spambayes/scripts In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19312/scripts Modified Files: Tag: release_1_0-branch sb_server.py Log Message: Opps. My merge left a lot of blank lines instead of removing them. 
Index: sb_server.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/scripts/sb_server.py,v retrieving revision 1.21.4.1 retrieving revision 1.21.4.2 diff -C2 -d -r1.21.4.1 -r1.21.4.2 *** sb_server.py 15 Oct 2004 05:48:17 -0000 1.21.4.1 --- sb_server.py 15 Oct 2004 05:56:01 -0000 1.21.4.2 *************** *** 672,683 **** sys.exit() - - - - - - - - # Set up the statistics. self.totalSessions = 0 --- 672,675 ---- From anadelonbrin at users.sourceforge.net Fri Oct 15 08:01:06 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 08:01:08 2004 Subject: [Spambayes-checkins] spambayes/spambayes message.py, 1.49.4.2, 1.49.4.3 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20152/spambayes Modified Files: Tag: release_1_0-branch message.py Log Message: Backport fix for selecting message info database type when using sql etc. Backport fix of StringIO import. Index: message.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/message.py,v retrieving revision 1.49.4.2 retrieving revision 1.49.4.3 diff -C2 -d -r1.49.4.2 -r1.49.4.3 *** message.py 23 Jun 2004 23:05:47 -0000 1.49.4.2 --- message.py 15 Oct 2004 06:01:01 -0000 1.49.4.3 *************** *** 105,109 **** from spambayes.tokenizer import tokenize ! from cStringIO import StringIO CRLF_RE = re.compile(r'\r\n|\r|\n') --- 105,112 ---- from spambayes.tokenizer import tokenize ! try: ! import cStringIO as StringIO ! except ImportError: ! import StringIO CRLF_RE = re.compile(r'\r\n|\r|\n') *************** *** 207,217 **** message_info_db_name = get_pathname_option("Storage", "messageinfo_storage_file") if options["Storage", "persistent_use_database"] is True or \ ! 
options["Storage", "persistent_use_database"] == "True" or \ options["Storage", "persistent_use_database"] == "dbm": msginfoDB = MessageInfoDB(message_info_db_name) elif options["Storage", "persistent_use_database"] is False or \ ! options["Storage", "persistent_use_database"] == "False" or \ options["Storage", "persistent_use_database"] == "pickle": msginfoDB = MessageInfoPickle(message_info_db_name) class Message(email.Message.Message): --- 210,226 ---- message_info_db_name = get_pathname_option("Storage", "messageinfo_storage_file") if options["Storage", "persistent_use_database"] is True or \ ! options["Storage", "persistent_use_database"] == "dbm": msginfoDB = MessageInfoDB(message_info_db_name) elif options["Storage", "persistent_use_database"] is False or \ ! options["Storage", "persistent_use_database"] == "pickle": msginfoDB = MessageInfoPickle(message_info_db_name) + else: + # Ah - now, what? Maybe the user has mysql or pgsql or zeo, + # or some other newfangled thing! We don't know what to do + # in that case, so just use a pickle, since it's the safest + # option. + msginfoDB = MessageInfoPickle(message_info_db_name) class Message(email.Message.Message): From anadelonbrin at users.sourceforge.net Fri Oct 15 08:03:36 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 08:03:39 2004 Subject: [Spambayes-checkins] spambayes/spambayes ImapUI.py,1.36,1.36.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20592/spambayes Modified Files: Tag: release_1_0-branch ImapUI.py Log Message: Backport fix for displaying error when trying to display IMAP folder names and not having sufficient information. 
Index: ImapUI.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/ImapUI.py,v retrieving revision 1.36 retrieving revision 1.36.4.1 diff -C2 -d -r1.36 -r1.36.4.1 *** ImapUI.py 27 Jan 2004 08:37:14 -0000 1.36 --- ImapUI.py 15 Oct 2004 06:03:29 -0000 1.36.4.1 *************** *** 44,47 **** --- 44,48 ---- import re import cgi + import types import UserInterface *************** *** 202,205 **** --- 203,213 ---- port = 143 self.imap = self.imap_session_class(server, port) + if not self.imap.connected: + # Failed to connect. + content = self._buildBox("Error", None, + "Please check server/port details.") + self.write(content) + self._writePostamble() + return if self.imap is None: content = self._buildBox("Error", None, *************** *** 208,213 **** self._writePostamble() return ! username = options["imap", "username"][0] ! if username == "": content = self._buildBox("Error", None, """Must specify username first.""") --- 216,223 ---- self._writePostamble() return ! username = options["imap", "username"] ! if isinstance(username, types.TupleType): ! username = username[0] ! 
if not username: content = self._buildBox("Error", None, """Must specify username first.""") *************** *** 215,218 **** --- 225,238 ---- self._writePostamble() return + if not self.imap_pwd: + self.imap_pwd = options["imap", "password"] + if isinstance(self.imap_pwd, types.TupleType): + self.imap_pwd = self.imap_pwd[0] + if not self.imap_pwd: + content = self._buildBox("Error", None, + """Must specify password first.""") + self.write(content) + self._writePostamble() + return self.imap.login(username, self.imap_pwd) self.imap_logged_in = True From anadelonbrin at users.sourceforge.net Fri Oct 15 08:05:36 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 08:05:39 2004 Subject: [Spambayes-checkins] spambayes/spambayes Dibbler.py,1.13,1.13.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv20939/spambayes Modified Files: Tag: release_1_0-branch Dibbler.py Log Message: Backport fix to work with Python 2.4. Index: Dibbler.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/Dibbler.py,v retrieving revision 1.13 retrieving revision 1.13.4.1 diff -C2 -d -r1.13 -r1.13.4.1 *** Dibbler.py 12 Jan 2004 14:13:01 -0000 1.13 --- Dibbler.py 15 Oct 2004 06:05:33 -0000 1.13.4.1 *************** *** 189,193 **** """See `asynchat.async_chat`.""" asynchat.async_chat.__init__(self, conn) ! self._map = map self._closed = False --- 189,193 ---- """See `asynchat.async_chat`.""" asynchat.async_chat.__init__(self, conn) ! self.__map = map self._closed = False *************** *** 216,220 **** """Remove this object from the correct socket map.""" self._closed = True ! self.del_channel(self._map) self.socket.close() --- 216,220 ---- """Remove this object from the correct socket map.""" self._closed = True ! 
self.del_channel(self.__map) self.socket.close() From anadelonbrin at users.sourceforge.net Fri Oct 15 08:07:21 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 15 08:07:23 2004 Subject: [Spambayes-checkins] spambayes/spambayes/resources ui.html, 1.33, 1.33.4.1 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes/resources In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv21277/spambayes/resources Modified Files: Tag: release_1_0-branch ui.html Log Message: Backport Spambayes->SpamBayes corrections. Index: ui.html =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/resources/ui.html,v retrieving revision 1.33 retrieving revision 1.33.4.1 diff -C2 -d -r1.33 -r1.33.4.1 *** ui.html 15 Mar 2004 23:06:44 -0000 1.33 --- ui.html 15 Oct 2004 06:07:17 -0000 1.33.4.1 *************** *** 2,6 **** ! Spambayes User Interface --- 2,6 ---- ! SpamBayes User Interface *************** *** 45,49 ****   ! Spambayes Web Interface: Home > ui.html --- 45,49 ----   ! SpamBayes Web Interface: Home > ui.html *************** *** 56,62 ****

    This file, ui.html, defines the look-and-feel ! of the user interface of the Spambayes Server. The various pieces of HTML defined here are extracted and manipulated at ! runtime to dynamically produce the HTML that the Spambayes Server serves up - this file acts as a palette of HTML components. PyMeldLite is the module that provides --- 56,62 ----

    This file, ui.html, defines the look-and-feel ! of the user interface of the SpamBayes Server. The various pieces of HTML defined here are extracted and manipulated at ! runtime to dynamically produce the HTML that the SpamBayes Server serves up - this file acts as a palette of HTML components. PyMeldLite is the module that provides *************** *** 209,213 **** !       You can configure your Spambayes
          system using the Configuration page. --- 209,213 ---- !       You can configure your SpamBayes
          system using the Configuration page. *************** *** 225,229 ****

    ! The Spambayes proxy stores all the messages it sees. You can train the classifier based on those messages using the Review messages page. --- 225,229 ----

    ! The SpamBayes proxy stores all the messages it sees. You can train the classifier based on those messages using the Review messages page. *************** *** 318,322 **** ! Re: Spambayes and PyMeld rock! 8-) --- 318,322 ---- ! Re: SpamBayes and PyMeld rock! 8-) *************** *** 502,506 ****

    This page allows you to change the options that control how ! Spambayes processes your email. Your options are stored in /example/pathname.

    --- 502,506 ----

    This page allows you to change the options that control how ! SpamBayes processes your email. Your options are stored in /example/pathname.

    *************** *** 597,601 **** Version 0.00
    ! Spambayes Web Interface, Mon Dec 30 14:04:32 2002. Spambayes.org --- 597,601 ---- Version 0.00
    ! SpamBayes Web Interface, Mon Dec 30 14:04:32 2002. Spambayes.org *************** *** 613,614 **** --- 613,615 ---- + From anadelonbrin at users.sourceforge.net Sun Oct 17 00:37:12 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Sun Oct 17 00:37:14 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 addin.py,1.132,1.133 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv21799/Outlook2000 Modified Files: addin.py Log Message: Back out OnItemAdd name printing change, which I didn't mean to check in (it doesn't work properly yet). Apologies for the dud version. Index: addin.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/addin.py,v retrieving revision 1.132 retrieving revision 1.133 diff -C2 -d -r1.132 -r1.133 *** addin.py 14 Oct 2004 23:36:12 -0000 1.132 --- addin.py 16 Oct 2004 22:37:10 -0000 1.133 *************** *** 380,384 **** # Callback from Outlook - locale may have changed. locale.setlocale(locale.LC_NUMERIC, "C") # see locale comments above ! self.manager.LogDebug(2, "OnItemAdd event for folder", self.name, "with item", item.Subject.encode("mbcs", "ignore")) # Due to the way our "missed message" indicator works, we do --- 380,384 ---- # Callback from Outlook - locale may have changed. locale.setlocale(locale.LC_NUMERIC, "C") # see locale comments above ! 
self.manager.LogDebug(2, "OnItemAdd event for folder", self, "with item", item.Subject.encode("mbcs", "ignore")) # Due to the way our "missed message" indicator works, we do From anadelonbrin at users.sourceforge.net Sun Oct 17 00:57:59 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Sun Oct 17 00:58:02 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/installer .cvsignore, 1.1, NONE README.txt, 1.1, NONE crank.py, 1.3, NONE installation_notes.rtf, 1.3, NONE spambayes_addin.iss, 1.12, NONE spambayes_addin.py, 1.2, NONE spambayes_addin.spec, 1.8, NONE Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/installer In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26689/Outlook2000/installer Removed Files: .cvsignore README.txt crank.py installation_notes.rtf spambayes_addin.iss spambayes_addin.py spambayes_addin.spec Log Message: Using McMillan's installer to build the Outlook plug-in binary isn't supported anymore (these scripts probably don't even work anymore), since we've moved to py2exe. Having these here is pointless and might confuse people, so remove them, as per Kenny's suggestion. --- .cvsignore DELETED --- --- README.txt DELETED --- --- crank.py DELETED --- --- installation_notes.rtf DELETED --- --- spambayes_addin.iss DELETED --- --- spambayes_addin.py DELETED --- --- spambayes_addin.spec DELETED --- From anadelonbrin at users.sourceforge.net Mon Oct 18 01:12:54 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Mon Oct 18 01:12:57 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 oastats.py,1.5,1.6 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv17733/Outlook2000 Modified Files: oastats.py Log Message: We were showing the total number of unsure messages, but the session percentage of unsure messages. Fix this error spotted by Erik Brown. 
Index: oastats.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/oastats.py,v retrieving revision 1.5 retrieving revision 1.6 diff -C2 -d -r1.5 -r1.6 *** oastats.py 14 Oct 2004 23:44:36 -0000 1.5 --- oastats.py 17 Oct 2004 23:12:51 -0000 1.6 *************** *** 113,117 **** perc_ham = 100.0 * num_ham / num_seen perc_spam = 100.0 * num_spam / num_seen ! perc_unsure = 100.0 * self.num_unsure / num_seen format_dict = locals().copy() del format_dict["self"] --- 113,117 ---- perc_ham = 100.0 * num_ham / num_seen perc_spam = 100.0 * num_spam / num_seen ! perc_unsure = 100.0 * num_unsure / num_seen format_dict = locals().copy() del format_dict["self"] From anadelonbrin at users.sourceforge.net Mon Oct 18 03:01:41 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Mon Oct 18 03:01:43 2004 Subject: [Spambayes-checkins] spambayes/spambayes TestDriver.py,1.4,1.5 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv13433/spambayes Modified Files: TestDriver.py Log Message: If show_histograms was False, then the global ham/spam histogram never had the stats computed, but this gets used later, so the script would die with an AttributeError. Fix that.
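The failure mode in this log message can be illustrated with a small sketch. ToyHist is a hypothetical stand-in, not the project's histogram class; the only names taken from the check-in are compute_stats() and the n attribute. The sketch assumes, as the log message implies, that n only exists once compute_stats() has run, and it is written for current Python rather than the Python 2 of the diffs.

```python
class ToyHist:
    """Hypothetical histogram whose summary attributes are set lazily."""

    def __init__(self):
        self.data = []

    def add(self, value):
        self.data.append(value)

    def compute_stats(self):
        # Summary attributes such as n do not exist until this runs.
        self.n = len(self.data)


hist = ToyHist()
hist.add(0.25)
hist.add(0.75)

try:
    hist.n  # read before compute_stats(): raises AttributeError
    failed = False
except AttributeError:
    failed = True

hist.compute_stats()  # the fix: always compute before reading
count = hist.n
```

Calling compute_stats() unconditionally before the first read, as the diff does, removes the dependency on whether the histograms were displayed.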
Index: TestDriver.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/TestDriver.py,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** TestDriver.py 5 Sep 2003 01:15:28 -0000 1.4 --- TestDriver.py 18 Oct 2004 01:01:38 -0000 1.5 *************** *** 206,209 **** --- 206,211 ---- besthamcut = options["Categorization", "ham_cutoff"] bestspamcut = options["Categorization", "spam_cutoff"] + self.global_ham_hist.compute_stats() + self.global_spam_hist.compute_stats() nham = self.global_ham_hist.n nspam = self.global_spam_hist.n From anadelonbrin at users.sourceforge.net Mon Oct 18 07:26:40 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Mon Oct 18 07:26:42 2004 Subject: [Spambayes-checkins] spambayes/spambayes msgs.py,1.2,1.3 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv2639/spambayes Modified Files: msgs.py Log Message: I keep running into this, and as far as I can tell, it doesn't hurt to add this, so: Make msgs.Msg objects picklable. Index: msgs.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/msgs.py,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** msgs.py 14 Jan 2003 05:38:20 -0000 1.2 --- msgs.py 18 Oct 2004 05:26:37 -0000 1.3 *************** *** 35,38 **** --- 35,44 ---- return self.guts + # We have defined __slots__, so need these to be able to be pickled. + def __getstate__(self): + return self.tag, self.guts + def __setstate__(self, s): + self.tag, self.guts = s + # The iterator yields a stream of Msg objects, taken from a list of # directories. *************** *** 40,51 **** __slots__ = 'tag', 'directories', 'keep' ! 
def __init__(self, tag, directories, keep=None): self.tag = tag self.directories = directories self.keep = keep def __str__(self): return self.tag def produce(self): if self.keep is None: --- 46,70 ---- __slots__ = 'tag', 'directories', 'keep' ! def __init__(self, tag, directories, keep=None, use=None): self.tag = tag self.directories = directories self.keep = keep + self.use = use def __str__(self): return self.tag + def __len__(self): + """Number of messages in the stream, which is the number + of files in the directory.""" + files = [] + for directory in self.directories: + files.extend(os.listdir(directory)) + if self.keep is not None: + del files[self.keep:] + elif self.use is not None: + files = files[self.use[0]:self.use[1]] + return len(files) + def produce(self): if self.keep is None: *************** *** 61,65 **** random.seed(hash(max(all)) ^ SEED) # reproducible across calls random.shuffle(all) ! del all[self.keep:] all.sort() # seems to speed access on Win98! for fname in all: --- 80,87 ---- random.seed(hash(max(all)) ^ SEED) # reproducible across calls random.shuffle(all) ! if self.use is None: ! del all[self.keep:] ! else: ! all = all[self.use[0]:self.use[1]] all.sort() # seems to speed access on Win98! for fname in all: From anadelonbrin at users.sourceforge.net Mon Oct 18 07:30:50 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Mon Oct 18 07:30:54 2004 Subject: [Spambayes-checkins] spambayes/testtools timcv.py,1.7,1.8 Message-ID: Update of /cvsroot/spambayes/spambayes/testtools In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3106/testtools Modified Files: timcv.py Log Message: It's handy when testing to be able to toggle an option without editing a configuration script, so copy Skip's -o command line option (available in all the regular scripts) to timcv.py. 
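A rough sketch of how a "-o section:option:value" switch can be collected with getopt follows. It is not the timcv.py code: the real script hands the argument to options.set_from_cmdline(), whose behaviour is not shown here. parse_option_override and collect_overrides are hypothetical names, the "option=" long form is an assumption, and the sketch is written for current Python.

```python
import getopt


def parse_option_override(arg):
    """Split 'section:option:value' into a 3-tuple (hypothetical helper)."""
    # maxsplit=2 so that the value itself may contain colons.
    parts = arg.split(":", 2)
    if len(parts) != 3:
        raise ValueError("expected section:option:value, got %r" % (arg,))
    return tuple(parts)


def collect_overrides(argv):
    """Return (section, option, value) tuples for every -o switch."""
    # "o:" declares -o as taking an argument, like the 'hn:s:o:' string
    # in the diff; the long form "option=" is assumed for illustration.
    opts, _args = getopt.getopt(argv, "hn:s:o:", ["option="])
    overrides = []
    for opt, arg in opts:
        if opt in ("-o", "--option"):
            overrides.append(parse_option_override(arg))
    return overrides


overrides = collect_overrides(
    ["-n", "5", "-o", "Categorization:spam_cutoff:0.95"])
```

Because the switch can be repeated, several options can be toggled in one run without touching a configuration file.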
Index: timcv.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/testtools/timcv.py,v retrieving revision 1.7 retrieving revision 1.8 diff -C2 -d -r1.7 -r1.8 *** timcv.py 7 Apr 2004 06:49:04 -0000 1.7 --- timcv.py 18 Oct 2004 05:30:47 -0000 1.8 *************** *** 11,14 **** --- 11,16 ---- Number of Set directories (Data/Spam/Set1, ... and Data/Ham/Set1, ...). This is required. + -o section:option:value + set [section, option] in the options database to value If you only want to use some of the messages in each set, *************** *** 127,131 **** try: ! opts, args = getopt.getopt(sys.argv[1:], 'hn:s:', ['HamTrain=', 'SpamTrain=', 'HamTest=', 'SpamTest=', --- 129,133 ---- try: ! opts, args = getopt.getopt(sys.argv[1:], 'hn:s:o:', ['HamTrain=', 'SpamTrain=', 'HamTest=', 'SpamTest=', *************** *** 155,158 **** --- 157,162 ---- elif opt == '--spam-keep': spamkeep = int(arg) + elif opt in ('-o', '--option'): + options.set_from_cmdline(arg, sys.stderr) if args: From anadelonbrin at users.sourceforge.net Mon Oct 18 07:35:19 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Mon Oct 18 07:35:22 2004 Subject: [Spambayes-checkins] spambayes/spambayes msgs.py,1.3,1.4 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3585/spambayes Modified Files: msgs.py Log Message: Back out some changes that accidentally piggybacked their way in with the last check-in. Index: msgs.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/msgs.py,v retrieving revision 1.3 retrieving revision 1.4 diff -C2 -d -r1.3 -r1.4 *** msgs.py 18 Oct 2004 05:26:37 -0000 1.3 --- msgs.py 18 Oct 2004 05:35:10 -0000 1.4 *************** *** 46,70 **** __slots__ = 'tag', 'directories', 'keep' ! 
def __init__(self, tag, directories, keep=None, use=None): self.tag = tag self.directories = directories self.keep = keep - self.use = use def __str__(self): return self.tag - def __len__(self): - """Number of messages in the stream, which is the number - of files in the directory.""" - files = [] - for directory in self.directories: - files.extend(os.listdir(directory)) - if self.keep is not None: - del files[self.keep:] - elif self.use is not None: - files = files[self.use[0]:self.use[1]] - return len(files) - def produce(self): if self.keep is None: --- 46,57 ---- __slots__ = 'tag', 'directories', 'keep' ! def __init__(self, tag, directories, keep=None): self.tag = tag self.directories = directories self.keep = keep def __str__(self): return self.tag def produce(self): if self.keep is None: *************** *** 80,87 **** random.seed(hash(max(all)) ^ SEED) # reproducible across calls random.shuffle(all) ! if self.use is None: ! del all[self.keep:] ! else: ! all = all[self.use[0]:self.use[1]] all.sort() # seems to speed access on Win98! for fname in all: --- 67,71 ---- random.seed(hash(max(all)) ^ SEED) # reproducible across calls random.shuffle(all) ! del all[self.keep:] all.sort() # seems to speed access on Win98! for fname in all: From anadelonbrin at users.sourceforge.net Wed Oct 20 02:03:49 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 20 02:03:53 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000 oastats.py,1.6,1.7 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000 In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv5715/Outlook2000 Modified Files: oastats.py Log Message: Let the statistics have a variable number of decimal places for the percentages (1 by default). 
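The variable-precision percentages rely on formatting the same string twice. The sketch below shows that idea in isolation; format_share and its dictionary keys are hypothetical names chosen for illustration, not the oastats.py code, and it is written for current Python.

```python
def format_share(count, total, decimal_points=1):
    """Return e.g. '1 (25.0%)' with a configurable number of decimals."""
    values = {
        "count": count,
        "perc": 100.0 * count / total,
        "sign": "%",
    }
    # Step 1: build the inner template. "%%" collapses to a literal "%",
    # so with decimal_points=1 this yields "%(perc).1f%(sign)s".
    values["perc_s"] = "%%(perc).%df%%(sign)s" % (decimal_points,)
    # Step 2: the first substitution drops the inner template into place.
    template = "%(count)d (%(perc_s)s)" % values
    # Step 3: the second substitution fills in the number and the sign.
    return template % values


one_decimal = format_share(1, 4)
no_decimals = format_share(1, 4, decimal_points=0)
```

The percent sign is passed in as a value because a bare "%" left in the string after the first pass would be read as the start of a format specifier in the second pass.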
Index: oastats.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/oastats.py,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** oastats.py 17 Oct 2004 23:12:51 -0000 1.6 --- oastats.py 20 Oct 2004 00:03:47 -0000 1.7 *************** *** 71,75 **** if score < self.config.filter.unsure_threshold: self.num_deleted_spam_fn += 1 ! def GetStats(self, session_only=False): """Return a description of the statistics. --- 71,75 ---- if score < self.config.filter.unsure_threshold: self.num_deleted_spam_fn += 1 ! def GetStats(self, session_only=False, decimal_points=1): """Return a description of the statistics. *************** *** 81,84 **** --- 81,87 ---- those by default. If session-only stats are desired, then a special call to here can be made. + + The percentages will be accurate to the given number of decimal + points. """ num_seen = self.num_ham + self.num_spam + self.num_unsure *************** *** 120,127 **** format_dict.update(dict(perc_spam=perc_spam, perc_ham=perc_ham, perc_unsure=perc_unsure, num_seen=num_seen)) ! push("SpamBayes has processed %(num_seen)d messages - " \ ! "%(num_ham)d (%(perc_ham).0f%%) good, " \ ! "%(num_spam)d (%(perc_spam).0f%%) spam " \ ! "and %(num_unsure)d (%(perc_unsure).0f%%) unsure" % format_dict) if num_recovered_good: push("%(num_recovered_good)d message(s) were manually " \ --- 123,139 ---- format_dict.update(dict(perc_spam=perc_spam, perc_ham=perc_ham, perc_unsure=perc_unsure, num_seen=num_seen)) ! format_dict["perc_ham_s"] = "%%(perc_ham).%df%%(perc)s" \ ! % (decimal_points,) ! format_dict["perc_spam_s"] = "%%(perc_spam).%df%%(perc)s" \ ! % (decimal_points,) ! format_dict["perc_unsure_s"] = "%%(perc_unsure).%df%%(perc)s" \ ! % (decimal_points,) ! format_dict["perc"] = "%" ! push(("SpamBayes has processed %(num_seen)d messages - " \ ! "%(num_ham)d (%(perc_ham_s)s) good, " \ ! "%(num_spam)d (%(perc_spam_s)s) spam " \ ! 
"and %(num_unsure)d (%(perc_unsure_s)s) unsure" \ ! % format_dict) % format_dict) ! if num_recovered_good: push("%(num_recovered_good)d message(s) were manually " \ From anadelonbrin at users.sourceforge.net Thu Oct 21 00:09:05 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 21 00:09:08 2004 Subject: [Spambayes-checkins] spambayes/spambayes classifier.py,1.26,1.27 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24129/spambayes Modified Files: classifier.py Log Message: Fix [ 1051081 ] uncaught socket timeoutexception slurping URLs If a socket.error occurs during the reading process, just bail out without slurping anything. We could generate some sort of synthetic token, but these are likely to be temporary errors (the connecting socket.errors are more likely to be permanent), so will be even worse for msg-token consistency than this option already has the possibility of being. Index: classifier.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/classifier.py,v retrieving revision 1.26 retrieving revision 1.27 diff -C2 -d -r1.26 -r1.27 *** classifier.py 9 Aug 2004 06:50:04 -0000 1.26 --- classifier.py 20 Oct 2004 22:09:01 -0000 1.27 *************** *** 716,729 **** pass ! # Anything that isn't text/html is ignored ! content_type = f.info().get('content-type') ! if content_type is None or \ ! not content_type.startswith("text/html"): ! self.bad_urls["url:non_html"] += (url,) ! return ["url:non_html"] ! page = f.read() ! headers = str(f.info()) ! f.close() fake_message_string = headers + "\r\n" + page --- 716,735 ---- pass ! try: ! # Anything that isn't text/html is ignored ! content_type = f.info().get('content-type') ! if content_type is None or \ ! not content_type.startswith("text/html"): ! self.bad_urls["url:non_html"] += (url,) ! return ["url:non_html"] ! page = f.read() ! headers = str(f.info()) ! 
f.close() ! except socket.error: ! # This is probably a temporary error, like a timeout. ! # For now, just bail out. ! return [] ! fake_message_string = headers + "\r\n" + page From anadelonbrin at users.sourceforge.net Thu Oct 21 05:27:52 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 21 05:27:55 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/docs troubleshooting.html, 1.22, 1.23 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv32584/Outlook2000/docs Modified Files: troubleshooting.html Log Message: Spelling mistake. Index: troubleshooting.html =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/docs/troubleshooting.html,v retrieving revision 1.22 retrieving revision 1.23 diff -C2 -d -r1.22 -r1.23 *** troubleshooting.html 27 May 2004 14:35:39 -0000 1.22 --- troubleshooting.html 21 Oct 2004 03:27:37 -0000 1.23 *************** *** 106,110 **** style="font-style: italic;">Options
    to display the main Options dialog. !
  • Select the tab labeled Other, then click on the Advanced button.
  • --- 106,110 ---- style="font-style: italic;">Options
    to display the main Options dialog. !
  • Select the tab labelled Other, then click on the Advanced button.
  • From anadelonbrin at users.sourceforge.net Fri Oct 22 02:06:48 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 22 02:06:52 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/docs troubleshooting.html, 1.23, 1.24 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/docs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv12825/Outlook2000/docs Modified Files: troubleshooting.html Log Message: Kenny informs me that not only can't Americans spell 'colour', but they can't spell 'labelled', either . Since "labeled" is correct US English, back out my last change. Index: troubleshooting.html =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/docs/troubleshooting.html,v retrieving revision 1.23 retrieving revision 1.24 diff -C2 -d -r1.23 -r1.24 *** troubleshooting.html 21 Oct 2004 03:27:37 -0000 1.23 --- troubleshooting.html 22 Oct 2004 00:06:44 -0000 1.24 *************** *** 106,110 **** style="font-style: italic;">Options to display the main Options dialog. !
  • Select the tab labelled Other, then click on the Advanced button.
  • --- 106,110 ---- style="font-style: italic;">Options to display the main Options dialog. !
  • Select the tab labeled Other, then click on the Advanced button.
  • From anadelonbrin at users.sourceforge.net Fri Oct 22 07:00:54 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Fri Oct 22 07:00:58 2004 Subject: [Spambayes-checkins] spambayes/spambayes message.py, 1.49.4.3, 1.49.4.4 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6871/spambayes Modified Files: Tag: release_1_0-branch message.py Log Message: Oops. My merging left blank lines where it shouldn't have, which meant a syntax error. Remove those. Index: message.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/message.py,v retrieving revision 1.49.4.3 retrieving revision 1.49.4.4 diff -C2 -d -r1.49.4.3 -r1.49.4.4 *** message.py 15 Oct 2004 06:01:01 -0000 1.49.4.3 --- message.py 22 Oct 2004 05:00:51 -0000 1.49.4.4 *************** *** 210,218 **** message_info_db_name = get_pathname_option("Storage", "messageinfo_storage_file") if options["Storage", "persistent_use_database"] is True or \ - options["Storage", "persistent_use_database"] == "dbm": msginfoDB = MessageInfoDB(message_info_db_name) elif options["Storage", "persistent_use_database"] is False or \ - options["Storage", "persistent_use_database"] == "pickle": msginfoDB = MessageInfoPickle(message_info_db_name) --- 210,216 ---- From anadelonbrin at users.sourceforge.net Wed Oct 27 04:25:10 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 27 04:25:13 2004 Subject: [Spambayes-checkins] spambayes/contrib sb_culler.py,1.1,1.2 Message-ID: Update of /cvsroot/spambayes/spambayes/contrib In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv13930/contrib Modified Files: sb_culler.py Log Message: Update to match current open_storage() usage.
Index: sb_culler.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/contrib/sb_culler.py,v retrieving revision 1.1 retrieving revision 1.2 diff -C2 -d -r1.1 -r1.2 *** sb_culler.py 11 Jun 2004 03:16:21 -0000 1.1 --- sb_culler.py 27 Oct 2004 02:25:07 -0000 1.2 *************** *** 346,350 **** # Use SpamBayes to identify spam. Make a local copy then # delete from the server. ! h = hammie.open("cull.spambayes", False, "r") filters.add(IsSpam(h, 0.90), AppendFile("spam.mbox")) --- 346,350 ---- # Use SpamBayes to identify spam. Make a local copy then # delete from the server. ! h = hammie.open("cull.spambayes", "dbm", "r") filters.add(IsSpam(h, 0.90), AppendFile("spam.mbox")) From anadelonbrin at users.sourceforge.net Wed Oct 27 04:36:27 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Wed Oct 27 04:36:30 2004 Subject: [Spambayes-checkins] spambayes/contrib sb_culler.py,1.2,1.3 Message-ID: Update of /cvsroot/spambayes/spambayes/contrib In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv16272/contrib Modified Files: sb_culler.py Log Message: Update with the changes that Andrew Dalke posted to c.l.p today. Weed out duplicates. Add SpamAssassin header checking. Add whitelisting of delivered-to header. Add ability to continue after KeyboardInterrupt. Only restart network after 21 errors, not 1. During delay, let user quit, immediately refilter, or delay for a given time. Print a little indicator while filtering. Make logging subject able to recover from parsing errors. Reload whitelist on demand. Index: sb_culler.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/contrib/sb_culler.py,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** sb_culler.py 27 Oct 2004 02:25:07 -0000 1.2 --- sb_culler.py 27 Oct 2004 02:36:25 -0000 1.3 *************** *** 19,24 **** --- 19,29 ---- be done by editing the code.
+ The virus identification and POP3 manipulation code is based on Kevin + Altis' virus killer code, which I've been gratefully using for the + last several months. + Written by Andrew Dalke, November 2003. Released into the public domain on 2003/11/22. + Updated 2004/10/26 == NO copyright protection asserted for this code. Share and enjoy! == *************** *** 26,35 **** """ ! import sets, traceback import poplib import posixpath ! from email import Header from spambayes import mboxutils, hammie DO_ACTIONS = 1 VERBOSE_LEVEL = 1 --- 31,43 ---- """ ! import sets, traceback, md5, os import poplib import posixpath ! from email import Header, Utils from spambayes import mboxutils, hammie + import socket + socket.setdefaulttimeout(10) + DO_ACTIONS = 1 VERBOSE_LEVEL = 1 *************** *** 113,128 **** class WhiteListFrom: """Test: Read a list of email addresses to use a 'from' whitelist""" def __init__(self, filename): lines = [line.strip().lower() for line in ! open(filename).readlines()] self.addresses = sets.Set(lines) def __call__(self, mi, log): frm = mi.msg["from"] status = (frm is not None) and (frm.lower() in self.addresses) if status: ! 
log.pass_test("'from' white list") return "it is in 'from' white list" return False --- 121,186 ---- + class Duplicate: + def __init__(self): + self.unique = {} + def __call__(self, mi, log): + digest = md5.md5(mi.text).digest() + if digest in self.unique: + log.pass_test(SPAM) + return "duplicate" + self.unique[digest] = 1 + return False + + class IllegalDeliveredTo: + def __init__(self, names): + self.names = names + def __call__(self, mi, log): + fields = mi.msg.get_all("Delivered-To") + if fields is None: + return False + + for field in fields: + field = field.lower() + for name in self.names: + if name in field: + return False + log.pass_test(SPAM) + return "sent to random email" + + class SpamAssassin: + def __init__(self, level = 8): + self.level = level + def __call__(self, mi, log): + if ("*" * self.level) in mi.msg.get("X-Spam-Status", ""): + log.pass_test(SPAM) + return "assassinated!" + return False + class WhiteListFrom: """Test: Read a list of email addresses to use a 'from' whitelist""" def __init__(self, filename): + self.filename = filename + self._mtime = 0 + self._load_if_needed() + + def _load(self): lines = [line.strip().lower() for line in ! open(self.filename).readlines()] self.addresses = sets.Set(lines) + + def _load_if_needed(self): + mtime = os.path.getmtime(self.filename) + if mtime != self._mtime: + print "Reloading", self.filename + self._mtime = mtime + self._load() def __call__(self, mi, log): + self._load_if_needed() frm = mi.msg["from"] + realname, frm = Utils.parseaddr(frm) status = (frm is not None) and (frm.lower() in self.addresses) if status: ! log.pass_test(SPAM) return "it is in 'from' white list" return False *************** *** 212,216 **** def _log_subject(mi, log): encoded_subject = mi.msg.get('subject') ! subject, encoding = Header.decode_header(encoded_subject)[0] if encoding is None or encoding == 'iso-8859-1': s = subject --- 270,278 ---- def _log_subject(mi, log): encoded_subject = mi.msg.get('subject') ! try: ! 
subject, encoding = Header.decode_header(encoded_subject)[0] ! except Header.HeaderParseError: ! log.info("%s Subject cannot be parsed" % (mi.i,)) ! return if encoding is None or encoding == 'iso-8859-1': s = subject *************** *** 230,233 **** --- 292,297 ---- for i in range(1, count+1): + if (i-1) % 10 == 0: + print " == %d/%d ==" % (i, count) # Kevin's code used -1, but -1 doesn't work for one of # my POP accounts, while a million does. *************** *** 298,303 **** try: # Note this this example uses the default password. YMMV. ! urllib.urlopen("http://:admin@192.168.1.1/Gozila.cgi?pppoeAct=2") ! urllib.urlopen("http://:admin@192.168.1.1/Gozila.cgi?pppoeAct=1") except KeyboardInterrupt: raise --- 362,367 ---- try: # Note this this example uses the default password. YMMV. ! urllib.urlopen("http://:admin@192.168.1.1/Gozila.cgi?pppoeAct=2").read() ! urllib.urlopen("http://:admin@192.168.1.1/Gozila.cgi?pppoeAct=1").read() except KeyboardInterrupt: raise *************** *** 329,343 **** filters = Filters() # A list of everyone who has emailed me this year. # Keep their messages on the server. filters.add(WhiteListFrom("good_emails.txt"), KEEP) ! # My mailing lists. Edited to make it slightly harder ! # for spammers to read this description and figure ! # out how to spam me. ! filters.add(WhiteListSubstrings("subject", ! ['[Twisted]', 'CompChem:', '[Bioperl]', ! '[BioPy]', '[SALSA CLUB]', '[Open-bio]', ! '[StarshipCrew]']), KEEP) # Get rid of anything which smells like an exectuable. --- 393,427 ---- filters = Filters() + duplicate = Duplicate() + filters.add(duplicate, AppendFile("spam2.mbox")) + # A list of everyone who has emailed me this year. # Keep their messages on the server. filters.add(WhiteListFrom("good_emails.txt"), KEEP) ! # My mailing lists. ! filters.add(WhiteListSubstrings("subject", [ ! 'ABCD:', ! '[Python-announce]', ! '[Python]', ! '[Bioinfo]', ! '[EuroPython]', ! ]), ! KEEP) ! ! filters.add(WhiteListSubstrings("to", [ ! 
"president@whitehouse.gov", ! "ceo@big.com", ! ]), ! KEEP) ! ! names = ["john", "", "jon", "johnathan"] ! valid_emails = ([name + "@lectroid.com" for name in names] + ! [name + "@bigboote.org" for name in names] + ! ["buckeroo.bonzai@aol.earth"]) ! ! filters.add(IllegalDeliveredTo(valid_emails), DELETE) ! filters.add(SpamAssassin(), AppendFile("spam2.mbox")) ! # Get rid of anything which smells like an exectuable. *************** *** 349,356 **** filters.add(IsSpam(h, 0.90), AppendFile("spam.mbox")) ! # These are my POP3 accounts. (or not ;) server_configs = [("mail.example.com", ! "dalke", "password"), ! ("mail2.spam.com", "dalke", "1234"), ] # The main culling loop. --- 433,441 ---- filters.add(IsSpam(h, 0.90), AppendFile("spam.mbox")) ! # These are my POP3 accounts. server_configs = [("mail.example.com", ! "user@example.com", "password"), ! ("popserver.big.com", "ceo", "12345"), ] ! # The main culling loop. *************** *** 361,367 **** --- 446,455 ---- while 1: error_flag = False + duplicate.unique.clear() # Hack! for server, user, pwd in server_configs: try: log = filter_server( (server, user, pwd), filters) + except KeyboardInterrupt: + raw_input("Press enter to continue. ") except StandardError: raise *************** *** 406,416 **** error_count += 1 ! if error_count > 20: restart_network() error_count = 0 ! wait(3*60) ! ! if __name__ == "__main__": --- 494,520 ---- error_count += 1 ! if error_count > 0: restart_network() error_count = 0 ! delay = 10 * 60 ! while delay: ! try: ! wait(delay) ! break ! except KeyboardInterrupt: ! print ! while 1: ! cmd = raw_input("enter, delay, or quit? ") ! if cmd in ("q", "quit"): ! raise SystemExit(0) ! elif cmd == "": ! delay = 0 ! break ! elif cmd.isdigit(): ! delay = int(cmd) ! break ! else: ! print "Unknown command." 
if __name__ == "__main__": From anadelonbrin at users.sourceforge.net Thu Oct 28 06:29:03 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 28 06:29:07 2004 Subject: [Spambayes-checkins] spambayes/Outlook2000/dialogs dialog_map.py, 1.39, 1.40 opt_processors.py, 1.15, 1.16 Message-ID: Update of /cvsroot/spambayes/spambayes/Outlook2000/dialogs In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28122/Outlook2000/dialogs Modified Files: dialog_map.py opt_processors.py Log Message: Add [ 938992 ] Allow longer background filtering delays. You can now use background filtering delays up to one minute. The slider on the dialog still only goes up to 10 seconds (for fine control), but larger values can be entered via the edit box. Index: dialog_map.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/dialogs/dialog_map.py,v retrieving revision 1.39 retrieving revision 1.40 diff -C2 -d -r1.39 -r1.40 *** dialog_map.py 28 Apr 2004 22:30:13 -0000 1.39 --- dialog_map.py 28 Oct 2004 04:29:00 -0000 1.40 *************** *** 478,483 **** IDC_DELAY2_TEXT IDC_DELAY2_SLIDER IDC_INBOX_TIMER_ONLY"""), ! (EditNumberProcessor, "IDC_DELAY1_TEXT IDC_DELAY1_SLIDER", "Filter.timer_start_delay", 0, 10, 20), ! (EditNumberProcessor, "IDC_DELAY2_TEXT IDC_DELAY2_SLIDER", "Filter.timer_interval", 0, 10, 20), (BoolButtonProcessor, "IDC_INBOX_TIMER_ONLY", "Filter.timer_only_receive_folders"), (StatsProcessor, "IDC_STATISTICS"), --- 478,483 ---- IDC_DELAY2_TEXT IDC_DELAY2_SLIDER IDC_INBOX_TIMER_ONLY"""), ! (EditNumberProcessor, "IDC_DELAY1_TEXT IDC_DELAY1_SLIDER", "Filter.timer_start_delay", 0, 10, 20, 60), ! 
(EditNumberProcessor, "IDC_DELAY2_TEXT IDC_DELAY2_SLIDER", "Filter.timer_interval", 0, 10, 20, 60), (BoolButtonProcessor, "IDC_INBOX_TIMER_ONLY", "Filter.timer_only_receive_folders"), (StatsProcessor, "IDC_STATISTICS"), Index: opt_processors.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/Outlook2000/dialogs/opt_processors.py,v retrieving revision 1.15 retrieving revision 1.16 diff -C2 -d -r1.15 -r1.16 *** opt_processors.py 16 Dec 2003 05:06:33 -0000 1.15 --- opt_processors.py 28 Oct 2004 04:29:00 -0000 1.16 *************** *** 168,175 **** class EditNumberProcessor(OptionControlProcessor): ! def __init__(self, window, control_ids, option, min_val = 0, max_val = 100, ticks = 100): self.slider_id = control_ids and control_ids[1] self.min_val = min_val self.max_val = max_val self.ticks = ticks OptionControlProcessor.__init__(self, window, control_ids, option) --- 168,177 ---- class EditNumberProcessor(OptionControlProcessor): ! def __init__(self, window, control_ids, option, min_val=0, max_val=100, ! ticks=100, max_edit_val=100): self.slider_id = control_ids and control_ids[1] self.min_val = min_val self.max_val = max_val + self.max_edit_val = max_edit_val self.ticks = ticks OptionControlProcessor.__init__(self, window, control_ids, option) *************** *** 246,250 **** str_val = buf[:nchars] val = float(str_val) ! if val < self.min_val or val > self.max_val: raise ValueError, "Value must be between %d and %d" % (self.min_val, self.max_val) self.SetOptionValue(val) --- 248,252 ---- str_val = buf[:nchars] val = float(str_val) ! 
if val < self.min_val or val > self.max_edit_val: raise ValueError, "Value must be between %d and %d" % (self.min_val, self.max_val) self.SetOptionValue(val) From anadelonbrin at users.sourceforge.net Thu Oct 28 07:11:21 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 28 07:11:25 2004 Subject: [Spambayes-checkins] spambayes/spambayes storage.py,1.42,1.43 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3439/spambayes Modified Files: storage.py Log Message: Add [ 715248 ] Pickle classifier should save to a temp file first Overkill protection, probably, but last time I brought this up on spambayes-dev, people thought it couldn't hurt, so add it and get the tracker closed. Pickles now save to a temp file first. Once that's done, with *nix the file is (atomically) replaced by the new one. Otherwise the old one is renamed, the new one is moved into its place, and the old one deleted. There should always be a valid copy saved, even if it has the wrong name. Index: storage.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/storage.py,v retrieving revision 1.42 retrieving revision 1.43 diff -C2 -d -r1.42 -r1.43 *** storage.py 12 Oct 2004 23:27:29 -0000 1.42 --- storage.py 28 Oct 2004 05:11:19 -0000 1.43 *************** *** 63,66 **** --- 63,67 ---- return not not val + import os import sys import types *************** *** 138,144 **** print >> sys.stderr, 'Persisting',self.db_name,'as a pickle' ! fp = open(self.db_name, 'wb') ! pickle.dump(self, fp, PICKLE_TYPE) ! fp.close() def close(self): --- 139,167 ---- print >> sys.stderr, 'Persisting',self.db_name,'as a pickle' ! # Be as defensive as possible; keep always a safe copy. ! tmp = self.db_name + '.tmp' ! try: ! fp = open(tmp, 'wb') ! pickle.dump(self, fp, PICKLE_TYPE) ! fp.close() ! except IOError, e: ! if options["globals", "verbose"]: ! print 'Failed update: ' + str(e) !
if fp is not None: ! os.remove(tmp) ! raise ! try: ! # With *nix we can just rename, and (as long as permissions ! # are correct) the old file will vanish. With win32, this ! # won't work - the Python help says that there may not be ! # a way to do an atomic replace, so we rename the old one, ! # put the new one there, and then delete the old one. If ! # something goes wrong, there is at least a copy of the old ! # one. ! os.rename(tmp, self.db_name) ! except OSError: ! os.rename(self.db_name, self.db_name + '.bak') ! os.rename(tmp, self.db_name) ! os.remove(self.db_name + '.bak') def close(self): From anadelonbrin at users.sourceforge.net Thu Oct 28 07:45:49 2004 From: anadelonbrin at users.sourceforge.net (Tony Meyer) Date: Thu Oct 28 07:45:52 2004 Subject: [Spambayes-checkins] spambayes/spambayes/test test_sb-server.py, 1.6, 1.7 Message-ID: Update of /cvsroot/spambayes/spambayes/spambayes/test In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv8822/spambayes/test Modified Files: test_sb-server.py Log Message: Changing "Spambayes" to "SpamBayes" killed our test. Fix that. Index: test_sb-server.py =================================================================== RCS file: /cvsroot/spambayes/spambayes/spambayes/test/test_sb-server.py,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** test_sb-server.py 3 Aug 2004 06:51:00 -0000 1.6 --- test_sb-server.py 28 Oct 2004 05:45:47 -0000 1.7 *************** *** 320,324 **** if not packet: break response += packet ! assert re.search(r"(?s).*Spambayes proxy.*", response) # Kill the proxy and the test server. --- 320,324 ---- if not packet: break response += packet ! assert re.search(r"(?s).*SpamBayes proxy.*", response) # Kill the proxy and the test server. 
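
Note on the storage.py check-in above: the save-to-temp-then-rename idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the code that was committed. It is written for current Python 3 rather than the Python 2 of the diff, the function name save_pickle_atomically is made up for the example, and it relies on os.replace, which did not exist when this change was made.

```python
import os
import pickle
import tempfile


def save_pickle_atomically(obj, path, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj to path without ever leaving a half-written file there.

    Hypothetical helper for illustration; not part of SpamBayes.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays
    # on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fp:
            pickle.dump(obj, fp, protocol)
        # os.replace overwrites an existing target on both POSIX and
        # Windows, so no '.bak' shuffle is needed here.
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temp file; the previous pickle is still intact.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```

Because the temporary file is only swapped in after the dump has finished, a failure part-way through leaves the old pickle untouched, which is the property the log message asks for.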

From anadelonbrin at users.sourceforge.net Thu Oct 28 09:23:26 2004
From: anadelonbrin at users.sourceforge.net (Tony Meyer)
Date: Thu Oct 28 09:23:30 2004
Subject: [Spambayes-checkins] spambayes/spambayes/test test_smtpproxy.py, 1.2, 1.3
Message-ID:

Update of /cvsroot/spambayes/spambayes/spambayes/test
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv25214/spambayes/test

Modified Files:
        test_smtpproxy.py
Log Message:
This doesn't test all that much yet, but at least it works now (finally!).

Closes the last of [ 981970 ] tests failing

Index: test_smtpproxy.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/test/test_smtpproxy.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** test_smtpproxy.py 16 Dec 2003 05:06:34 -0000 1.2
--- test_smtpproxy.py 28 Oct 2004 07:23:24 -0000 1.3
***************
*** 3,8 ****
  """Test that the SMTP proxy is working correctly.

- When using the -z command line option, carries out various tests.
-
  The -t option runs a fake SMTP server on port 8025. This is the
  same server that the testing option uses, and may be separately run for
--- 3,6 ----
***************
*** 20,24 ****
  """

! # This module is part of the spambayes project, which is Copyright 2002-3
  # The Python Software Foundation and is covered by the Python Software
  # Foundation license.
--- 18,22 ----
  """

! # This module is part of the spambayes project, which is Copyright 2002-4
  # The Python Software Foundation and is covered by the Python Software
  # Foundation license.
***************
*** 66,69 ****
--- 64,68 ----

  import re
+ import sys
  import socket
  import getopt
***************
*** 71,90 ****
  import operator
  import unittest
! import threading
  import smtplib

! # We need to import sb_server, but it may not be on the PYTHONPATH.
! # Hack around this, so that if we are running in a cvs-like setup
! # everything still works.
! import os
! import sys
! try:
!     this_file = __file__
! except NameError:
!     this_file = sys.argv[0]
! sb_dir = os.path.abspath(os.path.dirname(os.path.dirname(os.path.dirname(this_file))))
! if sb_dir not in sys.path:
!     sys.path.append(sb_dir)
!     sys.path.append(os.path.join(sb_dir, "scripts"))

  from spambayes import Dibbler
--- 70,78 ----
  import operator
  import unittest
! import thread
  import smtplib

! import sb_test_support
! sb_test_support.fix_sys_path()

  from spambayes import Dibbler
***************
*** 92,102 ****
  from spambayes.Options import options
  from sb_server import state, _recreateState
! from spambayes.smtpproxy import BayesSMTPProxyListener
  from spambayes.ProxyUI import ProxyUserInterface
  from spambayes.UserInterface import UserInterfaceServer

  class TestListener(Dibbler.Listener):
!     """Listener for TestPOP3Server. Works on port 8025, because 8025
!     wouldn't work for Tony."""

      def __init__(self, socketMap=asyncore.socket_map):
--- 80,90 ----
  from spambayes.Options import options
  from sb_server import state, _recreateState
! from spambayes.smtpproxy import BayesSMTPProxyListener, SMTPTrainer
  from spambayes.ProxyUI import ProxyUserInterface
  from spambayes.UserInterface import UserInterfaceServer
+ from spambayes.classifier import Classifier

  class TestListener(Dibbler.Listener):
!     """Listener for TestSMTPServer."""

      def __init__(self, socketMap=asyncore.socket_map):
***************
*** 114,118 ****
          # Grumble: asynchat.__init__ doesn't take a 'map' argument,
          # hence the two-stage construction.
!         Dibbler.BrighterAsyncChat.__init__(self)
          Dibbler.BrighterAsyncChat.set_socket(self, clientSocket, socketMap)
          self.set_terminator('\r\n')
--- 102,106 ----
          # Grumble: asynchat.__init__ doesn't take a 'map' argument,
          # hence the two-stage construction.
!         Dibbler.BrighterAsyncChat.__init__(self, map=socketMap)
          Dibbler.BrighterAsyncChat.set_socket(self, clientSocket, socketMap)
          self.set_terminator('\r\n')
***************
*** 131,138 ****
          """Asynchat override."""
          self.request = self.request + data
-         print "data", data

      def push(self, data):
-         print "pushing", repr(data)
          Dibbler.BrighterAsyncChat.push(self, data)

--- 119,124 ----
***************
*** 142,146 ****
              return Dibbler.BrighterAsyncChat.recv(self, buffer_size)
          except socket.error, e:
!             if e[0] == 10053:
                  return ''
              raise
--- 128,132 ----
              return Dibbler.BrighterAsyncChat.recv(self, buffer_size)
          except socket.error, e:
!             if e[0] == 10035:
                  return ''
              raise
***************
*** 161,170 ****
                      cooked = handler(self.request[len(cmd):])
                      if cooked is not None:
!                         self.push(cooked.strip())
                      foundCmd = True
                      break
              if not foundCmd:
                  # Something we don't know about. Assume that it is ok!
!                 self.push("250 Unknown command ok.\r\n")
              self.request = ''

--- 147,157 ----
                      cooked = handler(self.request[len(cmd):])
                      if cooked is not None:
!                         self.push(cooked)
                      foundCmd = True
                      break
              if not foundCmd:
                  # Something we don't know about. Assume that it is ok!
!                 self.push("250 Unknown command %s ok.\r\n" %
!                           (self.request,))
              self.request = ''

***************
*** 189,192 ****
--- 176,180 ----
              return "504 This command should not have got to the server\r\n"
          return "250 %s... Recipient ok\r\n" % (args.lower(),)
+
      def onData(self, args):
          self.inData = True
***************
*** 198,233 ****
      that receives mail and discards it."""
      def setUp(self):
!         # Run a proxy and a test server in separate threads with separate
!         # asyncore environments. Don't bother with the UI.
!         state.isTest = True
!         testServerReady = threading.Event()
!         def runTestServer():
!             testSocketMap = {}
!             #TestListener(socketMap=testSocketMap)
!             testServerReady.set()
!             #asyncore.loop(map=testSocketMap)
!
!         proxyReady = threading.Event()
!         def runProxy():
!             trainer = None
!             BayesSMTPProxyListener('localhost', 8025, ('', 8026), trainer)
!             proxyReady.set()
!             Dibbler.run()
!
!         serverThread = threading.Thread(target=runTestServer)
!         serverThread.setDaemon(True)
!         serverThread.start()
!         testServerReady.wait()
!         proxyThread = threading.Thread(target=runProxy)
!         proxyThread.setDaemon(True)
!         proxyThread.start()
!         proxyReady.wait()

      def tearDown(self):
!         return
!         # Kill the proxy and the test server.
!         s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
!         s.connect(('localhost', 8025))
!         s.send("kill\r\n")

      def test_direct_connection(self):
--- 186,193 ----
      that receives mail and discards it."""
      def setUp(self):
!         pass

      def tearDown(self):
!         pass

      def test_direct_connection(self):
***************
*** 265,269 ****
          proxy.send('quit\r\n')

!     def qtest_disconnection(self):
          proxy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          proxy.connect(('localhost', 8025))
--- 225,229 ----
          proxy.send('quit\r\n')

!     def test_disconnection(self):
          proxy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          proxy.connect(('localhost', 8025))
***************
*** 271,277 ****
              response = proxy.recv(100)
          except socket.error, e:
!             if e[0] == 10035:
!                 # non-blocking socket so that the recognition
!                 # can proceed, so this doesn't mean much
                  pass
              else:
--- 231,236 ----
              response = proxy.recv(100)
          except socket.error, e:
!             if e[0] == 10053:
!                 # Socket is dead, which is what we want.
                  pass
              else:
***************
*** 281,287 ****
              response = proxy.recv(100)
          except socket.error, e:
!             if e[0] == 10035:
!                 # non-blocking socket so that the recognition
!                 # can proceed, so this doesn't mean much
                  pass
              else:
--- 240,245 ----
              response = proxy.recv(100)
          except socket.error, e:
!             if e[0] == 10053:
!                 # Socket is dead, which is what we want.
                  pass
              else:
***************
*** 291,300 ****

      def test_sendmessage(self):
!         try:
!             s = smtplib.SMTP('localhost', 8026)
!             s.sendmail("ta-meyer@ihug.co.nz", "ta-meyer@ihug.co.nz", good1)
!             s.quit()
!         except:
!             self.fail("Couldn't send a message through.")

  def suite():
--- 249,264 ----

      def test_sendmessage(self):
!         s = smtplib.SMTP('localhost', 8026)
!         s.sendmail("ta-meyer@ihug.co.nz", "ta-meyer@ihug.co.nz", good1)
!         s.quit()
!
!     def test_ham_intercept(self):
!         pre_ham_trained = bayes.nham
!         s = smtplib.SMTP('localhost', 8026)
!         s.sendmail("ta-meyer@ihug.co.nz",
!                    options["smtpproxy", "ham_address"], good1)
!         s.quit()
!         post_ham_trained = bayes.nham
!         self.assertEqual(pre_ham_trained+1, post_ham_trained)

  def suite():
***************
*** 306,315 ****
      # Read the arguments.
      try:
!         opts, args = getopt.getopt(sys.argv[1:], 'htz')
      except getopt.error, msg:
          print >>sys.stderr, str(msg) + '\n\n' + __doc__
          sys.exit()

-     runSelfTest = False
      for opt, arg in opts:
          if opt == '-h':
--- 270,278 ----
      # Read the arguments.
      try:
!         opts, args = getopt.getopt(sys.argv[1:], 'ht')
      except getopt.error, msg:
          print >>sys.stderr, str(msg) + '\n\n' + __doc__
          sys.exit()

      for opt, arg in opts:
          if opt == '-h':
***************
*** 319,325 ****
              state.isTest = True
              state.runTestServer = True
-         elif opt == '-z':
-             state.isTest = True
-             runSelfTest = True

      state.createWorkers()
--- 282,285 ----
***************
*** 330,335 ****
          asyncore.loop()
      else:
          state.buildServerStrings()
!         unittest.main(argv=sys.argv + ['suite'])

  if __name__ == '__main__':
--- 290,309 ----
          asyncore.loop()
      else:
+         state.isTest = True
          state.buildServerStrings()
!         testSocketMap = {}
!         def runTestServer():
!             TestListener(socketMap=testSocketMap)
!             asyncore.loop(map=testSocketMap)
!         def runProxy():
!             global bayes
!             bayes = Classifier()
!             trainer = SMTPTrainer(bayes, state)
!             BayesSMTPProxyListener('localhost', 8025, ('', 8026), trainer)
!             Dibbler.run()
!         thread.start_new_thread(runTestServer, ())
!         thread.start_new_thread(runProxy, ())
!         sb_test_support.unittest_main(argv=sys.argv + ['suite'])
!
  if __name__ == '__main__':

From anadelonbrin at users.sourceforge.net Fri Oct 29 02:14:45 2004
From: anadelonbrin at users.sourceforge.net (Tony Meyer)
Date: Fri Oct 29 02:14:48 2004
Subject: [Spambayes-checkins] spambayes/spambayes ProxyUI.py, 1.50, 1.51 classifier.py, 1.27, 1.28 tokenizer.py, 1.32, 1.33
Message-ID:

Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv16988/spambayes

Modified Files:
        ProxyUI.py classifier.py tokenizer.py
Log Message:
As I understand it, this gives Python 2.4 users a free speedup.

If possible, use the builtin (faster, C-implemented) set class, falling
back to sets.Set, then back to our compatsets.Set

Index: ProxyUI.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/ProxyUI.py,v
retrieving revision 1.50
retrieving revision 1.51
diff -C2 -d -r1.50 -r1.51
*** ProxyUI.py 12 Oct 2004 23:44:00 -0000 1.50
--- ProxyUI.py 29 Oct 2004 00:14:42 -0000 1.51
***************
*** 61,67 ****

  try:
!     from sets import Set
! except ImportError:
!     from compatsets import Set

  import tokenizer
--- 61,74 ----

  try:
!     # We have three possibilities for Set:
!     # (a) With Python 2.2 and earlier, we use our compatsets class
!     # (b) With Python 2.3, we use the sets.Set class
!     # (c) With Python 2.4 and later, we use the builtin set class
!     Set = set
! except NameError:
!     try:
!         from sets import Set
!     except ImportError:
!         from spambayes.compatsets import Set

  import tokenizer

Index: classifier.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/classifier.py,v
retrieving revision 1.27
retrieving revision 1.28
diff -C2 -d -r1.27 -r1.28
*** classifier.py 20 Oct 2004 22:09:01 -0000 1.27
--- classifier.py 29 Oct 2004 00:14:42 -0000 1.28
***************
*** 41,47 ****
  import types
  try:
!     from sets import Set
! except ImportError:
!     from spambayes.compatsets import Set

  # XXX At time of writing, these are only necessary for the
--- 41,54 ----
  import types
  try:
!     # We have three possibilities for Set:
!     # (a) With Python 2.2 and earlier, we use our compatsets class
!     # (b) With Python 2.3, we use the sets.Set class
!     # (c) With Python 2.4 and later, we use the builtin set class
!     Set = set
! except NameError:
!     try:
!         from sets import Set
!     except ImportError:
!         from spambayes.compatsets import Set

  # XXX At time of writing, these are only necessary for the

Index: tokenizer.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/tokenizer.py,v
retrieving revision 1.32
retrieving revision 1.33
diff -C2 -d -r1.32 -r1.33
*** tokenizer.py 5 Aug 2004 00:56:53 -0000 1.32
--- tokenizer.py 29 Oct 2004 00:14:42 -0000 1.33
***************
*** 17,23 ****
  import urllib
  try:
!     from sets import Set
! except ImportError:
!     from compatsets import Set

  from spambayes import classifier
--- 17,30 ----
  import urllib
  try:
!     # We have three possibilities for Set:
!     # (a) With Python 2.2 and earlier, we use our compatsets class
!     # (b) With Python 2.3, we use the sets.Set class
!     # (c) With Python 2.4 and later, we use the builtin set class
!     Set = set
! except NameError:
!     try:
!         from sets import Set
!     except ImportError:
!         from spambayes.compatsets import Set

  from spambayes import classifier

From anadelonbrin at users.sourceforge.net Fri Oct 29 03:58:16 2004
From: anadelonbrin at users.sourceforge.net (Tony Meyer)
Date: Fri Oct 29 03:58:19 2004
Subject: [Spambayes-checkins] spambayes/languages - New directory
Message-ID:

Update of /cvsroot/spambayes/spambayes/languages
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv7438/languages

Log Message:
Directory /cvsroot/spambayes/spambayes/languages added to the repository
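
Note on the Set fallback in the ProxyUI.py, classifier.py and tokenizer.py check-in above: the same three-step lookup is pasted into each module. One way to express it once is sketched below, as an illustration only. The function name resolve_set_type is hypothetical and is not part of SpamBayes, and on any current Python the first branch always succeeds, so the two fallbacks only matter on the old interpreters the log message mentions.

```python
def resolve_set_type():
    """Return the best available set type, preferring the builtin.

    Hypothetical helper for illustration; not part of SpamBayes.
    """
    try:
        # Builtin, C-implemented set: Python 2.4 and later.
        return set
    except NameError:
        pass
    try:
        # Standard-library sets module: Python 2.3.
        from sets import Set
        return Set
    except ImportError:
        # Last resort: the project's own pure-Python class.
        from spambayes.compatsets import Set
        return Set


Set = resolve_set_type()

# Callers keep using the one name, whichever class was picked.
tokens = Set(["free", "money", "free"])
```

The callers only depend on behaviour common to all three classes (construction from an iterable, membership tests, length), which is what makes the substitution safe.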