[Spambayes-checkins] spambayes/spambayes Options.py,1.6,1.7 hammie.py,1.2,1.3

Tue Jan 21 06:51:05 EST 2003

Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv6690/spambayes

Modified Files:
	Options.py hammie.py 
Log Message:
* hammiefilter now has -t option for filter/train step
* Options has new hammie_train_on_filter and hammie_trained_header options
* hammie.py:Hammie.filter has new train kwarg to support filter/train in
  one step.
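
For readers skimming the log, the filter/train decision added to Hammie.filter can be sketched on its own, without the spambayes modules. This is only an illustrative sketch: the function name `disposition_for`, the default cutoffs, and the literal label strings are assumptions made for the example, not values taken from Options.py. The branch structure follows the hammie.py diff, including the detail that an unsure message is trained as ham.

```python
from email.message import Message


def disposition_for(prob, ham_cutoff=0.2, spam_cutoff=0.9,
                    ham="ham", spam="spam", unsure="unsure"):
    """Return (disp, is_spam, trained) for a score in [0, 1].

    Hypothetical helper: cutoffs and label strings are placeholder
    assumptions standing in for the values read from the Options file.
    """
    if prob < ham_cutoff:
        return ham, False, ham
    elif prob > spam_cutoff:
        return spam, True, spam
    else:
        # As in the diff, an unsure message is still trained as ham.
        return unsure, False, ham


def tag_trained(msg, prob, train=False,
                trained_header="X-Spambayes-Trained"):
    """Add the 'trained' header when train is true; return the disposition.

    The actual training call (self.train(msg, is_spam) in the diff) is
    omitted here because it needs a real classifier database.
    """
    disp, is_spam, trained = disposition_for(prob)
    if train:
        msg.add_header(trained_header, trained)
    return disp


msg = Message()
result = tag_trained(msg, 0.5, train=True)
```

With these placeholder cutoffs, a score of 0.5 yields the unsure disposition while the message is marked as trained as ham, which is why the log message stresses retraining mistakes.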


Index: Options.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/Options.py,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -d -r1.6 -r1.7
*** Options.py	17 Jan 2003 22:23:59 -0000	1.6
--- Options.py	21 Jan 2003 14:50:26 -0000	1.7
***************
*** 343,347 ****
  # Name of a debugging header for spambayes hackers, showing the strongest
  # clues that have resulted in the classification in the standard header.
! hammie_debug_header_name: X-Hammie-Debug
  
  # The range of clues that are added to the "debug" header in the E-mail
--- 343,358 ----
  # Name of a debugging header for spambayes hackers, showing the strongest
  # clues that have resulted in the classification in the standard header.
! hammie_debug_header_name: X-Spambayes-Debug
! 
! # Train when filtering?  After filtering a message, hammie can then
! # train itself on the judgement (ham or spam).  This can speed things up
! # with a procmail-based solution.  If you do enable this, please make
! # sure to retrain any mistakes.  Otherwise, your word database will
! # slowly become useless.
! hammie_train_on_filter: False
! 
! # When training on a message, the name of the header to add with how it
! # was trained
! hammie_trained_header: X-Spambayes-Trained
  
  # The range of clues that are added to the "debug" header in the E-mail
***************
*** 463,466 ****
--- 474,479 ----
                 'hammie_debug_header': boolean_cracker,
                 'hammie_debug_header_name': string_cracker,
+                'hammie_train_on_filter': boolean_cracker,
+                'hammie_trained_header': string_cracker,
                 },
      'hammiefilter' : {'hammiefilter_persistent_use_database': boolean_cracker,

Index: hammie.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/hammie.py,v
retrieving revision 1.2
retrieving revision 1.3
diff -C2 -d -r1.2 -r1.3
*** hammie.py	14 Jan 2003 05:38:20 -0000	1.2
--- hammie.py	21 Jan 2003 14:50:27 -0000	1.3
***************
*** 62,66 ****
      def filter(self, msg, header=None, spam_cutoff=None,
                 ham_cutoff=None, debugheader=None,
!                debug=None):
          """Score (judge) a message and add a disposition header.
  
--- 62,66 ----
      def filter(self, msg, header=None, spam_cutoff=None,
                 ham_cutoff=None, debugheader=None,
!                debug=None, train=None):
          """Score (judge) a message and add a disposition header.
  
***************
*** 74,77 ****
--- 74,82 ----
          The name of the debugging header is given as 'debugheader'.
  
+         If 'train' is True, also train on the result of scoring the
+         message (ie. train as ham if it's ham, train as spam if it's
+         spam).  You'll want to be very diligent about retraining
+         mistakes if you use this.
+ 
          All defaults for optional parameters come from the Options file.
  
***************
*** 90,93 ****
--- 95,100 ----
          if debug == None:
              debug = options.hammie_debug_header
+         if train == None:
+             train = options.hammie_train_on_filter
  
          msg = mboxutils.get_message(msg)
***************
*** 98,106 ****
          prob, clues = self._scoremsg(msg, True)
          if prob < ham_cutoff:
!             disp = options.header_ham_string
          elif prob > spam_cutoff:
!             disp = options.header_spam_string
          else:
              disp = options.header_unsure_string
          disp += ("; %."+str(options.header_score_digits)+"f") % prob
          if options.header_score_logarithm:
--- 105,122 ----
          prob, clues = self._scoremsg(msg, True)
          if prob < ham_cutoff:
!             is_spam = False
!             trained = options.header_ham_string
!             disp = trained
          elif prob > spam_cutoff:
!             is_spam = True
!             trained = options.header_spam_string
!             disp = trained
          else:
+             is_spam = False
+             trained = options.header_ham_string
              disp = options.header_unsure_string
+         if train:
+             self.train(msg, is_spam)
+             msg.add_header(options.hammie_trained_header, trained)
          disp += ("; %."+str(options.header_score_digits)+"f") % prob
          if options.header_score_logarithm: