[Spambayes-checkins] spambayes/contrib tte.py,1.13,1.14

Skip Montanaro montanaro at users.sourceforge.net
Tue Aug 17 19:05:00 CEST 2004


Update of /cvsroot/spambayes/spambayes/contrib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv28787

Modified Files:
	tte.py 
Log Message:
It seems better to alternate ham and spam scoring instead of scoring all
the hams in a batch and then all the spams.  After implementing the ratio
stuff I began to have problems.  I think scoring all the hams and then all
the spams within a chunk was the cause.


Index: tte.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/contrib/tte.py,v
retrieving revision 1.13
retrieving revision 1.14
diff -C2 -d -r1.13 -r1.14
*** tte.py	26 Jul 2004 02:46:49 -0000	1.13
--- tte.py	17 Aug 2004 17:04:42 -0000	1.14
***************
*** 144,169 ****
                  sys.stdout.flush()
  
!                 for ham in hams:
!                     score = store.spamprob(tokenize(ham))
!                     selector = ham["message-id"] or ham["subject"]
!                     if score > ham_cutoff and selector is not None:
!                         if verbose:
!                             print >> sys.stderr, "miss ham: %.6f %s" % (
!                                 score, selector)
!                         hmisses += 1
!                         tdict[ham["message-id"]] = True
!                         store.learn(tokenize(ham), False)
  
!                 for spam in spams:
!                     score = store.spamprob(tokenize(spam))
!                     selector = (spam["message-id"] or
!                                 spam["subject"])
!                     if score < spam_cutoff and selector is not None:
!                         if verbose:
!                             print >> sys.stderr, "miss spam: %.6f %s" % (
!                                 score, selector)
!                         smisses += 1
!                         tdict[spam["message-id"]] = True
!                         store.learn(tokenize(spam), True)
  
          except StopIteration:
--- 144,170 ----
                  sys.stdout.flush()
  
!                 for (ham, spam) in map(None, hams, spams):
!                     if ham is not None:
!                         score = store.spamprob(tokenize(ham))
!                         selector = ham["message-id"] or ham["subject"]
!                         if score > ham_cutoff and selector is not None:
!                             if verbose:
!                                 print >> sys.stderr, "miss ham: %.6f %s" % (
!                                     score, selector)
!                             hmisses += 1
!                             tdict[ham["message-id"]] = True
!                             store.learn(tokenize(ham), False)
  
!                     if spam is not None:
!                         score = store.spamprob(tokenize(spam))
!                         selector = (spam["message-id"] or
!                                     spam["subject"])
!                         if score < spam_cutoff and selector is not None:
!                             if verbose:
!                                 print >> sys.stderr, "miss spam: %.6f %s" % (
!                                     score, selector)
!                             smisses += 1
!                             tdict[spam["message-id"]] = True
!                             store.learn(tokenize(spam), True)
  
          except StopIteration:
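
For readers on Python 3: map(None, hams, spams) is a Python 2 idiom that
pairs items from both sequences and pads the shorter one with None.  In
Python 3 that call raises TypeError, and itertools.zip_longest is the usual
replacement.  The sketch below shows only the interleaving order that the new
loop produces.  The helper name alternate() and the string stand-ins for
messages are illustrative and are not part of tte.py.

```python
from itertools import zip_longest


def alternate(hams, spams):
    """Yield (is_spam, msg) pairs, alternating one ham and one spam.

    Hypothetical helper, not part of tte.py.  zip_longest pads the
    shorter sequence with None, so a None entry means that side has
    run out and is skipped.  This assumes no real message is None.
    """
    for ham, spam in zip_longest(hams, spams):
        if ham is not None:
            yield False, ham   # same flag as store.learn(..., False)
        if spam is not None:
            yield True, spam   # same flag as store.learn(..., True)


# Toy stand-ins for messages: three hams and one spam in a chunk.
order = list(alternate(["h1", "h2", "h3"], ["s1"]))
```

With uneven chunks, the leftover messages of the longer list are still
visited, just without a partner from the other list.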


