[Spambayes-checkins] spambayes/spambayes Stats.py,1.10,1.11

Tue Dec 21 22:41:52 CET 2004

Update of /cvsroot/spambayes/spambayes/spambayes
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv25155/spambayes

Modified Files:
	Stats.py 
Log Message:
Merge the Outlook2000.oastats.Stats class and the spambayes.Stats.Stats class into
 a single class that all the applications can use.  Hopefully this combines the best
 features of each, and makes future improvements much easier, since there is now
 only one place to make them.

The stats are now sourced from the messageinfo db, so the pickle that the Outlook
 plug-in was using is no longer needed (it was only ever used by CVS users, who can
 delete it manually).  Resetting is done by counting only messages modified since a
 stored start date; a reset sets that date to the current date/time.
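
A minimal sketch of the date-based reset described above, not the committed
code.  The records are plain dicts and count_since is a hypothetical helper;
the real class reads date_modified from the messageinfo db.  Keeping records
that have no timestamp is an assumption made for this sketch.

```python
import time

def count_since(records, from_date=None):
    """Count records modified at or after from_date (hypothetical helper).

    records: iterable of dicts with an optional "date_modified" timestamp.
    from_date: seconds since the epoch, or None to count everything.
    """
    count = 0
    for rec in records:
        modified = rec.get("date_modified")
        # Assumption: undated records are kept rather than skipped.
        if (from_date is not None and modified is not None
                and modified < from_date):
            continue
        count += 1
    return count

records = [
    {"id": "a", "date_modified": 100.0},
    {"id": "b", "date_modified": 200.0},
    {"id": "c"},  # no timestamp recorded
]

print(count_since(records))               # no cut-off: every record counts
print(count_since(records, 150.0))        # "a" is older than the cut-off
# "Resetting" only moves the cut-off to now; nothing is deleted.
print(count_since(records, time.time()))  # only the undated record remains
```

The point of the design is that a reset never destroys data: moving the
start date back again would bring the older messages back into the totals.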

Per-session and persistent (total) stats are both supported, in much the same way
 as the Outlook stats have always worked.  The other applications can now use these
 too, although they do not yet.  The code is moderately backwards compatible.

The code that generates the stats strings is now split into three helper functions
 (_CombineSessionAndTotal, _CalculateAdditional and _AddPercentStrings), to make it
 easier to test (and, hopefully, to modify).
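
The string generation relies on formatting each line twice: the first pass
substitutes a per-percentage template that carries the number of decimal
places, and the second pass fills in the numbers.  The sketch below shows
that idea in isolation; add_percent_strings and render are hypothetical
names, not functions from Stats.py.

```python
def add_percent_strings(data, names, dp):
    """Store a format template per percentage, with dp decimal places."""
    for name in names:
        # For name "perc_ham" and dp 1 this stores "%(perc_ham).1f%(perc)s".
        data[name + "_s"] = "%%(%s).%df%%(perc)s" % (name, dp)
    # A literal percent sign, looked up by the second formatting pass.
    data["perc"] = "%"
    return data

def render(template, data):
    # First pass substitutes the templates, second pass fills in the numbers.
    return (template % data) % data

data = {"num_ham": 3, "num_seen": 4}
data["perc_ham"] = 100.0 * data["num_ham"] / data["num_seen"]
add_percent_strings(data, ["perc_ham"], 1)
print(render("Good: %(num_ham)d (%(perc_ham_s)s)", data))
```

Keeping the precision in the data rather than in the message text means the
translatable strings do not need to change when the precision does.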

Index: Stats.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/spambayes/Stats.py,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** Stats.py	21 Dec 2004 16:21:59 -0000	1.10
--- Stats.py	21 Dec 2004 21:41:49 -0000	1.11
***************
*** 1,5 ****
  #! /usr/bin/env python

! """Stats.py - Spambayes statistics class.

  Classes:
--- 1,5 ----
  #! /usr/bin/env python

! """Stats.py - SpamBayes statistics class.

  Classes:
***************
*** 14,29 ****
      is <wink>.

  To Do:
      o People would like pretty graphs, so maybe that could be done.
      o People have requested time-based statistics - mail per hour,
        spam per hour, and so on.
!     o The possible stats to show are pretty much endless.  Some to
!       consider would be: percentage of mail that is fp/fn/unsure,
!       percentage of mail correctly classified.
      o Suggestions?
- 
  """

! # This module is part of the spambayes project, which is Copyright 2002-4
  # The Python Software Foundation and is covered by the Python Software
  # Foundation license.
--- 14,31 ----
      is <wink>.

+     This class provides information for both the web interface, the
+     Outlook plug-in, and sb_pop3dnd.
+ 
  To Do:
      o People would like pretty graphs, so maybe that could be done.
      o People have requested time-based statistics - mail per hour,
        spam per hour, and so on.
!       Discussion on spambayes-dev indicated that this would be a lot
!       of work for not much gain; however, since we now have some
!       time data stored, it wouldn't be too bad, so maybe it can go in.
      o Suggestions?
  """

! # This module is part of the spambayes project, which is Copyright 2002-5
  # The Python Software Foundation and is covered by the Python Software
  # Foundation license.
***************
*** 38,246 ****
      True, False = 1, 0

  import types

! from spambayes.message import database_type, open_storage

! class Stats(object):
!     class __empty_msg:
!         def __init__(self):
!             self.getDBKey = self.getId
!         def getId(self):
!             return self.id

!     def __init__(self):
!         self.CalculateStats()

      def Reset(self):
!         self.cls_spam = 0
!         self.cls_ham = 0
!         self.cls_unsure = 0
!         self.trn_spam = 0
!         self.trn_ham = 0
!         self.trn_unsure_ham = 0
!         self.trn_unsure_spam = 0
!         self.fp = 0
!         self.fn = 0
!         self.total = 0

!     def CalculateStats(self):
!         self.Reset()
!         nm, typ = database_type()
!         msginfoDB = open_storage(nm, typ)
!         for msg in msginfoDB.db.keys():
!             self.total += 1
!             m = self.__empty_msg()
!             m.id = msg
!             msginfoDB.load_msg(m)
!             if m.c == 's':
                  # Classified as spam.
!                 self.cls_spam += 1
!                 if m.t == False:
                      # False positive (classified as spam, trained as ham)
!                     self.fp += 1
!             elif m.c == 'h':
                  # Classified as ham.
!                 self.cls_ham += 1
!                 if m.t == True:
                      # False negative (classified as ham, trained as spam)
!                     self.fn += 1
!             elif m.c == 'u':
                  # Classified as unsure.
!                 self.cls_unsure += 1
!                 if m.t == False:
!                     self.trn_unsure_ham += 1
!                 elif m.t == True:
!                     self.trn_unsure_spam += 1
!             if m.t == True:
!                 self.trn_spam += 1
!             elif m.t == False:
!                 self.trn_ham += 1

!     def GetStats(self, use_html=True):
!         if self.total == 0:
!             return ["SpamBayes has processed zero messages"]
          chunks = []
          push = chunks.append
!         not_trn_unsure = self.cls_unsure - self.trn_unsure_ham - \
!                          self.trn_unsure_spam
!         if self.cls_unsure:
!             unsure_ham_perc = 100.0 * self.trn_unsure_ham / self.cls_unsure
!             unsure_spam_perc = 100.0 * self.trn_unsure_spam / self.cls_unsure
!             unsure_not_perc = 100.0 * not_trn_unsure / self.cls_unsure
!         else:
!             unsure_ham_perc = 0.0 # Not correct, really!
!             unsure_spam_perc = 0.0 # Not correct, really!
!             unsure_not_perc = 0.0 # Not correct, really!
!         if self.trn_ham:
!             trn_perc_unsure_ham = 100.0 * self.trn_unsure_ham / \
!                                   self.trn_ham
!             trn_perc_fp = 100.0 * self.fp / self.trn_ham
!             trn_perc_ham = 100.0 - (trn_perc_unsure_ham + trn_perc_fp)
!         else:
!             trn_perc_ham = 0.0 # Not correct, really!
!             trn_perc_unsure_ham = 0.0 # Not correct, really!
!             trn_perc_fp = 0.0 # Not correct, really!
!         if self.trn_spam:
!             trn_perc_unsure_spam = 100.0 * self.trn_unsure_spam / \
!                                    self.trn_spam
!             trn_perc_fn = 100.0 * self.fn / self.trn_spam
!             trn_perc_spam = 100.0 - (trn_perc_unsure_spam + trn_perc_fn)
          else:
!             trn_perc_spam = 0.0 # Not correct, really!
!             trn_perc_unsure_spam = 0.0 # Not correct, really!
!             trn_perc_fn = 0.0 # Not correct, really!
!         format_dict = {
!             'num_seen' : self.total,
!             'correct' : self.total - (self.cls_unsure + self.fp + self.fn),
!             'incorrect' : self.cls_unsure + self.fp + self.fn,
!             'unsure_ham_perc' : unsure_ham_perc,
!             'unsure_spam_perc' : unsure_spam_perc,
!             'unsure_not_perc' : unsure_not_perc,
!             'not_trn_unsure' : not_trn_unsure,
!             'trn_total' : (self.trn_ham + self.trn_spam + \
!                            self.trn_unsure_ham + self.trn_unsure_spam),
!             'trn_perc_ham' : trn_perc_ham,
!             'trn_perc_unsure_ham' : trn_perc_unsure_ham,
!             'trn_perc_fp' : trn_perc_fp,
!             'trn_perc_spam' : trn_perc_spam,
!             'trn_perc_unsure_spam' : trn_perc_unsure_spam,
!             'trn_perc_fn' : trn_perc_fn,
!             }
!         format_dict.update(self.__dict__)

!         # Add percentages of everything.
!         for key, val in format_dict.items():
!             perc_key = "perc_" + key
!             if self.total and isinstance(val, types.IntType):
!                 format_dict[perc_key] = 100.0 * val / self.total
!             else:
!                 format_dict[perc_key] = 0.0 # Not correct, really!

!         # Figure out plurals
!         for num, key in [("num_seen", "sp1"),
!                          ("correct", "sp2"),
!                          ("incorrect", "sp3"),
!                          ("fp", "sp4"),
!                          ("fn", "sp5"),
!                          ("trn_unsure_ham", "sp6"),
!                          ("trn_unsure_spam", "sp7"),
!                          ("not_trn_unsure", "sp8"),
!                          ("trn_total", "sp9"),
!                          ]:
!             if format_dict[num] == 1:
!                 format_dict[key] = ''
!             else:
!                 format_dict[key] = 's'
!         for num, key in [("correct", "wp1"),
!                          ("incorrect", "wp2"),
!                          ("not_trn_unsure", "wp3"),
!                          ]:
!             if format_dict[num] == 1:
!                 format_dict[key] = 'was'
!             else:
!                 format_dict[key] = 'were'
!         # Possibly use HTML for breaks/tabs.
          if use_html:
-             format_dict["br"] = "<br/>"
              format_dict["tab"] = "&nbsp;&nbsp;&nbsp;&nbsp;"
          else:
-             format_dict["br"] = "\r\n"
              format_dict["tab"] = "\t"

! ##        Our result should look something like this:
! ##        (devised by Mark Moraes and Kenny Pitt)
! ##
! ##        SpamBayes has classified a total of 1223 messages:
! ##            827 ham (67.6% of total)
! ##            333 spam (27.2% of total)
! ##            63 unsure (5.2% of total)
! ##
! ##        1125 messages were classified correctly (92.0% of total)
! ##        35 messages were classified incorrectly (2.9% of total)
! ##            0 false positives (0.0% of total)
! ##            35 false negatives (2.9% of total)
! ##
! ##        6 unsures trained as ham (9.5% of unsures)
! ##        56 unsures trained as spam (88.9% of unsures)
! ##        1 unsure was not trained (1.6% of unsures)
! ##
! ##        A total of 760 messages have been trained:
! ##            346 ham (98.3% ham, 1.7% unsure, 0.0% false positives)
! ##            414 spam (78.0% spam, 13.5% unsure, 8.5% false negatives)

-         push("SpamBayes has classified a total of " \
-              "%(num_seen)d message%(sp1)s:" \
-              "%(br)s%(tab)s%(cls_ham)d " \
-              "(%(perc_cls_ham).0f%% of total) good" \
-              "%(br)s%(tab)s%(cls_spam)d " \
-              "(%(perc_cls_spam).0f%% of total) spam" \
-              "%(br)s%(tab)s%(cls_unsure)d " \
-              "(%(perc_cls_unsure).0f%% of total) unsure." % \
-              format_dict)
-         push("%(correct)d message%(sp2)s %(wp1)s classified correctly " \
-              "(%(perc_correct).0f%% of total)" \
-              "%(br)s%(incorrect)d message%(sp3)s %(wp2)s classified " \
-              "incorrectly " \
-              "(%(perc_incorrect).0f%% of total)" \
-              "%(br)s%(tab)s%(fp)d false positive%(sp4)s " \
-              "(%(perc_fp).0f%% of total)" \
-              "%(br)s%(tab)s%(fn)d false negative%(sp5)s " \
-              "(%(perc_fn).0f%% of total)" % \
-              format_dict)
-         push("%(trn_unsure_ham)d unsure%(sp6)s trained as good " \
-              "(%(unsure_ham_perc).0f%% of unsures)" \
-              "%(br)s%(trn_unsure_spam)d unsure%(sp7)s trained as spam " \
-              "(%(unsure_spam_perc).0f%% of unsures)" \
-              "%(br)s%(not_trn_unsure)d unsure%(sp8)s %(wp3)s not trained " \
-              "(%(unsure_not_perc).0f%% of unsures)" % \
-              format_dict)
-         push("A total of %(trn_total)d message%(sp9)s have been trained:" \
-              "%(br)s%(tab)s%(trn_ham)d good " \
-              "(%(trn_perc_ham)0.f%% good, %(trn_perc_unsure_ham)0.f%% " \
-              "unsure, %(trn_perc_fp).0f%% false positives)" \
-              "%(br)s%(tab)s%(trn_spam)d spam " \
-              "(%(trn_perc_spam)0.f%% spam, %(trn_perc_unsure_spam)0.f%% " \
-              "unsure, %(trn_perc_fn).0f%% false negatives)" % \
-              format_dict)
          return chunks

--- 40,349 ----
      True, False = 1, 0

+ import time
  import types

! from spambayes.message import STATS_START_KEY
! from spambayes.message import database_type, open_storage, Message

! try:
!     _
! except NameError:
!     _ = lambda arg: arg

! class Stats(object):
!     def __init__(self, spam_threshold, unsure_threshold, messageinfo_db,
!                  ham_string, unsure_string, spam_string):
!         self.messageinfo_db = messageinfo_db
!         self.spam_threshold = spam_threshold
!         self.unsure_threshold = unsure_threshold
!         self.ham_string = ham_string
!         self.unsure_string = unsure_string
!         self.spam_string = spam_string
!         # Reset session stats.
!         self.Reset()
!         # Load persistent stats.
!         self.from_date = self.messageinfo_db.get_statistics_start_date()
!         self.CalculatePersistentStats()

      def Reset(self):
!         self.num_ham = self.num_spam = self.num_unsure = 0
!         self.num_trained_spam = self.num_trained_spam_fn  = 0
!         self.num_trained_ham = self.num_trained_ham_fp = 0

!     def ResetTotal(self, permanently=False):
!         self.totals = {}
!         for stat in ["num_ham", "num_spam", "num_unsure",
!                      "num_trained_spam", "num_trained_spam_fn",
!                      "num_trained_ham", "num_trained_ham_fp",]:
!             self.totals[stat] = 0
!         if permanently:
!             # Reset the date.
!             self.from_date = time.time()
!             self.messageinfo_db.set_statistics_start_date(self.from_date)
! 
!     def RecordClassification(self, score):
!         if score >= self.spam_threshold:
!             self.num_spam += 1
!         elif score >= self.unsure_threshold:
!             self.num_unsure += 1
!         else:
!             self.num_ham += 1
! 
!     def RecordTraining(self, as_ham, old_score):
!         if as_ham:
!             self.num_trained_ham += 1
!             # If we are recovering an item that is in the "spam" threshold,
!             # then record it as a "false positive"
!             if old_score > self.spam_threshold:
!                 self.num_trained_ham_fp += 1
!         else:
!             self.num_trained_spam += 1
!             # If we are deleting as Spam an item that was in our "good"
!             # range, then record it as a false negative.
!             if old_score < self.unsure_threshold:
!                 self.num_trained_spam_fn += 1
! 
!     def CalculatePersistentStats(self):
!         """Calculate the statistics totals (i.e. not this session).
! 
!         This is done by running through the messageinfo database and
!         adding up the various information.  This could get quite time
!         consuming if the messageinfo database gets very large, so
!         some consideration should perhaps be made about what to do
!         then.
!         """
!         self.ResetTotal()
!         totals = self.totals
!         for msg_id in self.messageinfo_db.db.keys():
!             # Skip the date key.
!             if msg_id == STATS_START_KEY:
!                 continue
!             m = Message(msg_id, self.messageinfo_db)
!             self.messageinfo_db.load_msg(m)
! 
!             # Skip ones that are too old.
!             if self.from_date and m.date_modified and \
!                m.date_modified > self.from_date:
!                 continue
! 
!             classification = m.GetClassification()
!             trained = m.GetTrained()
!             
!             if classification == self.spam_string:
                  # Classified as spam.
!                 totals["num_spam"] += 1
!                 if trained == False:
                      # False positive (classified as spam, trained as ham)
!                     totals["num_trained_ham_fp"] += 1
!             elif classification == self.ham_string:
                  # Classified as ham.
!                 totals["num_ham"] += 1
!                 if trained == True:
                      # False negative (classified as ham, trained as spam)
!                     totals["num_trained_spam_fn"] += 1
!             elif classification == self.unsure_string:
                  # Classified as unsure.
!                 totals["num_unsure"] += 1
!                 if trained == False:
!                     totals["num_trained_ham"] += 1
!                 elif trained == True:
!                     totals["num_trained_spam"] += 1

!     def _CombineSessionAndTotal(self):
!         totals = self.totals
!         num_seen = self.num_ham + self.num_spam + self.num_unsure + \
!                    totals["num_ham"] + totals["num_spam"] + \
!                    totals["num_unsure"]
!         num_ham = self.num_ham + totals["num_ham"]
!         num_spam = self.num_spam + totals["num_spam"]
!         num_unsure = self.num_unsure + totals["num_unsure"]
!         num_trained_ham = self.num_trained_ham + totals["num_trained_ham"]
!         num_trained_ham_fp = self.num_trained_ham_fp + \
!                              totals["num_trained_ham_fp"]
!         num_trained_spam = self.num_trained_spam + \
!                            totals["num_trained_spam"]
!         num_trained_spam_fn = self.num_trained_spam_fn + \
!                               totals["num_trained_spam_fn"]
!         return locals()
! 
!     def _CalculateAdditional(self, data):
!         data["perc_ham"] = 100.0 * data["num_ham"] / data["num_seen"]
!         data["perc_spam"] = 100.0 * data["num_spam"] / data["num_seen"]
!         data["perc_unsure"] = 100.0 * data["num_unsure"] / data["num_seen"]
!         data["num_ham_correct"] = data["num_ham"] - \
!                                   data["num_trained_spam_fn"]
!         data["num_spam_correct"] = data["num_spam"] - \
!                                    data["num_trained_ham_fp"]
!         data["num_correct"] = data["num_ham_correct"] + \
!                               data["num_spam_correct"]
!         data["num_incorrect"] = data["num_trained_spam_fn"] + \
!                                 data["num_trained_ham_fp"]
!         data["perc_correct"] = 100.0 * data["num_correct"] / \
!                                data["num_seen"]
!         data["perc_incorrect"] = 100.0 * data["num_incorrect"] / \
!                                  data["num_seen"]
!         data["perc_fp"] = 100.0 * data["num_trained_ham_fp"] / \
!                           data["num_seen"]
!         data["perc_fn"] = 100.0 * data["num_trained_spam_fn"] / \
!                           data["num_seen"]
!         data["num_unsure_trained_ham"] = data["num_trained_ham"] - \
!                                          data["num_trained_ham_fp"]
!         data["num_unsure_trained_spam"] = data["num_trained_spam"] - \
!                                           data["num_trained_spam_fn"]
!         data["num_unsure_not_trained"] = data["num_unsure"] - \
!                                          data["num_unsure_trained_ham"] - \
!                                          data["num_unsure_trained_spam"]
!         if data["num_unsure"]:
!             data["perc_unsure_trained_ham"] = 100.0 * \
!                                               data["num_unsure_trained_ham"] / \
!                                               data["num_unsure"]
!             data["perc_unsure_trained_spam"] = 100.0 * \
!                                                data["num_unsure_trained_spam"] / \
!                                                data["num_unsure"]
!             data["perc_unsure_not_trained"] = 100.0 * \
!                                               data["num_unsure_not_trained"] / \
!                                               data["num_unsure"]
!         data["total_ham"] = data["num_ham_correct"] + \
!                             data["num_trained_ham"]
!         data["total_spam"] = data["num_spam_correct"] + \
!                              data["num_trained_spam"]
!         if data["total_ham"]:
!             data["perc_ham_incorrect"] = 100.0 * \
!                                          data["num_trained_ham_fp"] / \
!                                          data["total_ham"]
!             data["perc_ham_unsure"] = 100.0 * \
!                                       data["num_unsure_trained_ham"] / \
!                                       data["total_ham"]
!             data["perc_ham_incorrect_or_unsure"] = \
!                 100.0 * (data["num_trained_ham_fp"] +
!                          data["num_unsure_trained_ham"]) / \
!                          data["total_ham"]
!         if data["total_spam"]:
!             data["perc_spam_correct"] = 100.0 * data["num_spam_correct"] / \
!                                         data["total_spam"]
!             data["perc_spam_unsure"] = 100.0 * \
!                                        data["num_unsure_trained_spam"] / \
!                                        data["total_spam"]
!             data["perc_spam_correct_or_unsure"] = \
!                 100.0 * (data["num_spam_correct"] + \
!                          data["num_unsure_trained_spam"]) / \
!                          data["total_spam"]
!         return data
! 
!     def _AddPercentStrings(self, data, dp):
!         data["perc_ham_s"] = "%%(perc_ham).%df%%(perc)s" % (dp,)
!         data["perc_spam_s"] = "%%(perc_spam).%df%%(perc)s" % (dp,)
!         data["perc_unsure_s"] = "%%(perc_unsure).%df%%(perc)s" % (dp,)
!         data["perc_correct_s"] = "%%(perc_correct).%df%%(perc)s" % (dp,)
!         data["perc_incorrect_s"] = "%%(perc_incorrect).%df%%(perc)s" % (dp,)
!         data["perc_fp_s"] = "%%(perc_fp).%df%%(perc)s" % (dp,)
!         data["perc_fn_s"] = "%%(perc_fn).%df%%(perc)s" % (dp,)
!         data["perc_spam_correct_s"] = "%%(perc_spam_correct).%df%%(perc)s" \
!                                       % (dp,)
!         data["perc_spam_unsure_s"] = "%%(perc_spam_unsure).%df%%(perc)s" \
!                                      % (dp,)
!         data["perc_spam_correct_or_unsure_s"] = \
!               "%%(perc_spam_correct_or_unsure).%df%%(perc)s" % (dp,)
!         data["perc_ham_incorrect_s"] = "%%(perc_ham_incorrect).%df%%(perc)s" \
!                                        % (dp,)
!         data["perc_ham_unsure_s"] = "%%(perc_ham_unsure).%df%%(perc)s" \
!                                     % (dp,)
!         data["perc_ham_incorrect_or_unsure_s"] = \
!               "%%(perc_ham_incorrect_or_unsure).%df%%(perc)s" % (dp,)
!         data["perc_unsure_trained_ham_s"] = \
!               "%%(perc_unsure_trained_ham).%df%%(perc)s" % (dp,)
!         data["perc_unsure_trained_spam_s"] = "%%(perc_unsure_trained_spam).%df%%(perc)s" \
!                                              % (dp,)
!         data["perc_unsure_not_trained_s"] = "%%(perc_unsure_not_trained).%df%%(perc)s" \
!                                             % (dp,)
!         data["perc"] = "%"
!         return data
! 
!     def GetStats(self, use_html=False, session_only=False, decimal_points=1):
!         """Return a description of the statistics.
! 
!         If session_only is True, then only a description of the statistics
!         since we were last reset.  Otherwise, lifetime statistics (i.e.
!         those including the ones loaded).
! 
!         Users probably care most about persistent statistics, so present
!         those by default.  If session-only stats are desired, then a
!         special call to here can be made.
! 
!         The percentages will be accurate to the given number of decimal
!         points.
! 
!         If use_html is True, then the returned data is marked up with
!         appropriate HTML, otherwise it is plain text.
!         """
          chunks = []
          push = chunks.append
! 
!         if session_only:
!             data = {}
!             data["num_seen"] = self.num_ham + self.num_spam + \
!                                self.num_unsure
!             data["num_ham"] = self.num_ham
!             data["num_spam"] = self.num_spam
!             data["num_unsure"] = self.num_unsure
!             data["num_trained_ham"] = self.num_trained_ham
!             data["num_trained_ham_fp"] = self.num_trained_ham_fp
!             data["num_trained_spam"] = self.num_trained_spam
!             data["num_trained_spam_fn"] = self.num_trained_spam_fn
          else:
!             data = self._CombineSessionAndTotal()

!         push(_("Messages classified: %d" % (data["num_seen"],)))
!         if data["num_seen"] == 0:
!             return chunks

!         data = self._CalculateAdditional(data)
!         format_dict = self._AddPercentStrings(data, decimal_points)
!         
!         # Possibly use HTML for tabs.
          if use_html:
              format_dict["tab"] = "&nbsp;&nbsp;&nbsp;&nbsp;"
          else:
              format_dict["tab"] = "\t"

!         push((_("%(tab)sGood:%(tab)s%(num_ham)d (%(perc_ham_s)s)") \
!              % format_dict) % format_dict)
!         push((_("%(tab)sSpam:%(tab)s%(num_spam)d (%(perc_spam_s)s)") \
!              % format_dict) % format_dict)
!         push((_("%(tab)sUnsure:%(tab)s%(num_unsure)d (%(perc_unsure_s)s)") \
!              % format_dict) % format_dict)
!         push("")
! 
!         push((_("Classified correctly:%(tab)s%(num_correct)d (%(perc_correct_s)s of total)") \
!              % format_dict) % format_dict)
!         push((_("Classified incorrectly:%(tab)s%(num_incorrect)d (%(perc_incorrect_s)s of total)") \
!              % format_dict) % format_dict)
!         if format_dict["num_incorrect"]:
!             push((_("%(tab)sFalse positives:%(tab)s%(num_trained_ham_fp)d (%(perc_fp_s)s of total)") \
!                  % format_dict) % format_dict)
!             push((_("%(tab)sFalse negatives:%(tab)s%(num_trained_spam_fn)d (%(perc_fn_s)s of total)") \
!                  % format_dict) % format_dict)
!         push("")
!         
!         push(_("Manually classified as good:%(tab)s%(num_trained_ham)d") % format_dict)
!         push(_("Manually classified as spam:%(tab)s%(num_trained_spam)d") % format_dict)
!         push("")
! 
!         if format_dict["num_unsure"]:
!             push((_("Unsures trained as good:%(tab)s%(num_unsure_trained_ham)d (%(perc_unsure_trained_ham_s)s of unsures)") \
!                  % format_dict) % format_dict)
!             push((_("Unsures trained as spam:%(tab)s%(num_unsure_trained_spam)d (%(perc_unsure_trained_spam_s)s of unsures)") \
!                  % format_dict) % format_dict)
!             push((_("Unsures not trained:%(tab)s%(tab)s%(num_unsure_not_trained)d (%(perc_unsure_not_trained_s)s of unsures)") \
!                  % format_dict) % format_dict)
!             push("")
! 
!         if format_dict["total_spam"]:
!             push((_("Spam correctly identified:%(tab)s%(perc_spam_correct_s)s (+ %(perc_spam_unsure_s)s unsure)") \
!                  % format_dict) % format_dict)
!         if format_dict["total_ham"]:
!             push((_("Good incorrectly identified:%(tab)s%(perc_ham_incorrect_s)s (+ %(perc_ham_unsure_s)s unsure)") \
!                  % format_dict) % format_dict)

          return chunks