[Spambayes-checkins] spambayes CHANGELOG.txt, 1.28, 1.29 WHAT_IS_NEW.txt, 1.21, 1.22

Tony Meyer anadelonbrin at users.sourceforge.net
Sun Dec 28 20:46:45 EST 2003


Update of /cvsroot/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv18519

Modified Files:
	CHANGELOG.txt WHAT_IS_NEW.txt 
Log Message:
Start bringing the changelog and what's new file up to date.  Many gaps still!


Index: CHANGELOG.txt
===================================================================
RCS file: /cvsroot/spambayes/spambayes/CHANGELOG.txt,v
retrieving revision 1.28
retrieving revision 1.29
diff -C2 -d -r1.28 -r1.29
*** CHANGELOG.txt	5 Nov 2003 13:06:56 -0000	1.28
--- CHANGELOG.txt	29 Dec 2003 01:46:43 -0000	1.29
***************
*** 1,6 ****
  [Note that all dates are in English, not American format - i.e. day/month/year]
  
! Release 1.1a1
! =============
  Anthony Baxter    05/11/2003  Spell-checked all the HTML and txt files <wink>
  Tony Meyer        30/10/2003  Implement [ 827138 ] Can't display clues/tokens/source for a trained message
--- 1,61 ----
  [Note that all dates are in English, not American format - i.e. day/month/year]
  
! Alpha Release 8
! ===============
! Tim Peters        29/12/2003  Many improvements to the mksets.py testtools script.
! Tim Peters        28/12/2003  Outlook: export.py - the -n option now gives the number of Set subdirectories desired, instead of a number of msgs per Set subdir "to shoot for".
! Tim Peters        28/12/2003  Added a new -t option to rebal.py, may have broken -s and -r options.
! Tim Peters        26/12/2003  Many improvements to the rebal.py testtools script.
! Tim Peters        26/12/2003  Many improvements to the export.py script for Outlook.
! Skip Montanaro    24/12/2003  storage: make state key a manifest constant
! Mark Hammond      23/12/2003  Tray app: Binary version failed to check for most recent version.
! Skip Montanaro    23/12/2003  Sendmail annotates the Received: header with "(may be forged)" if it thinks the sender is forging its identity.  Generate a token for this, if we are mining received headers.
! Tony Meyer        22/12/2003  Move OE specific stuff out from UserInterface.py to oe_mailbox.py.
! Mark Hammond      22/12/2003  Outlook: Default to background filtering being on for new versions.
! Mark Hammond      21/12/2003  Outlook: When doing a "batch train" (eg, selecting multiple messages and saying "Delete as" or "Recover from"), the DB was saved in between each and every message.  Now only saved at the end (which was always the intent)
! Mark Hammond      21/12/2003  Outlook: As part of checking that our configuration is valid, make sure the user hasn't set us up such that either of the Spam/Unsure folders is also being watched for new messages
! Skip Montanaro    20/12/2003  Tokenizer: Solved the "backwards breakdown" problem with ip addresses in Received: headers.
! Skip Montanaro    20/12/2003  Tokenizer: Tightened up recognition of hostnames and accepted bracketed or parenthesized ip addresses without requiring a leading space.
! Mark Hammond      19/12/2003  Outlook: Remove handling of E_OBJECT_CHANGED exception, as it simply did not work.
! Mark Hammond      19/12/2003  Fix [ 803798 ] MAPI_E_OBJECT_CHANGED error saving spam score, which is a dupe of [787676], which was incorrectly marked as fixed
! Mark Hammond      19/12/2003  Outlook: Don't record in the training database unless we are successful in the filter - otherwise future attempts to filter will get all screwed up, as it will think it already was
! Mark Hammond      19/12/2003  Outlook: Move some of our init code from OnConnection to OnStartupComplete
! Tony Meyer        18/12/2003  Bring pspam into the modern SpamBayes world.
! Tony Meyer        17/12/2003  Add the basis of a new experimental (and highly debatable) option to 'slurp' URLs.
! Tim Peters        17/12/2003  Implemented the intended "tiling" version of x-use_bigrams.
! Tony Meyer        16/12/2003  Option names are always case insensitive, no matter what.
! Tony Meyer        16/12/2003  Fix a bug in the web interface where the probability would be incorrectly calculated on 'show clues'.
! Tony Meyer        16/12/2003  New experimental option: x-use_bigrams.
! Skip Montanaro    16/12/2003  mboxutils: This change generalizes the DirOfTxtFileMailbox class to assume all non-directory files contain a single message and to recursively descend into subdirectories of the argument directory.
! Tony Meyer        15/12/2003  Add a warning as a temporary solution for Python bug #845560.
! Tony Meyer        15/12/2003  Add the missing code for the Habeas headers tokenizing (and deprecate).
! Mark Hammond      15/12/2003  Fix [ 833439 ] default_bayes_customize.ini is confusing.
! Tim Peters        14/12/2003  Removed support code for the defunct experimental_ham_spam_imbalance_adjustment option
! Mark Hammond      14/12/2003  Fix [ 856628 ] reload(Options) fails in windows binary
! Mark Hammond      14/12/2003  Fix [ 859215 ] "Restore Defaults" causes assertion error at exit.
! Tony Meyer        14/12/2003  ImapUI: When logging in was done by the UI (to show available folders) we assigned the imap_session object to the wrong name
! Mark Hammond      10/12/2003  Outlook: Try and add the Spam field to the 'Unsure' folder in the same way we do for the Spam and watch folders.
! Tony Meyer        04/12/2003  Tray app: Change the default (double-click) behaviour of the tray to "review messages" rather than "display information".
! Tony Meyer        04/12/2003  Tray app: use SetDefaultItem (so the default action is in bold in the menu).
! Mark Hammond      02/12/2003  sb_server was ignoring command-line options; fix.
! Richie Hindle     27/11/2003  Sjoerd's improved version of patch 831388.
! Neale Pickett     27/11/2003  sb_filter now prints each message only once, not once per argument :)
! Tony Meyer        26/11/2003  sb_dbexpimp.py: Import/Export data as utf-8.
! Richie Hindle     26/11/2003  UserInterface.py: More robust code for parsing score headers - copes with the presence of logarithms.
! Richie Hindle     26/11/2003  UserInterface.py: More robust code for parsing evidence headers.  Copes with ';' and ': ' being part of a clue.
! Richie Hindle     26/11/2003  Patch [ 831388 ]: Make message.py respect the header_score_digits option.
! Richie Hindle     26/11/2003  Made sb_filter obey the notate_to and notate_subject options.
! Tony Meyer        26/11/2003  As we now use whichdb to figure out what type of file the db is, if we were using windows and python 2.2 we would try and use dbhash instead of db3hash, which is a Bad Thing. Fix this.
! Tony Meyer        26/11/2003  message.py: Encode words in the evidence header as utf-8 if they are unicode.
! Barry A. Warsaw   25/11/2003  sb_xmlrpcserver.py: Make sure that the socket being bound is reusable.
! Barry A. Warsaw   25/11/2003  Change XMLHammie.score() so that the float score is returned directly instead of trying to be wrapped in a Binary object.
! Barry A. Warsaw   25/11/2003  New script: sb_evoscore.py - A shim script between sb_xmlrpcserver.py and Ximian Evolution.
! Skip Montanaro    25/11/2003  Added a makefile to the testtools directory to make using timcv easier.
! Richie Hindle     16/11/2003  Patch [ 842464 ] Correct installation instructions from "setup.py install" to "python setup.py install"
! Skip Montanaro    13/11/2003  sb_filter: add -o/--option command line arg that allows user to set any options value from the command line
! Skip Montanaro    13/11/2003  OptionsClass: Add set_from_cmdline()
! Skip Montanaro    13/11/2003  sb_filter: Allow multiple types of mailboxes to be processed using mboxutils.getmbox. If any mailbox files are given on the command line, the output is always a Unix-style mailbox containing From_ lines.
! Richie Hindle     11/11/2003  notesfilter: The header_x_string options now live in the Headers section, not the Hammie section.
! Richie Hindle     07/11/2003  Fixed an infinite loop when you break the browser connection to sb_server when sb_server is busy training.
  Anthony Baxter    05/11/2003  Spell-checked all the HTML and txt files <wink>
  Tony Meyer        30/10/2003  Implement [ 827138 ] Can't display clues/tokens/source for a trained message
***************
*** 36,42 ****
  Tony Meyer        20/09/2003  Add an advanced word query to the web UI.
  Tony Meyer        20/09/2003  Make the review messages page on the web UI more customizable.
- 
- 1.0 Releases
- ************
  
  Alpha Release 7
--- 91,94 ----

Index: WHAT_IS_NEW.txt
===================================================================
RCS file: /cvsroot/spambayes/spambayes/WHAT_IS_NEW.txt,v
retrieving revision 1.21
retrieving revision 1.22
diff -C2 -d -r1.21 -r1.22
*** WHAT_IS_NEW.txt	5 Dec 2003 04:45:44 -0000	1.21
--- WHAT_IS_NEW.txt	29 Dec 2003 01:46:43 -0000	1.22
***************
*** 10,14 ****
  noted in the "Transition" section.
  
! New in Alpha Release 7
  ======================
  
--- 10,14 ----
  noted in the "Transition" section.
  
! New in Alpha Release 8
  ======================
  
***************
*** 17,137 ****
  --------------------------
  
!  o If you are using a pickle for storage, your 'message info' database
!    would previously still have been a dbm (where available).  This is
!    no longer the case - if you are using a pickle for the statistics
!    database, you have a pickle for everything.  Your old 'message info'
!    database is not converted (and there is no utility provided to do so).
!    If you are using a pickle for storage, you should delete your old
!    'spambayes.messageinfo.db' file before restarting after the upgrade.
!    You should not suffer any ill effects from this, *unless* you are
!    using sb_imapfilter.py.  In that case, you will find that the filter
!    trains and classifies all messages in the folders it examines, even
!    if it has seen them before - this will only occur once, however.
! 
! There should be no other incompatible changes (from 1.0a6) in this release.
! 
!  o The scripts have all moved (in the archive), and their names have been
!    changed.  If you run "setup.py install", it will offer to remove the old
!    ones for you, which we recommend.  In the archive, the scripts are all
!    in a 'scripts' directory, and all the scripts start with the "sb_"
!    prefix, to avoid clashing with similarly named scripts from other
!    packages.  Some name changes go further - "pop3proxy" is now named
!    "sb_server", "hammiefilter" is now named "sb_filter", "hammiecli" is now
!    named "sb_client", "hammiesrv" is now named "sb_xmlrpcserver", "proxytee"
!    is now named "sb_upload", and the experimental "overkill" script is now
!    named "sb_pop3dnd".
! 
! If you were previously using the "hammie.py" script, you will notice that
! it is no longer available.  We recommend that you use either "sb_filter"
! (probably with "sb_mboxtrain"), or use "sb_server" and "sb_upload".  If you
! wish to continue as you were, you can use the "hammie.py" module, which
! will be installed in the "spambayes" package directory, in the same way you
! used the old "hammie.py" script.
! 
!  o All the backwards compatibility code for options which changed names has
!    been removed, which means that you *must* use the correct (new) names.
!    A script (sb_chkopts) is provided which, if you run it, will inform you
!    if you have any invalid names (it will not output anything if there are
!    no problems).
  
! In addition, the values taken by some options have changed, so if you're
! upgrading from a previous version, you may need to update your configuration
! file (.spambayesrc or bayescustomize.ini)
  
-  o The options to put the classification in the subject or recipient list
-    (notate_to and notate_subject) have moved from the "pop3proxy" section
-    to the "Headers" section.
-  o All the "pop3proxy" storage options (where the cache is stored, the
-    number of days before messages expire, and so on) have moved to the
-    "Storage" section.
-  o The "hammie" debug header options have been removed, and you should use
-    the "Headers" evidence header options instead.
  
! Note that pop3proxy (sb_server) and imapfilter users can simply use the web
! interface to check their options and correct any that are wrong.  All
! incorrectly named options in the configuration file will be removed.
  
  Outlook Plugin
  --------------
!  o Change the default for the ham/spam imbalance adjustment option to
!    False - this should make misclassifications for those with large
!    imbalances easier to understand.  Note that we recommend roughly equal
!    numbers of ham and spam are trained.
!  o Add a warning for those with highly imbalanced ham and spam.
!  o Improved the 'Show Clues' results page.
!  o When we fail to add the 'Spam' field to a read-only store (eg, hotmail),
!    complain less loudly.
  
  POP3 Proxy / SMTP Proxy
  -----------------------
!  o An error where a failure message would be printed by
!    the SMTP proxy, even on success, was fixed.
  
  Web Interface
  -------------
!  o The bug which caused the "TypeError" when trying to access
!    the database after setting a configuration option via the
!    interface has been fixed.
  
  POP3 Proxy Service / POP3 Proxy Tray Application
  ------------------------------------------------
  
!  o Both the pop3proxy_service.py and pop3proxy_tray.py
!    scripts are now installed (with "setup.py install") if
!    the user is using Windows.
  
  IMAP Filter
  -----------
!  o Correctly handle IMAP servers that (wrongly) fail to put folder names
!    in quotation marks
!  o Count all messages being classified instead of just the ones from the
!    last folder.
!  o Handle a folder name as a literal when presenting a list to choose from.
!  o Handle IMAP servers that do not pass a blank result line for an empty
!    search.
!  o Fix IMAP over SSL.
  
  General
  -------
!  o Various improvements have been made to the management of the
!    'message info' database.  As outlined above, it will now be
!    stored as a pickle, if your statistics database uses a pickle.
!    In addition, we attempt to close the database when we should,
!    and make sure that we explicitly update it.  This should hopefully
!    go some way to solving the "DB_RUN_RECOVERY" errors that have
!    been regularly reported - we would be interested to hear from
!    you if upgrading to 1.0a7 does appear to solve this problem
!    for you (email spambayes at python.org).
!  o We now try to determine the type of dbm storage used from the
!    file, if one already exists.  This should make the transition
!    between formats a little easier.
!  o Fix sb_xmlrpcserver to work with the renamed (since 1.0a5)
!    scripts.
!  o Fix the sense of include_trained in sb_mboxtrain.
!    
  
  Transition
  ==========
! If you are transitioning from a version older than 1.0a6, please also
  read the notes in the previous release notes (accessible from
  <http://sourceforge.net/project/showfiles.php?group_id=61702>).
--- 17,57 ----
  --------------------------
  
!  o XXX
  
! There should be no other incompatible changes (from 1.0a7) in this release.
  
  
! -------------------
! ** Other changes **
! -------------------
  
  Outlook Plugin
  --------------
!  o 
  
  POP3 Proxy / SMTP Proxy
  -----------------------
!  o 
  
  Web Interface
  -------------
!  o 
  
  POP3 Proxy Service / POP3 Proxy Tray Application
  ------------------------------------------------
  
!  o 
  
  IMAP Filter
  -----------
!  o 
  
  General
  -------
!  o 
  
  Transition
  ==========
! If you are transitioning from a version older than 1.0a7, please also
  read the notes in the previous release notes (accessible from
  <http://sourceforge.net/project/showfiles.php?group_id=61702>).
***************
*** 144,148 ****
  ===================
  The following bugs tracked via the Sourceforge system were fixed:
! 809769, 814322, 816400, 810342, 818552
  
  A URL containing the details of these bugs can be made by appending the
--- 64,68 ----
  ===================
  The following bugs tracked via the Sourceforge system were fixed:
! 
  
  A URL containing the details of these bugs can be made by appending the
***************
*** 161,162 ****
--- 81,110 ----
  No patches tracked via the Sourceforge system were integrated for this
  release.
+ 
+ 
+ Deprecated Options
+ ==================
+ 
+ [add explanation of deprecated options here]
+ 
+ The following options have been deprecated in this release:
+   o [Tokenizer] generate_time_buckets
+   o [Tokenizer] extract_dow
+   o [Classifier] experimental_ham_spam_imbalance_adjustment
+ 
+ 
+ New Experimental Options
+ ========================
+ 
+ [add explanation of experimental options and a pointer to the testing
+ setup here]
+ 
+ The following experimental options have been added in this release:
+   o [Tokenizer] search_for_habeas_headers
+   o [Tokenizer] reduce_habeas_headers
+   o [Classifier] use_bigrams
+   o [URLRetriever] slurp_urls
+   o [URLRetriever] cache_expiry_days
+   o [URLRetriever] cache_directory
+   o [URLRetriever] only_slurp_base
+   o [URLRetriever] web_prefix
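
The CHANGELOG.txt note above says that entry dates are day/month/year, so
05/11/2003 means 5 November 2003, not May 11.  A minimal sketch of reading
one entry with that convention follows.  The names ENTRY_RE and parse_entry
are hypothetical and not part of SpamBayes, and the layout the regular
expression expects (author, two or more spaces, date, description) is an
assumption based only on the entries visible in this diff.  It is meant for
lines of CHANGELOG.txt itself, without the diff markers.

```python
import re
from datetime import datetime

# Assumed layout: "<author><2+ spaces><dd/mm/yyyy><spaces><description>".
ENTRY_RE = re.compile(
    r"^(?P<author>\S.*?)\s{2,}(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<text>.*)$"
)


def parse_entry(line):
    """Return (author, date, text) for a changelog entry, or None."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        # Headings, underlines and blank lines are not entries.
        return None
    # %d/%m/%Y reads the day first, as the changelog note specifies.
    when = datetime.strptime(match.group("date"), "%d/%m/%Y").date()
    return match.group("author"), when, match.group("text")


if __name__ == "__main__":
    sample = ("Anthony Baxter    05/11/2003  "
              "Spell-checked all the HTML and txt files <wink>")
    print(parse_entry(sample))
```

Heading lines such as "Alpha Release 8" do not match the pattern, so the
function returns None for them.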




