[Spambayes] new python error in sbfilter.py

Fred Smith fredex at fcshome.stoneham.ma.us
Thu Mar 9 22:34:50 EST 2017


On Thu, Mar 09, 2017 at 01:57:27PM -0500, Fred Smith wrote:
later....
see below
> On Thu, Mar 09, 2017 at 07:23:38AM -0600, Skip Montanaro wrote:
> >    Fred,
> >    It looks like your training database is corrupt. At the very end of the
> >    long traceback, the message indicates that the count of messages (ham
> >    or spam) in which a particular word appears is greater than the number
> >    of messages in that particular category. I think you should be able to
> >    just retrain from scratch on your existing database.
> >    Skip
> 
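A minimal sketch of the consistency check Skip describes, using a toy in-memory model. The names ToyCounts, train_ham, and inconsistent_tokens are hypothetical and are not the SpamBayes API; the real check is the assertion shown in the traceback further down.

```python
from collections import Counter


class ToyCounts:
    """Toy stand-in for a training database; not the SpamBayes classes."""

    def __init__(self):
        self.nham = 0              # number of ham messages trained
        self.hamcount = Counter()  # per token: ham messages containing it

    def train_ham(self, tokens):
        # Each message counts once; each distinct token counts once per message.
        self.nham += 1
        for tok in set(tokens):
            self.hamcount[tok] += 1

    def inconsistent_tokens(self):
        # Tokens recorded in more ham messages than were ever trained.
        return sorted(t for t, c in self.hamcount.items() if c > self.nham)


db = ToyCounts()
db.train_ham(["hello", "world"])
db.train_ham(["hello", "again"])
assert db.inconsistent_tokens() == []

# Simulate corruption: the message total is lost while token counts survive.
db.nham = 1
print(db.inconsistent_tokens())  # ['hello']
```

In this toy model a token count can only exceed the message total if the two counters get out of step, which is the condition the assertion reports.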
> Sigh.
> 
> That worked, for a little while. Then it started doing it again.
> 
> I've recently started using these macros in mutt:
> 
> 	macro index S "|sb_filter.py -s -f | procmail\&\nd"
> 	macro pager S "|sb_filter.py -s -f | procmail\&\nd"
> 	macro index H "|sb_filter.py -g -f | procmail\&\nd"
> 	macro pager H "|sb_filter.py -g -f | procmail\&\nd"
> 
> and in procmail there are these rules:
> 
> 	:0 fw:hamlock
> 	| /usr/bin/sb_filter.py -f -d $HOME/.hammiedb

Ah HA! BINGO!
That's the problem right there: the macros (above) train on the mail and
then hand it to procmail. Procmail then trains on it AGAIN, so every mail
trained that way is entered in the database twice.

Those macros are a really HANDY way to correct a wrongly classified mail
while putting it in the right folder. Can anyone think of a way to avoid
the double training?

thanks in advance!
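
One possible direction, as a rough sketch only: remember the Message-ID of every mail that has already been trained, and skip training when the same mail comes through a second time. The helper names and the in-memory set below are hypothetical; hooking this into sb_filter.py, mutt, or procmail is not shown, and a real setup would need to keep the set on disk between runs.

```python
import email


def message_id(raw_text):
    """Return the Message-ID header of a raw message, or None if absent."""
    msg = email.message_from_string(raw_text)
    mid = msg.get("Message-ID")
    return mid.strip() if mid else None


def should_train(raw_text, seen_ids):
    """Return True the first time a Message-ID is seen, False afterwards.

    Messages without a Message-ID are always trained (an assumption).
    """
    mid = message_id(raw_text)
    if mid is None:
        return True
    if mid in seen_ids:
        return False
    seen_ids.add(mid)
    return True


raw = "Message-ID: <abc@example.invalid>\nSubject: test\n\nbody\n"
seen = set()
print(should_train(raw, seen))  # True: first pass (e.g. the mutt macro)
print(should_train(raw, seen))  # False: second pass (e.g. procmail)
```

Keying on Message-ID is a design choice for the sketch; mails without that header fall through and would still be trained each time.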

> 	# then filter out spam and unsure stuff....
> 	:0
> 	* ^X-Spambayes-Classification: spam
> 	$HOME/Mail/trained.spam
> 
> 	:0
> 	* ^X-Spambayes-Classification: unsure
> 	$HOME/Mail/unsure
> 
> I don't see why those macros would cause such a problem, but it
> has started only since I started using them (of course, I also blew
> away the ancient hammie db and started over with a small corpus of
> known ham and spam, at the same time).
> 
> Prior to that I would just save mis-filed mails in either trained.spam
> or trained.ham and trust that the nightly retraining would do the right
> thing.
> 
> any further ideas?
> 
> thanks in advance!
> 
> Fred
> 
> > 
> >    On Mar 8, 2017 7:11 PM, "Fred Smith" <[1]fredex at fcshome.stoneham.ma.us>
> >    wrote:
> > 
> >      Hi
> >      All of a sudden this past week I'm getting this whenever a message is
> >      sent to sb_filter to be retrained:
> >        File "/usr/bin/sb_filter.py", line 5, in <module>
> >          pkg_resources.run_script('spambayes==1.1a6', 'sb_filter.py')
> >        File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
> >          self.require(requires)[0].run_script(script_name, ns)
> >        File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1455, in run_script
> >          execfile(script_filename, namespace, namespace)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 277, in <module>
> >          main()
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 268, in main
> >          action(msg)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 186, in filter
> >          return self.h.filter(msg)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 149, in filter
> >          debug, train)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 104, in score_and_filter
> >          prob, clues = self._scoremsg(msg, True)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 33, in _scoremsg
> >          return self.bayes.spamprob(tokenize(msg), evidence)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 169, in chi2_spamprob
> >          clues = self._getclues(wordstream)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 472, in _getclues
> >          tup = self._worddistanceget(word)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 487, in _worddistanceget
> >          prob = self.probability(record)
> >        File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 287, in probability
> >          assert hamcount <= nham, "Token seen in more ham than ham trained."
> >      AssertionError: Token seen in more ham than ham trained.
> >      It is possible I got a python update, but I wasn't paying attention, so
> >      I'm not at all sure.
> >      I'm NOT a python guru, so I'd appreciate any guidance any of you can
> >      provide.
> >      thanks in advance!
> >      Fred
> >      --
> >      ---- Fred Smith -- [2]fredex at fcshome.stoneham.ma.us
> >      -----------------------------
> >                          The Lord detests the way of the wicked
> >                        but he loves those who pursue righteousness.
> >      ----------------------------- Proverbs 15:9 (niv)
> >      -----------------------------
> >      _______________________________________________
> >      [3]SpamBayes at python.org
> >      [4]https://mail.python.org/mailman/listinfo/spambayes
> >      Info/Unsubscribe: [5]http://mail.python.org/mailman/listinfo/spambayes
> >      Check the FAQ before asking: [6]http://spambayes.sf.net/faq.html
> > 
> > References
> > 
> >    1. mailto:fredex at fcshome.stoneham.ma.us
> >    2. mailto:fredex at fcshome.stoneham.ma.us
> >    3. mailto:SpamBayes at python.org
> >    4. https://mail.python.org/mailman/listinfo/spambayes
> >    5. http://mail.python.org/mailman/listinfo/spambayes
> >    6. http://spambayes.sf.net/faq.html
> 
> -- 
> ---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
>                         The Lord is like a strong tower. 
>              Those who do what is right can run to him for safety.
> --------------------------- Proverbs 18:10 (niv) -----------------------------

-- 
-------------------------------------------------------------------------------
 .----    Fred Smith   /              
( /__  ,__.   __   __ /  __   : /     
 /    /  /   /__) /  /  /__) .+'           Home: fredex at fcshome.stoneham.ma.us 
/    /  (__ (___ (__(_ (___ / :__                                 781-438-5471 
-------------------------------- Jude 1:24,25 ---------------------------------

