[Spambayes] new python error in sbfilter.py
Skip Montanaro
skip.montanaro at gmail.com
Fri Mar 10 18:02:16 EST 2017
I would avoid training on every message in your procmailrc file, and only
use the mutt macros to train on misses and unsures. I would only use a
procmail recipe to score incoming messages.
Skip
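
For anyone reading the traceback quoted below: the assertion that fails
there (hamcount <= nham) says that a token cannot have been seen in more
ham messages than were ever trained as ham. Here is a minimal Python sketch
of that invariant. It is a toy, not the SpamBayes code: the ToyCounts class
and its methods are made up for illustration, and the untraining scenario
is only one hypothetical way the counts could drift apart, not a diagnosis
of what happened in this thread.

```python
from collections import defaultdict


class ToyCounts:
    """Toy stand-in for a token database (hypothetical, not SpamBayes)."""

    def __init__(self):
        self.nham = 0                     # ham messages trained so far
        self.hamcount = defaultdict(int)  # token -> ham messages containing it

    def learn_ham(self, tokens):
        # Each message counts once, and each distinct token once per message.
        self.nham += 1
        for tok in set(tokens):
            self.hamcount[tok] += 1

    def unlearn_ham(self, tokens):
        # Reverses learn_ham; only safe for a message that was really trained.
        self.nham -= 1
        for tok in set(tokens):
            self.hamcount[tok] -= 1

    def inconsistent_tokens(self):
        # Tokens that would trip an "hamcount <= nham" style assertion.
        return sorted(t for t, c in self.hamcount.items() if c > self.nham)


db = ToyCounts()
db.learn_ham(["meeting", "agenda"])
db.learn_ham(["meeting", "lunch"])
consistent_before = db.inconsistent_tokens()   # nothing out of step yet

# Hypothetical mistake: untrain a message that was never trained.
# nham drops to 1, but "meeting" is still recorded in 2 ham messages.
db.unlearn_ham(["unrelated"])
broken_after = db.inconsistent_tokens()
```

Whatever the real cause, retraining from scratch rebuilds the message
counts and the token counts together, which would explain why it clears
the error at least for a while.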
On Mar 9, 2017 9:35 PM, "Fred Smith" <fredex at fcshome.stoneham.ma.us> wrote:
> On Thu, Mar 09, 2017 at 01:57:27PM -0500, Fred Smith wrote:
> later....
> see below
> > On Thu, Mar 09, 2017 at 07:23:38AM -0600, Skip Montanaro wrote:
> > > Fred,
> > > It looks like your training database is corrupt. At the very end of the
> > > long traceback, the message indicates that the count of messages (ham
> > > or spam) in which a particular word appears is greater than the number
> > > of messages in that particular category. I think you should be able to
> > > just retrain from scratch on your existing database.
> > > Skip
> >
> > Sigh.
> >
> > That worked for a little while, then it started doing it again.
> >
> > I've recently started using these macros in mutt:
> >
> > macro index S "|sb_filter.py -s -f | procmail\&\nd"
> > macro pager S "|sb_filter.py -s -f | procmail\&\nd"
> > macro index H "|sb_filter.py -g -f | procmail\&\nd"
> > macro pager H "|sb_filter.py -g -f | procmail\&\nd"
> >
> > and in procmail there are these rules:
> >
> > :0 fw:hamlock
> > | /usr/bin/sb_filter.py -f -d $HOME/.hammiedb
>
> Ah HA! BINGO!
> That's the problem right there: the macros (above) train on the mail,
> then hand it to procmail. Procmail trains on it AGAIN, thereby doubling up
> every mail that gets trained that way in the database.
>
> Those macros are a really HANDY way to fix an incorrect training
> while putting it in the right folder. Is there a way anyone can think
> of that avoids the double training?
>
> thanks in advance!
>
> > # then filter out spam and unsure stuff....
> > :0
> > * ^X-Spambayes-Classification: spam
> > $HOME/Mail/trained.spam
> >
> > :0
> > * ^X-Spambayes-Classification: unsure
> > $HOME/Mail/unsure
> >
> > I don't see why those macros would cause such a problem, but it
> > has started only since I started using them (of course, I also blew
> > away the ancient hammie db and started over with a small corpus of
> > known ham and spam, at the same time).
> >
> > Prior to that I would just save mis-filed mails in either trained.spam
> > or trained.ham and trust that the nightly retraining would do the right
> > thing.
> >
> > any further ideas?
> >
> > thanks in advance!
> >
> > Fred
> >
> > >
> > > On Mar 8, 2017 7:11 PM, "Fred Smith" <[1]fredex at fcshome.stoneham.ma.us>
> > > wrote:
> > >
> > > Hi
> > > All of a sudden this past week I'm getting this whenever a message is
> > > sent to sb_filter to be retrained:
> > >   File "/usr/bin/sb_filter.py", line 5, in <module>
> > >     pkg_resources.run_script('spambayes==1.1a6', 'sb_filter.py')
> > >   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
> > >     self.require(requires)[0].run_script(script_name, ns)
> > >   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1455, in run_script
> > >     execfile(script_filename, namespace, namespace)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 277, in <module>
> > >     main()
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 268, in main
> > >     action(msg)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 186, in filter
> > >     return self.h.filter(msg)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 149, in filter
> > >     debug, train)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 104, in score_and_filter
> > >     prob, clues = self._scoremsg(msg, True)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 33, in _scoremsg
> > >     return self.bayes.spamprob(tokenize(msg), evidence)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 169, in chi2_spamprob
> > >     clues = self._getclues(wordstream)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 472, in _getclues
> > >     tup = self._worddistanceget(word)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 487, in _worddistanceget
> > >     prob = self.probability(record)
> > >   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 287, in probability
> > >     assert hamcount <= nham, "Token seen in more ham than ham trained."
> > > AssertionError: Token seen in more ham than ham trained.
> > > It is possible I got a python update, but I wasn't paying attention, so
> > > I'm not at all sure.
> > > I'm NOT a python guru, so I'd appreciate any guidance any of you can
> > > provide.
> > > thanks in advance!
> > > Fred
> > > --
> > > ---- Fred Smith -- [2]fredex at fcshome.stoneham.ma.us -----------------------------
> > > The Lord detests the way of the wicked
> > > but he loves those who pursue righteousness.
> > > ----------------------------- Proverbs 15:9 (niv) -----------------------------
> > > _______________________________________________
> > > [3]SpamBayes at python.org
> > > [4]https://mail.python.org/mailman/listinfo/spambayes
> > > Info/Unsubscribe: [5]http://mail.python.org/mailman/listinfo/spambayes
> > > Check the FAQ before asking: [6]http://spambayes.sf.net/faq.html
> > >
> > > References
> > >
> > > 1. mailto:fredex at fcshome.stoneham.ma.us
> > > 2. mailto:fredex at fcshome.stoneham.ma.us
> > > 3. mailto:SpamBayes at python.org
> > > 4. https://mail.python.org/mailman/listinfo/spambayes
> > > 5. http://mail.python.org/mailman/listinfo/spambayes
> > > 6. http://spambayes.sf.net/faq.html
> >
> > --
> > ---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
> > The Lord is like a strong tower.
> > Those who do what is right can run to him for safety.
> > --------------------------- Proverbs 18:10 (niv) -----------------------------
>
> --
> -------------------------------------------------------------------------------
> .---- Fred Smith /
> ( /__ ,__. __ __ / __ : /
> / / / /__) / / /__) .+' Home: fredex at fcshome.stoneham.ma.us
> / / (__ (___ (__(_ (___ / :__ 781-438-5471
> -------------------------------- Jude 1:24,25 ---------------------------------