From fredex at fcshome.stoneham.ma.us Wed Mar 8 19:52:44 2017
From: fredex at fcshome.stoneham.ma.us (Fred Smith)
Date: Wed, 8 Mar 2017 19:52:44 -0500
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To: <20170309005103.GD12915@fcshome.stoneham.ma.us>
References: <20170309005103.GD12915@fcshome.stoneham.ma.us>
Message-ID: <20170309005244.GE12915@fcshome.stoneham.ma.us>

On Wed, Mar 08, 2017 at 07:51:03PM -0500, Fred Smith wrote:
> Hi
>
> All of a sudden this past week I'm getting this whenever a message is
> sent to sb_filter to be retrained:
>
>   File "/usr/bin/sb_filter.py", line 5, in
>     pkg_resources.run_script('spambayes==1.1a6', 'sb_filter.py')
>   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
>     self.require(requires)[0].run_script(script_name, ns)
>   File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1455, in run_script
>     execfile(script_filename, namespace, namespace)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 277, in
>     main()
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 268, in main
>     action(msg)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 186, in filter
>     return self.h.filter(msg)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 149, in filter
>     debug, train)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 104, in score_and_filter
>     prob, clues = self._scoremsg(msg, True)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 33, in _scoremsg
>     return self.bayes.spamprob(tokenize(msg), evidence)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 169, in chi2_spamprob
>     clues = self._getclues(wordstream)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 472, in _getclues
>     tup = self._worddistanceget(word)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 487, in _worddistanceget
>     prob = self.probability(record)
>   File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 287, in probability
>     assert hamcount <= nham, "Token seen in more ham than ham trained."
> AssertionError: Token seen in more ham than ham trained.
>

One small addition: the nightly training pass doesn't produce any such errors.

>
> It is possible I got a python update, but I wasn't paying attention, so
> I'm not at all sure.
>
> I'm NOT a python guru, so I'd appreciate any guidance any of you can
> provide.
>
> thanks in advance!
>
> Fred
> --
> ---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
> The Lord detests the way of the wicked
> but he loves those who pursue righteousness.
> ----------------------------- Proverbs 15:9 (niv) -----------------------------

--
-------------------------------------------------------------------------------
Under no circumstances will I ever purchase anything offered to me as
the result of an unsolicited e-mail message. Nor will I forward chain
letters, petitions, mass mailings, or virus warnings to large numbers
of others. This is my contribution to the survival of the online
community.
--Roger Ebert, December, 1996
----------------------------- The Boulder Pledge -----------------------------

From fredex at fcshome.stoneham.ma.us Wed Mar 8 19:51:03 2017
From: fredex at fcshome.stoneham.ma.us (Fred Smith)
Date: Wed, 8 Mar 2017 19:51:03 -0500
Subject: [Spambayes] new python error in sbfilter.py
Message-ID: <20170309005103.GD12915@fcshome.stoneham.ma.us>

Hi

All of a sudden this past week I'm getting this whenever a message is
sent to sb_filter to be retrained:

  File "/usr/bin/sb_filter.py", line 5, in
    pkg_resources.run_script('spambayes==1.1a6', 'sb_filter.py')
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 540, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 1455, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 277, in
    main()
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 268, in main
    action(msg)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/EGG-INFO/scripts/sb_filter.py", line 186, in filter
    return self.h.filter(msg)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 149, in filter
    debug, train)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 104, in score_and_filter
    prob, clues = self._scoremsg(msg, True)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/hammie.py", line 33, in _scoremsg
    return self.bayes.spamprob(tokenize(msg), evidence)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 169, in chi2_spamprob
    clues = self._getclues(wordstream)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 472, in _getclues
    tup = self._worddistanceget(word)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 487, in _worddistanceget
    prob = self.probability(record)
  File "/usr/lib/python2.7/site-packages/spambayes-1.1a6-py2.7.egg/spambayes/classifier.py", line 287, in probability
    assert hamcount <= nham, "Token seen in more ham than ham trained."
AssertionError: Token seen in more ham than ham trained.

It is possible I got a python update, but I wasn't paying attention, so
I'm not at all sure.

I'm NOT a python guru, so I'd appreciate any guidance any of you can
provide.

thanks in advance!

Fred
--
---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
The Lord detests the way of the wicked
but he loves those who pursue righteousness.
----------------------------- Proverbs 15:9 (niv) -----------------------------

From skip.montanaro at gmail.com Thu Mar 9 08:23:38 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Thu, 9 Mar 2017 07:23:38 -0600
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To: <20170309005103.GD12915@fcshome.stoneham.ma.us>
References: <20170309005103.GD12915@fcshome.stoneham.ma.us>
Message-ID:

Fred,

It looks like your training database is corrupt. At the very end of the
long traceback, the message indicates that the count of messages (ham or
spam) in which a particular word appears is greater than the number of
messages in that particular category. I think you should be able to just
retrain from scratch on your existing database.
Skip

From fredex at fcshome.stoneham.ma.us Thu Mar 9 09:37:45 2017
From: fredex at fcshome.stoneham.ma.us (Fred Smith)
Date: Thu, 9 Mar 2017 09:37:45 -0500
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To:
References: <20170309005103.GD12915@fcshome.stoneham.ma.us>
Message-ID: <20170309143745.GA3173@fcshome.stoneham.ma.us>

On Thu, Mar 09, 2017 at 07:23:38AM -0600, Skip Montanaro wrote:
> Fred,
> It looks like your training database is corrupt. At the very end of the
> long traceback, the message indicates that the count of messages (ham
> or spam) in which a particular word appears is greater than the number
> of messages in that particular category. I think you should be able to
> just retrain from scratch on your existing database.
> Skip

thanks Skip!

I wondered if that was the case, but thought to ask before messing with it.

I'll give that a whirl.

thanks again!
Fred

--
---- Fred Smith -- fredex at fcshome.stoneham.ma.us ----------------------------
Do you not know? Have you not heard?
The LORD is the everlasting God, the Creator of the ends of the earth.
He will not grow tired or weary, and his understanding no one can fathom.
----------------------------- Isaiah 40:28 (niv) -----------------------------

From fredex at fcshome.stoneham.ma.us Thu Mar 9 13:57:27 2017
From: fredex at fcshome.stoneham.ma.us (Fred Smith)
Date: Thu, 9 Mar 2017 13:57:27 -0500
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To:
References: <20170309005103.GD12915@fcshome.stoneham.ma.us>
Message-ID: <20170309185727.GA7277@fcshome.stoneham.ma.us>

On Thu, Mar 09, 2017 at 07:23:38AM -0600, Skip Montanaro wrote:
> Fred,
> It looks like your training database is corrupt. At the very end of the
> long traceback, the message indicates that the count of messages (ham
> or spam) in which a particular word appears is greater than the number
> of messages in that particular category. I think you should be able to
> just retrain from scratch on your existing database.
> Skip

Sigh.

That worked. for a little while. then it started doing it again.

I've recently started using these macros in mutt:

macro index S "|sb_filter.py -s -f | procmail\&\nd"
macro pager S "|sb_filter.py -s -f | procmail\&\nd"
macro index H "|sb_filter.py -g -f | procmail\&\nd"
macro pager H "|sb_filter.py -g -f | procmail\&\nd"

and in procmail there are these rules:

:0 fw:hamlock
| /usr/bin/sb_filter.py -f -d $HOME/.hammiedb

# then filter out spam and unsure stuff....
:0
* ^X-Spambayes-Classification: spam
$HOME/Mail/trained.spam

:0
* ^X-Spambayes-Classification: unsure
$HOME/Mail/unsure

I don't see why those macros would cause such a problem, but it
has started only since I started using them (of course, I also blew
away the ancient hammie db and started over with a small corpus of
known ham and spam, at the same time).

Prior to that I would just save mis-filed mails in either trained.spam
or trained.ham and trust that the nightly retraining would do the right
thing.

any further ideas?

thanks in advance!
Fred

--
---- Fred Smith -- fredex at fcshome.stoneham.ma.us -----------------------------
The Lord is like a strong tower.
Those who do what is right can run to him for safety.
--------------------------- Proverbs 18:10 (niv) -----------------------------

From fredex at fcshome.stoneham.ma.us Thu Mar 9 22:34:50 2017
From: fredex at fcshome.stoneham.ma.us (Fred Smith)
Date: Thu, 9 Mar 2017 22:34:50 -0500
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To: <20170309185727.GA7277@fcshome.stoneham.ma.us>
References: <20170309005103.GD12915@fcshome.stoneham.ma.us> <20170309185727.GA7277@fcshome.stoneham.ma.us>
Message-ID: <20170310033450.GA15312@fcshome.stoneham.ma.us>

On Thu, Mar 09, 2017 at 01:57:27PM -0500, Fred Smith wrote:

later.... see below

> On Thu, Mar 09, 2017 at 07:23:38AM -0600, Skip Montanaro wrote:
> > Fred,
> > It looks like your training database is corrupt. At the very end of the
> > long traceback, the message indicates that the count of messages (ham
> > or spam) in which a particular word appears is greater than the number
> > of messages in that particular category. I think you should be able to
> > just retrain from scratch on your existing database.
> > Skip
>
> Sigh.
>
> That worked. for a little while. then it started doing it again.
>
> I've recently started using these macros in mutt:
>
> macro index S "|sb_filter.py -s -f | procmail\&\nd"
> macro pager S "|sb_filter.py -s -f | procmail\&\nd"
> macro index H "|sb_filter.py -g -f | procmail\&\nd"
> macro pager H "|sb_filter.py -g -f | procmail\&\nd"
>
> and in procmail there are these rules:
>
> :0 fw:hamlock
> | /usr/bin/sb_filter.py -f -d $HOME/.hammiedb

Ah HA! BINGO!
that's the problem right there... the macros (above) train on the mail
then hand it to procmail. Procmail trains it AGAIN, thereby doubling up
every mail that gets trained that way in the database.

Those macros are a really HANDY way to fix an incorrect training
while putting it in the right folder. Is there a way anyone can think
of that avoids the double training?

thanks in advance!

--
-------------------------------------------------------------------------------
.---- Fred Smith /
( /__ ,__. __ __ / __ : /
/ / / /__) / / /__) .+' Home: fredex at fcshome.stoneham.ma.us
/ / (__ (___ (__(_ (___ / :__ 781-438-5471
-------------------------------- Jude 1:24,25 ---------------------------------

From skip.montanaro at gmail.com Fri Mar 10 18:02:16 2017
From: skip.montanaro at gmail.com (Skip Montanaro)
Date: Fri, 10 Mar 2017 17:02:16 -0600
Subject: [Spambayes] new python error in sbfilter.py
In-Reply-To: <20170310033450.GA15312@fcshome.stoneham.ma.us>
References: <20170309005103.GD12915@fcshome.stoneham.ma.us> <20170309185727.GA7277@fcshome.stoneham.ma.us> <20170310033450.GA15312@fcshome.stoneham.ma.us>
Message-ID:

I would avoid training on every message in your procmailrc file, and only
use the mutt macros to train on misses and unsures. I would only use a
procmail recipe to score incoming messages.

Skip
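
As an illustration of the check that fails at the end of the traceback, here
is a minimal Python sketch. It is a toy model and not SpamBayes code: the
class ToyHamCounts, its methods, and the helper is_consistent are hypothetical
names made up for this example. Only the names hamcount and nham and the
assertion message come from the traceback. The sketch shows the relationship
Skip describes (a per-token ham count can never legitimately exceed the total
number of ham messages trained) and what a database that violates it looks
like. The inconsistent state is created by hand; the sketch makes no claim
about how it arose in Fred's setup.

```python
from collections import Counter


class ToyHamCounts:
    """Toy model of the two counters the assertion compares (not SpamBayes code)."""

    def __init__(self):
        self.nham = 0              # total ham messages trained so far
        self.hamcount = Counter()  # token -> number of ham messages containing it

    def train_ham(self, tokens):
        # One message: bump the message total once and each distinct token once.
        self.nham += 1
        for token in set(tokens):
            self.hamcount[token] += 1

    def check(self, token):
        # Same condition and message as the assertion in the traceback,
        # written as an explicit raise so it also fires under "python -O".
        if not self.hamcount[token] <= self.nham:
            raise AssertionError("Token seen in more ham than ham trained.")


def is_consistent(db, token):
    """Return True if the invariant holds for token, False if it is violated."""
    try:
        db.check(token)
    except AssertionError:
        return False
    return True


db = ToyHamCounts()
db.train_ham(["hello", "meeting", "hello"])
db.train_ham(["hello", "lunch"])
healthy = is_consistent(db, "hello")   # hamcount["hello"] == 2, nham == 2

# Hand-made inconsistency (hypothetical): the per-token count runs ahead of nham.
db.hamcount["hello"] += 1
damaged = is_consistent(db, "hello")   # hamcount["hello"] == 3, nham == 2

print(healthy, damaged)
```

Run as a script, this prints "True False": the consistently trained counters
pass the check, and the hand-damaged ones trip the same condition that
classifier.py asserts.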