[Spambayes] Spambayes Digest, Vol 83, Issue 11

Oded odegani at pacbell.net
Fri Jul 8 08:07:14 CEST 2005


Thanks for your answers! Here are my responses:

1.	I checked, and Outlook has NOT disabled SpamBayes.
2.	I didn't know a ratio of approximately 1:1 is required. I can't
understand how the ratio got corrupted in the first place... Maybe because
of my several attempts at retraining without first deleting the database. I'll
try that approach now.
3.	No, I didn't mean that the toolbar wasn't removed. I meant that the
"Add/Remove" function of Windows didn't remove the SpamBayes program. I
checked and it is still present in the Windows Explorer "Program Files" as
well as in Start/All Programs!
4.	I noticed that the "Delete as Spam" button disappeared from my list.
This is another reason to remove SpamBayes altogether and reinstall it,
isn't it? However, I need to know how to properly remove SpamBayes.

Thanks,
 

Oded   

From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of spambayes-request at python.org
Sent: Thursday, July 07, 2005 8:48 PM
To: spambayes at python.org
Subject: Spambayes Digest, Vol 83, Issue 11



Send Spambayes mailing list submissions to
        spambayes at python.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://mail.python.org/mailman/listinfo/spambayes
or, via email, send a message with subject or body 'help' to
        spambayes-request at python.org

You can reach the person managing the list at
        spambayes-owner at python.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Spambayes digest..."


Today's Topics:

   1. Re: quit working (Tony Meyer)
   2. My good messages are directed to the junk mail box (Oded)
   3. Re: My good messages are directed to the junk mail box
      (Tony Meyer)
   4. Re: My good messages are directed to the junk mail box
      (Tony Meyer)
   5. Re: DB error... (Tony Meyer)
   6. Re: Fw: Undelivered Mail Returned to Sender (Tony Meyer)
   7. Re: Installing SpamBayes for Outlook Express 6 (Tony Meyer)
   8. Re: Installing SpamBayes for Outlook Express 6 (Tony Meyer)
   9. To label or not to label, a practical question (Michael D. Adams)
  10. Re: sb_imapfilter.py error with option -e y (Tony Meyer)
  11. Re: To label or not to label, a practical question (Tony Meyer)
  12. Re: My good messages are directed to the junk mail box
      (Tony Meyer)


----------------------------------------------------------------------

Message: 1
Date: Fri, 8 Jul 2005 11:42:27 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] quit working
To: <jackieh at gslc.com>, <spambayes at python.org>
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB0374 at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> WE were using spambayes all the time, then it just quit working.

See if Outlook has disabled the plug-in: Help->About Microsoft
Outlook->Disabled Items.

=Tony.Meyer

--
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.



------------------------------

Message: 2
Date: Thu, 7 Jul 2005 16:55:22 -0700
From: "Oded" <odegani at pacbell.net>
Subject: [Spambayes] My good messages are directed to the junk mail
        box
To: <spambayes at python.org>
Message-ID: <20050707235527.8CEA61E4004 at bag.python.org>
Content-Type: text/plain; charset="iso-8859-1"

All of a sudden, after a year of good experience with SpamBayes, all of my
good messages are directed to the junk mail box while much junk mail ends up
in my "In" box. I was instructed to look at the log file (attached), but there
is so much data there that I am not sure what to look for.

Attempts to retrain the system didn't help.

I tried to remove SpamBayes using the Windows "Add/Remove", but it didn't
remove it!!! Why? Reinstallation didn't help. How can I proceed?

Thanks,

Oded 



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/spambayes/attachments/20050707/ca91b903/attachment-0001.htm
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spambayes1.log
Type: application/octet-stream
Size: 13035 bytes
Desc: not available
Url : http://mail.python.org/pipermail/spambayes/attachments/20050707/ca91b903/spambayes1-0001.obj

------------------------------

Message: 3
Date: Fri, 8 Jul 2005 12:00:23 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] My good messages are directed to the junk
        mail box
To: "'Oded'" <odegani at pacbell.net>, <spambayes at python.org>
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB0375 at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> All of a sudden, after a year of good experience with
> spambayes, all of my good messages are directed to the
> junk mail box while many junk mail ends up in my "In" box.

The most important part of the log is probably this:

> Bayes database initialized with 5 spam and 536 good messages

SpamBayes works best with roughly even amounts of trained ham and spam.  You
have a ratio of over 100:1, which is almost certain to cause problems.
Retraining from scratch is probably the best option.  There's lots of
information about training at <http://entrian.com/sbwiki/TrainingIdeas> if
you're interested.
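
As a rough illustration of the imbalance, the counts can be pulled out of the
log line quoted above and turned into a ratio.  This is only a sketch: the
function name is made up for the example, and the pattern assumes the log
line keeps exactly the wording shown in this message.

```python
import re

def training_ratio(log_line):
    """Return (spam, ham, ham_per_spam) parsed from the log line."""
    match = re.search(r"(\d+) spam and (\d+) good messages", log_line)
    if match is None:
        raise ValueError("no training counts found in line")
    spam, ham = int(match.group(1)), int(match.group(2))
    # Avoid dividing by zero when no spam has been trained at all.
    ratio = ham / spam if spam else float("inf")
    return spam, ham, ratio

line = "Bayes database initialized with 5 spam and 536 good messages"
spam, ham, ratio = training_ratio(line)
print(f"{ham} ham : {spam} spam (about {ratio:.0f}:1)")
```

For the line above this reports 536 ham against 5 spam, i.e. roughly 107 ham
messages for every spam message.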

> I tried to remove Spam Bayes using the Windows "Add/Remove",
> but it didn't remove it!!!

You probably mean that the toolbar wasn't removed.  This is a known bug; see
FAQ 3.16:

<http://spambayes.org/faq.html#how-do-i-uninstall-the-plug-in>

=Tony.Meyer




------------------------------

Message: 4
Date: Fri, 8 Jul 2005 14:35:47 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] My good messages are directed to the junk
        mail box
To: "'KD5NWA'" <kd5nwa at cox.net>
Cc: spambayes at python.org
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB0379 at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

>> Retraining from scratch is probably the best option.
>
> So how do you that? I have about a 400:1 Ham to Spam ratio at
> this momment.

You can either delete the existing databases (default_bayes_database.db,
default_message_database.db) manually (they are in the 'data directory';
SpamBayes->SpamBayes Manager->Advanced->Show Data Directory will open it),
or use the "Training" tab (or the wizard, probably).
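
If you would rather keep a copy than delete the files outright, something
like the following sketch would move them aside instead.  The two file names
are the ones given above; the function name, the backup folder naming, and
the idea of moving rather than deleting are assumptions made for this
example, and it presumably should only be run while Outlook is closed.

```python
import shutil
import tempfile
import time
from pathlib import Path

DB_NAMES = ("default_bayes_database.db", "default_message_database.db")

def set_aside_databases(data_dir):
    """Move the training databases into a timestamped backup folder."""
    data_dir = Path(data_dir)
    backup_dir = data_dir / time.strftime("backup-%Y%m%d-%H%M%S")
    moved = []
    for name in DB_NAMES:
        source = data_dir / name
        if source.exists():
            # Create the backup folder only when there is something to move.
            backup_dir.mkdir(exist_ok=True)
            shutil.move(str(source), str(backup_dir / name))
            moved.append(name)
    return moved

# Demonstration on a throwaway directory rather than a real data directory.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in DB_NAMES:
        (Path(demo_dir) / name).write_bytes(b"")
    print("moved:", set_aside_databases(demo_dir))
```

For real use, pass the folder opened by "Show Data Directory" instead of the
temporary directory.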

If you use the "Training" tab, just be sure to tick the "Rebuild entire
database" box.

The training method that we recommend is to start with little or no
training, and train only on misclassified and unsure messages using the
"Delete as Spam" and "Recover from Spam" buttons.  This generally gives the
best results.  One other thing that can help avoid imbalance is adjusting
the ham/spam thresholds ("Filtering" tab) once you have the classifier
reasonably trained.

=Tony.Meyer




------------------------------

Message: 5
Date: Fri, 8 Jul 2005 15:07:44 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] DB error...
To: <seqenenra at hotpop.com>
Cc: spambayes at python.org
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB037A at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> I am not sure what I did to spambayes, but it sure isn't
> liking me too much. Everything was running great today after 
> I started using the pickle, and then this evening it crapped
> out with the following.
[...]
> EOFError

This means that the (pickle) database is corrupt.  Most likely this is caused by
the same problems that affect bsddb databases in 1.1a1.

Unless you really want/need features that are new in 1.1, I highly recommend
using 1.0.4, at least until there is a 1.1a2 release.

=Tony.Meyer




------------------------------

Message: 6
Date: Fri, 8 Jul 2005 15:10:24 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] Fw: Undelivered Mail Returned to Sender
To: "'Trevor Lawrence'" <tandcl at homemail.com.au>,
        <spambayes at python.org>
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB037B at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> I attempted to forward a Spam message to localhost to train
> SpamBayes and it failed
[...]
> > <spambayes_spam at localhost>

It looks like you're using Outlook Express - we really don't recommend
training with the SMTP proxy for Outlook Express users (OE is too limited in
what it can do).

This specific error looks like the SMTP proxy isn't set up (or OE isn't set
to *send* mail via localhost).  However, training via the web interface
would definitely be a better idea.

=Tony.Meyer




------------------------------

Message: 7
Date: Fri, 8 Jul 2005 15:12:24 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] Installing SpamBayes for Outlook Express 6
To: "'Trevor Lawrence'" <tandcl at homemail.com.au>
Cc: spambayes at python.org
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801B0F7BC at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> All well and good. I did as in FAQ 4.21 and a test message
> got through.
>
> However, I now cannot open localhost:8880 from the icon on
> the tray. I try clicking on Review Messages View Information Configire
>
> All return
>       The page cannot be displayed
>       The page you are looking for is currently unavailable.
> The Web site might be experiencing technical difficulties, or you
> may need to adjust your browser settings.
>
> However, a message did get through which I want to classify
> as Spam, but I can't

What if you try <http://127.0.0.1:8080>, does that work?

Could you find your SpamBayes log files (in your temp directory, \Documents
and Settings\{username}\Local Settings\temp in WinXP, called
SpambayesServerX.log, where X is 1-4) and send us a copy of them?
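
If hunting for the files by hand is awkward, a short script along these lines
could list them.  It is a sketch that assumes Python's idea of the temp
directory matches the one SpamBayes writes to; the function name is invented
for the example.

```python
import tempfile
from pathlib import Path

def server_log_paths(temp_dir=None):
    """Return the four candidate log paths, SpambayesServer1.log to 4."""
    temp_dir = Path(temp_dir or tempfile.gettempdir())
    return [temp_dir / f"SpambayesServer{n}.log" for n in range(1, 5)]

for path in server_log_paths():
    status = "found" if path.exists() else "missing"
    print(f"{path}: {status}")
```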

=Tony.Meyer




------------------------------

Message: 8
Date: Fri, 8 Jul 2005 15:16:02 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] Installing SpamBayes for Outlook Express 6
To: "'Trevor Lawrence'" <tandcl at homemail.com.au>
Cc: spambayes at python.org
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB037D at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

[Me]
> What if you try <http://127.0.0.1:8080>, does that work?

I meant <http://127.0.0.1:8880>, of course.
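
A quick way to tell whether anything is listening on that port at all,
independent of the browser, is a plain socket connection.  This is a generic
check rather than anything SpamBayes provides; the function name and the
two-second timeout are arbitrary choices for the sketch.

```python
import socket

def port_is_open(host="127.0.0.1", port=8880, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

print("web interface reachable:", port_is_open())
```

If this prints False, the proxy is probably not running (or is listening on a
different port), which would point at the server rather than the browser
settings.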

=Tony.Meyer




------------------------------

Message: 9
Date: Thu, 7 Jul 2005 22:25:45 -0500
From: "Michael D. Adams" <mdmkolbe at gmail.com>
Subject: [Spambayes] To label or not to label, a practical question
To: spambayes at python.org
Message-ID: <c62c8d8605070720256eff35f3 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

My ISP provides a spam filtering service (server side) that labels the
things that they think are spam by putting an extra string in the
subject like (e.g. "--Spam--" at the front).  Their filters don't
catch everything so I want to also use SpamBayes to eliminate the spam
that my ISP doesn't label.  My question is whether or not I should
train SpamBayes with the spams that get labeled by my ISP.  I could
easily see SpamBayes picking up on the "--Spam--" string in the
subject line and filtering just based on that.  On the other hand
maybe that would introduce some selection bias or a bad spam vs ham
ratio for training (e.g. maybe I'll get 50 ham, 40 spam caught by my
ISP, and 10 spam not caught by my ISP (I don't know what the ratio is
yet, I only just started using my ISP's filter)).

Does anyone have any advice on whether these might interfere or how to
avoid that interference?  Should I even be using my ISP's filter along
with SpamBayes or just SpamBayes by itself?

Michael D. Adams
mdmkolbe at gmail.com


------------------------------

Message: 10
Date: Fri, 8 Jul 2005 15:25:52 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] sb_imapfilter.py error with option -e y
To: "'Rafael Scholl'" <rafael.scholl at bbox.ch>, <spambayes at python.org>
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801B0F7BD at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

>> Could you rerun that with "-i4" as well ("sb_imapfilter.py
>> -c -v -e y -i4"), [...] and send that to us?  The extra debugging
>> info will help figure out what the problem is.
>
> Here it is:
[...]
>    28:07.50 > EDGG46 SELECT ""
>    28:07.51 < EDGG46 BAD too few arguments for SELECT
[...]
> imaplib.error: SELECT command error: BAD ['too few arguments
> for SELECT']

Turns out that this is a bug.  I can send you a patch, a new
sb_imapfilter.py, or instructions about how to modify your copy to fix it,
if you would like (it's a very simple change).  Let me know which you'd
prefer.

(Or if you don't use the -e option, this won't happen, since the bug is in
the expunging code).
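
To show the general shape of the problem (this is NOT the actual patch, whose
contents are not described here), the server rejects a SELECT with an empty
folder name, so a defensive caller could skip such names before selecting.
The helper and the stand-in class below are hypothetical and exist only for
illustration; a real caller would pass an imaplib.IMAP4 connection.

```python
class RecordingIMAP:
    """Stand-in for imaplib.IMAP4 that just records SELECT calls."""
    def __init__(self):
        self.selected = []

    def select(self, mailbox="INBOX", readonly=False):
        self.selected.append(mailbox)
        return "OK", [b"0"]

def select_named_folders(imap, folder_names):
    """SELECT each non-empty folder name; return the names skipped."""
    skipped = []
    for name in folder_names:
        if not name or not name.strip():
            # An empty name is what produced 'SELECT ""' in the log.
            skipped.append(name)
            continue
        imap.select(name)
    return skipped

imap = RecordingIMAP()
skipped = select_named_folders(imap, ["INBOX", "", "Spam"])
print("selected:", imap.selected, "skipped:", skipped)
```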

=Tony.Meyer




------------------------------

Message: 11
Date: Fri, 8 Jul 2005 15:35:31 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] To label or not to label, a practical
        question
To: "'Michael D. Adams'" <mdmkolbe at gmail.com>, <spambayes at python.org>
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801B0F7BE at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> My ISP provides a spam filtering service (server side) that labels the
> things that they think are spam by putting an extra string in the
> subject like (e.g. "--Spam--" at the front).  Their filters don't
> catch everything so I want to also use SpamBayes to eliminate the spam
> that my ISP doesn't label.  My question is whether or not I should
> train SpamBayes with the spams that get labeled by my ISP.  I could
> easily see SpamBayes picking up on the "--Spam--" string in the
> subject line and filtering just based on that.

It's possible (even likely, assuming that their filter is any good) that
"subject:--Spam--" will become a strong spam clue, but hopefully there would
be enough ham clues in the (ISP's) false positive that SpamBayes would still
be able to make a correct (or perhaps unsure) classification.

The only way to know for sure is to give it a go.  You can see how strong
the "subject:--Spam--" clue is by looking at the clues for a message with
such a modified subject.
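
For a feel of how strong such a clue might get, here is a sketch of a
Robinson-style per-token estimate.  It is an illustration under assumptions,
not a copy of the classifier: the function name is made up, the 0.5 and 0.45
constants are assumed defaults, and the real score combines many clues rather
than a single one.

```python
def spam_probability(spam_count, ham_count, nspam, nham,
                     unknown_prob=0.5, unknown_strength=0.45):
    """Estimate how spammy one token is from training counts.

    spam_count/ham_count: trained messages of each kind containing the token.
    nspam/nham: total trained spam and ham messages.
    """
    if nspam <= 0 or nham <= 0:
        raise ValueError("need at least one trained spam and one trained ham")
    n = spam_count + ham_count
    if n == 0:
        return unknown_prob  # never seen: fall back to the prior
    # Using per-class ratios keeps the estimate from simply tracking the
    # overall ham:spam balance of the training data.
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    raw = spam_ratio / (spam_ratio + ham_ratio)
    # Shrink towards the prior when the token has been seen only rarely.
    return (unknown_strength * unknown_prob + n * raw) / (unknown_strength + n)

# Hypothetical counts: the tag appears on 40 of 50 spam and 1 of 50 ham.
print(f"{spam_probability(40, 1, 50, 50):.3f}")
```

With counts like these the tag is a strong spam clue but not an absolute one,
so a tagged false positive with plenty of hammy words still has a chance.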

The situation with my work email is similar - I can opt out of their spam
filtering, but that means that they will prepend "[SPAM]" to the subject.  I
ignore their classification and SpamBayes still works fine.  However, they
have some really terrible false positives, which means that "subject:[SPAM]"
isn't as strong a clue as it would be otherwise.

> On the other hand maybe that would introduce some selection bias or a
> bad spam vs ham ratio for training (e.g. maybe I'll get 50 ham, 40 spam
> caught by my ISP, and 10 spam not caught by my ISP (I don't know what
> the ratio is yet, I only just started using my ISP's filter)).

It's all guesswork at the moment, but you might find that it helps with
keeping a ~1:1 ratio.  Ham tends to be reasonably homogeneous, so you
generally need to train on less of it than spam (assuming you're doing some
sort of train-on-errors-and-unsures training), so this might help balance
that out.

> Does anyone have any advice on whether these might interfere or how to
> avoid that interference?  Should I even be using my ISP's filter along
> with SpamBayes or just SpamBayes by itself?

If the ISP's filter is reasonably good, then you might as well use it as
well; plenty of people like this sort of tiered filter system.

I expect that you'll find that it doesn't interfere at all; the only way to
know for sure is to try it out, though.  (Maybe after training for a while,
you could get someone to send you a ham message with "--Spam--" in the
subject, and see if the hammy clues are enough to get it through).  Let us
know if you find out anything interesting!  :)

=Tony.Meyer




------------------------------

Message: 12
Date: Fri, 8 Jul 2005 15:48:04 +1200
From: "Tony Meyer" <tameyer at ihug.co.nz>
Subject: Re: [Spambayes] My good messages are directed to the junk
        mail box
To: "'KD5NWA'" <kd5nwa at cox.net>
Cc: spambayes at python.org
Message-ID:
        <ECBA357DDED63B4995F5C1F5CBE5B1E801DB0380 at its-xchg4.massey.ac.nz>
Content-Type: text/plain;       charset="us-ascii"

> Sound like Outlook, I do not use Outlook, I use the Proxy server and
> Eudora for the mail client.

Sorry.  Then just delete the databases manually.  The configuration page
shows the path to the data directory.

[...]
> In another PC I turned on suppress caching of bulk ham after about
> two weeks of training it has a 26:1 HAM to SPAM ratio.

That's still much higher than we recommend.  However, if it's working, then
there's no point changing it.

> I belong to several email list that are moderated, there is zero spam
> coming from those list but there are 30 to 100 HAM emails coming
> daily from those list.

Using the option to not cache bulk mail (configuration page) would help with
this; you wouldn't then see any of these (assuming that the mailing list
uses the "Precedence" header) in the review pages.
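
To check whether a given list actually sets that header, a message can be
inspected with the standard email module.  This sketch is not SpamBayes code,
and the set of values treated as "bulk" here is an assumption made for the
example.

```python
from email import message_from_string

# Assumed set of Precedence values that mark automated or list mail.
BULK_VALUES = {"bulk", "list", "junk"}

def is_bulk(raw_message):
    """Return True if the message carries a bulk-style Precedence header."""
    msg = message_from_string(raw_message)
    precedence = (msg.get("Precedence") or "").strip().lower()
    return precedence in BULK_VALUES

sample = (
    "From: list@example.org\n"
    "Subject: [Example] weekly digest\n"
    "Precedence: list\n"
    "\n"
    "Body text.\n"
)
print(is_bulk(sample))
```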

> I suppose I could for a while change the settings so HAM mail has the
> discard option as the default that would stop building up HAM email
> until I stop it.

Ideally (see <http://entrian.com/sbwiki/TrainingIdeas>) you should set both
ham and spam to discard, and just train any false positives (ham identified
as spam), false negatives (spam identified as ham) and unsures.
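
Put as a rule, that policy trains only when the filter's verdict differs from
the true class.  The sketch below uses made-up names and plain strings for
the three verdicts; it illustrates the rule and nothing more.

```python
def needs_training(predicted, actual):
    """Decide whether a message should be trained under train-on-errors.

    predicted: "ham", "spam" or "unsure" (the filter's verdict).
    actual: "ham" or "spam" (what the message really is).
    """
    if actual not in ("ham", "spam"):
        raise ValueError("actual must be 'ham' or 'spam'")
    # An unsure verdict never equals the true class, so unsures are
    # trained along with false positives and false negatives.
    return predicted != actual

for predicted, actual in [("ham", "ham"), ("spam", "ham"),
                          ("ham", "spam"), ("unsure", "spam")]:
    print(predicted, actual, needs_training(predicted, actual))
```

Correctly classified ham and spam are simply discarded, which is what keeps
the trained counts from drifting far apart.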

> I assume by adjusting the threshold you mean lower it for HAM
> and SPAM?

Yes.  For example, the default thresholds for the proxy are 0.2 and 0.9 -
you will probably find after training for a while that you can change these
to something like 0.1 and 0.8.
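
The effect of moving the two cutoffs can be seen with a small sketch.  The
0.2/0.9 and 0.1/0.8 pairs are the figures mentioned above; the function name
and the handling of scores that fall exactly on a cutoff are assumptions.

```python
def classify(score, ham_cutoff=0.2, spam_cutoff=0.9):
    """Map a spam score in [0, 1] to 'ham', 'unsure' or 'spam'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:
        return "spam"
    return "unsure"

for score in (0.05, 0.15, 0.85, 0.95):
    default = classify(score)
    lowered = classify(score, ham_cutoff=0.1, spam_cutoff=0.8)
    print(f"{score:.2f}: default={default}, lowered={lowered}")
```

Lowering both values makes the filter stricter about what counts as ham and
more willing to call something spam, so scores near 0.15 move from ham to
unsure and scores near 0.85 move from unsure to spam.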

=Tony.Meyer




------------------------------

_______________________________________________
Spambayes at python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html

End of Spambayes Digest, Vol 83, Issue 11
***************************************** 


