From sethg at goodmanassociates.com  Tue Jan  2 05:03:22 2007
From: sethg at goodmanassociates.com (Seth Goodman)
Date: Mon, 1 Jan 2007 22:03:22 -0600
Subject: [spambayes-dev] FAQ 6.5
In-Reply-To: <001601c72c98$ca8feeb0$0201010a@goodgrief>
Message-ID: <MHEGIFHMACFNNIMMBACAKELMOCAA.sethg@goodmanassociates.com>

There are more important reasons to not bounce spam than internet
congestion.  A bounce is a class of automated message called a delivery
status notification (DSN).  A recipient MTA that accepts a message for
delivery must send a DSN to the return-path address if the MTA is unable
to make final delivery.  Spambayes runs in the MUA only after final
message delivery, so you can't say the message wasn't delivered :)  For
this reason, SMTP makes no provision for an MUA to ever send a DSN.

More importantly, there is no reliable bounce address in a message that
later turns out to be spam.  In fact, we know that the return-path is
virtually always forged.  Generating a bounce after acceptance will
abuse an innocent third party, if it is deliverable at all.

Many MTAs persisted for a number of years in promiscuously accepting
all messages for their domains and sending DSNs later for undeliverable
messages.  Operating an MTA this way is called a store-and-forward
configuration.  Once people started using IP blacklists, spammers
quickly realized that they could trick MTAs that were not blacklisted
into delivering their spam.  They would simply address the spam to an
undeliverable address at a domain with a good reputation, let's say
bogus at aol.com, and put the real target address into the return-path,
say victim at poorslob.com.  AOL's MTA accepts the message, since it
purports to be for an AOL customer.  Then it finds it has no mailbox
named 'bogus' and sends a bounce message containing the spam to
victim at poorslob.com, assuming that address was the originator.  The
MTA at poorslob.com accepts all messages from aol.com, so it accepts and
delivers the spam and then blames AOL.

So the best answer as to why it is inappropriate to bounce spam is that
it turns your MTA into a spam reflector, which will properly get you
blacklisted for abuse.
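
As a rough illustration of the difference, here is a minimal Python
sketch.  It is not taken from any real MTA: the function names, the
mailbox set and the alice address are hypothetical, and the reply
strings are only examples of the 250/550 reply classes.  The first
function refuses an unknown recipient while the SMTP session is still
open, so no DSN is ever generated; the second models the
accept-then-bounce behaviour described above and returns the address
that would receive the bounce.

```python
# Hypothetical sketch, not code from any real MTA.

def rcpt_reply(recipient, local_mailboxes):
    """Return an SMTP reply line for one RCPT TO command."""
    if recipient.lower() in local_mailboxes:
        return "250 2.1.5 OK"
    # Refusing here means this server never has to generate a DSN;
    # the connecting client still holds the undelivered message.
    return "550 5.1.1 No such user"


def accept_then_bounce(recipient, return_path, local_mailboxes):
    """Model accept-first behaviour.

    Returns the address a bounce would be sent to, or None if the
    message could be delivered locally.
    """
    if recipient.lower() in local_mailboxes:
        return None
    # The return-path is taken on trust, so a forged one turns the
    # bounce into a delivery to whoever the spammer named.
    return return_path


mailboxes = {"alice@aol.com"}  # hypothetical local mailbox list
print(rcpt_reply("bogus@aol.com", mailboxes))
print(accept_then_bounce("bogus@aol.com", "victim@poorslob.com", mailboxes))
```

In the second case the spam ends up addressed to the forged
return-path, which is exactly the reflector behaviour.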

--
Seth Goodman


From skip at pobox.com  Wed Jan 17 04:18:51 2007
From: skip at pobox.com (skip at pobox.com)
Date: Tue, 16 Jan 2007 21:18:51 -0600
Subject: [spambayes-dev] We could use some Windows help I think...
Message-ID: <17837.38299.391291.73313@montanaro.dyndns.org>

It would appear that all of our Windows programming expertise is more-or-
less permanently booked these days.  Most of the user questions to the
spambayes list seem to be related to Outlook or Outlook Express.  Those
questions tend to get answered most of the time, but getting a new release
tested, built and out the door (one is sorely needed I think) is tough
because of the Windows barriers.

I propose we solicit some new Windows programming help from the broader
Python community (http://wiki.python.org/moin/VolunteerOpportunities and/or
comp.lang.python).  Any thoughts about that?

Skip

From mhammond at skippinet.com.au  Thu Jan 18 03:34:39 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Thu, 18 Jan 2007 13:34:39 +1100
Subject: [spambayes-dev] We could use some Windows help I think...
In-Reply-To: <17837.38299.391291.73313@montanaro.dyndns.org>
Message-ID: <044401c73aa9$34557bd0$180a0a0a@enfoldsystems.local>

> It would appear that all of our Windows programming expertise
> is more-or-
> less permanently booked these days.  Most of the user questions to the
> spambayes list seem to be related to Outlook or Outlook
> Express.  Those
> questions tend to get answered most of the time, but getting
> a new release
> tested, built and out the door (one is sorely needed I think) is tough
> because of the Windows barriers.
>
> I propose we solicit some new Windows programming help from
> the broader
> Python community
> (http://wiki.python.org/moin/VolunteerOpportunities and/or
> comp.lang.python).  Any thoughts about that?

In general I think that is a great idea.

However, my quick scan of the support issues doesn't indicate that a new
version would actually reduce the number of problems.  I thought that most
of the new stuff relates to new features, rather than addressing the
common problems people see.  I'm happy to help knock up a new release if I'm
wrong though.  Certainly, any new talent could help to address such issues
for a future release.

I have been playing a little more with the new OCR code and Outlook.  Sadly,
I'm not seeing much of a reduction in image spam.  My experience is
currently that ocrad is doing a poor job of extracting the text in these
spams (even with many options tweaked), but that gocr (as used by
SpamAssassin) does a much better job.  I haven't managed to run the tests
with this new code yet though.  The absence of any real interest from others
on spambayes-dev doesn't help my motivation levels, so that is yet another
good reason to try and get more Windows developers on board :)

Cheers,

Mark


From skip at pobox.com  Thu Jan 25 20:42:57 2007
From: skip at pobox.com (skip at pobox.com)
Date: Thu, 25 Jan 2007 13:42:57 -0600
Subject: [spambayes-dev] Rebuild/reinstall website?
Message-ID: <17849.2113.677489.425017@montanaro.dyndns.org>

I corrected some SF website url errors and checked in the relevant pages.
When I tried to make the website it complained about not finding html.py.
scripts/make.rules has this:

    # docutils 'html.py' script.
    DUHTML = html.py

I tried easy_install docutils but that didn't yield html.py.  It looks like
rst2html.py is the replacement, so I made that change and ran "make
install".  I got some errors (files I couldn't upload), but most of it
seemed to work okay.  If someone else can give it a try that would help
boost my confidence that it isn't just me.
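
For anyone trying the build, a small Python sketch like the following
can report which docutils front-end script is actually on PATH before
DUHTML is edited.  The helper name is hypothetical, and the candidate
list is just the two script names mentioned above; other installs may
use a different name.

```python
import shutil

# The two script names mentioned in this thread; other names are possible.
CANDIDATES = ("rst2html.py", "html.py")


def find_docutils_html(which=shutil.which, candidates=CANDIDATES):
    """Return the path of the first candidate script found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


print(find_docutils_html())
```

The `which` argument exists only so the lookup can be swapped out when
testing; by default it is `shutil.which`.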

Skip


From skip at pobox.com  Sat Jan 27 03:47:11 2007
From: skip at pobox.com (skip at pobox.com)
Date: Fri, 26 Jan 2007 20:47:11 -0600
Subject: [spambayes-dev] Any MoinMoin experts here?
Message-ID: <17850.48431.913754.336466@montanaro.dyndns.org>


Is there anyone here with experience working with the MoinMoin code base?  I
think using SpamBayes to deflect spam instead of the current
BadContent/LocalBadContent approach would be useful.  I wrote a couple
messages to the moin-users mailing list, but received no responses.  (In
scanning the archive I don't see my message.  Must have disappeared in a
black hole.)  In case someone's interested, here's what I wrote in my second
post:

    We all know wikis get spammed.  I'm not up-to-speed on the latest
    versions of MoinMoin, but I think the concept used at least through the
    1.3 series (the use of BadContent and LocalBadContent pages) is
    fundamentally flawed since it relies on the users to manually update
    "bad" words.  You're always trying to catch up with the spammers.

    Instead, let me suggest that you incorporate a SpamBayes-based
    classifier into MoinMoin.  I did this recently for a couple other
    websites I manage (Mojam and Musi-Cal - not wikis).  It worked
    marvelously there.  I now reject 100% of the spam submissions and also
    catch submission mistakes by good users that I would never have caught
    before.

    Here's how I envision it working.  Whenever a form submission happens
    the new page is scored against the current SpamBayes database.  If it
    scores as possible or probable spam, it is automatically reverted back
    to the last revision that scores as okay, and the full URL for that
    revision is mailed to all people in AdminGroup.  An admin reviews that
    URL.  If it's okay, the URL is added to the HamPages page.  If not, it's
    added to the SpamPages page (both suitably protected for AdminGroup
    write only and not themselves checked by SpamBayes).  Whenever those
    pages are saved the entire database is retrained from scratch.  This
    should not generally be a problem, as there will probably only be a few
    pages in the database, so retraining should be quick.  It should also be
    a relatively rare occurrence.  If the suspect page was actually ham,
    after retraining, score it again.  It should score as ham now.  If so,
    just revert to it.  If not, add it to the HamPages page a second time.
    I'm not entirely sure how to handle new pages which are spam, but I
    think you should be able to automatically DeletePage them, then revive
    them later if they turn out to be good.

    This all said, I can help from the SpamBayes side of things (write the
    tokenizer, suggest some synthetic tokens that might help improve the
    discrimination of ham and spam), but I'm not familiar with the MoinMoin
    code base, certainly not the latest versions.  It's unlikely that I
    could implement it quickly on that side of things.  If someone familiar
    with MoinMoin's code base would like to team up with me on this, let me
    know.  Together we should be able to knock this off very quickly.
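
The save-time check in that post could be sketched roughly as below.
This is a minimal sketch, not MoinMoin or SpamBayes code: every name
in it is hypothetical, the scoring function is passed in rather than
taken from a real classifier, and the two cutoffs are values chosen
for illustration.

```python
# Hypothetical sketch of the proposed workflow.
HAM_CUTOFF = 0.20   # assumed cutoff: below this a page counts as ham
SPAM_CUTOFF = 0.90  # assumed cutoff: above this a page counts as spam


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a spam probability in [0, 1] to 'ham', 'unsure' or 'spam'."""
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    return "unsure"


def handle_save(revisions, score_page, notify_admins):
    """Pick the revision that should be live after a save.

    revisions is a list of page texts, oldest first; the last entry is
    the text just submitted.  Returns the index of the revision to
    show, or None if no revision scores as ham (the caller would then
    delete the page).
    """
    newest = len(revisions) - 1
    if classify(score_page(revisions[newest])) == "ham":
        return newest
    # Possible or probable spam: tell the admins, then walk back to
    # the most recent revision that still scores as ham.
    notify_admins(newest)
    for index in range(newest - 1, -1, -1):
        if classify(score_page(revisions[index])) == "ham":
            return index
    return None


def toy_score(text):
    """Stand-in scorer for the demo; a real one would use the trained database."""
    return 0.99 if "cheap pills" in text.lower() else 0.01


flagged = []
print(handle_save(["Welcome page", "Buy cheap pills"], toy_score, flagged.append))
print(flagged)
```

Retraining from the HamPages and SpamPages lists is left out here; it
would replace `toy_score` with a scorer backed by the rebuilt database.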

Skip

From mhammond at skippinet.com.au  Mon Jan 29 03:30:49 2007
From: mhammond at skippinet.com.au (Mark Hammond)
Date: Mon, 29 Jan 2007 13:30:49 +1100
Subject: [spambayes-dev] Rebuild/reinstall website?
In-Reply-To: <17849.2113.677489.425017@montanaro.dyndns.org>
Message-ID: <05b601c7434d$7e0e2ef0$230a0a0a@enfoldsystems.local>

> I tried easy_install docutils but that didn't yield html.py.
> It looks like
> rst2html.py is the replacement, so I make that change and ran "make
> install".  I got some errors (files I couldn't upload), but most of it
> seemed to work okay.

I did find the need for the rst2html change, but as I mentioned in December,
I failed to build for all kinds of reasons I didn't dig into, and I can't
recall the exact errors I had.  I'd assumed they were more to do with
Windows/cygwin etc, so didn't dig deeper.

> If someone else can give it a try that would help boost my confidence
> that it isn't just me.

I didn't get any responses to that Dec 22 mail asking for a linux-type
person to give it a go, so I'd suspect that it *is* just you, but only
because you are the only one trying to do it ;)

Cheers,

Mark