[spambayes-dev] Choosing which image to OCR

Robert Mezzone rmezzone at pjsolomon.com
Wed Sep 6 09:54:07 CEST 2006

This probably doesn't help you much, but as an FYI:

We run Trend ScanMail 7.0 on our Exchange server, and until a few months ago it kept our mailboxes spam free. Being a long-time user of spambayes, even I was surprised how well their spam filter worked. Then we started seeing a lot of spam, mainly the image-based stuff, get past their filter. I called them and they told me they were developing a new scan engine. I haven't checked the logs to see when it was updated, but they figured out a way to detect this junk, because our mailboxes are once again spam free.

-----Original Message-----
From: spambayes-dev-bounces at python.org <spambayes-dev-bounces at python.org>
To: Mark Hammond <mhammond at skippinet.com.au>
CC: spambayes-dev at python.org <spambayes-dev at python.org>
Sent: Wed Sep 06 00:05:19 2006
Subject: Re: [spambayes-dev] Choosing which image to OCR

    >> I find it hard to believe it's in response to what we're doing here.
    >> I'm sure some other much bigger groups must be doing OCR analysis of
    >> image-based spam these days.

    Mark> Apparently SpamAssassin is developing OCR support.  I found some
    Mark> vaguely interesting info from this slashdot discussion:
    Mark> http://it.slashdot.org/article.pl?sid=06/09/04/1712233

Thanks for the pointer.  That led me to gocr and libgocr
(http://jocr.sf.net/).  That seems to be what the SA folks are using for
OCR.  At first blush, gocr seems to be about as good (or as bad) as ocrad,
though the text it generates appears to be bad in some dimension orthogonal
to ocrad's badness.  I haven't looked at libgocr yet.
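
For anyone who wants to repeat the gocr/ocrad comparison above, here is a rough sketch, not anything taken from the spambayes code. It assumes both command-line tools are installed and that the image has already been converted to PNM. The helper names (run_ocr, token_overlap, compare_engines) are made up for illustration, and word-set overlap is only one crude way to see how differently the two engines garble the same image.

```python
import shutil
import subprocess


def run_ocr(command, image_path):
    """Run an OCR command-line tool on a PNM image and return its stdout text.

    Returns None when the tool is not installed (hypothetical helper).
    """
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command + [image_path], capture_output=True,
                            text=True, errors="replace", check=False)
    return result.stdout


def token_overlap(text_a, text_b):
    """Jaccard overlap of the lowercased word sets of two OCR outputs."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty outputs are trivially identical
    return len(words_a & words_b) / len(words_a | words_b)


def compare_engines(image_path):
    """Return (gocr_text, ocrad_text, overlap), or None if a tool is missing."""
    gocr_text = run_ocr(["gocr", "-i"], image_path)   # gocr -i <file.pnm>
    ocrad_text = run_ocr(["ocrad"], image_path)       # ocrad <file.pnm>
    if gocr_text is None or ocrad_text is None:
        return None
    return gocr_text, ocrad_text, token_overlap(gocr_text, ocrad_text)
```

A low overlap on the same image would suggest the two engines make different mistakes, in which case feeding tokens from both to the classifier might be worth trying.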


spambayes-dev mailing list
spambayes-dev at python.org