[spambayes-dev] Need slightly better logic for blinking gifs

skip at pobox.com
Thu Aug 31 19:14:10 CEST 2006


    Kenny> Could we extract a list of text tokens from each frame
    Kenny> separately, and then choose the token list that has the most
    Kenny> tokens in it?

In theory, yes, though that would require running ocrad on each frame, which
may be only a partial image (that could get expensive), and it would require
some code restructuring. At the moment, the images come in one of three forms:

    * a single non-blinking image

    * a set of images, non-blinking, which, when assembled, make a single
      larger image

    * a single blinking image

Right now, I assume there might be multiple parts to the image, so I convert
each part from its source format to PIL's internal format, concatenate the
parts, then run ocrad on the combined image.
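
As a rough sketch of that concatenation step (this is not the actual
SpamBayes code): the function names are made up for illustration, stacking
the parts top to bottom is an assumption about how they fit together, and
the modern "from PIL import Image" spelling from Pillow is assumed.

```python
def stacked_layout(sizes):
    """Return ((width, height), offsets) for stacking images top to bottom.

    sizes is a list of (width, height) pairs, one per image part.
    Assumption: parts are stacked vertically, left-aligned.
    """
    width = max((w for w, _ in sizes), default=0)
    offsets = []
    y = 0
    for _, h in sizes:
        offsets.append((0, y))  # each part starts where the previous one ended
        y += h
    return (width, y), offsets


def concatenate(images):
    """Paste already-opened PIL images onto one canvas (hypothetical helper)."""
    from PIL import Image  # assumption: Pillow is installed

    (width, height), offsets = stacked_layout([im.size for im in images])
    canvas = Image.new("RGB", (width, height), "white")
    for im, offset in zip(images, offsets):
        # convert() normalizes palette/grayscale parts to a common mode
        canvas.paste(im.convert("RGB"), offset)
    return canvas
```

The combined canvas would then be written out and handed to ocrad once,
rather than running the OCR on every part.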

I imagine it's not going to be long before the spammers start splitting up
their blinking images into parts.
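
For what it's worth, Kenny's idea of keeping the frame with the most tokens
might look roughly like the sketch below. It is only an illustration: the
names are hypothetical, the OCR step is passed in as a callable (in practice
it would wrap ocrad) instead of being spelled out, and splitting on
whitespace stands in for the real tokenizer.

```python
def best_token_list(frames, ocr):
    """Return the longest token list found in any single frame.

    frames: an iterable of frame objects (for example PIL images).
    ocr:    a callable mapping one frame to the text recognized in it
            (hypothetical; it would wrap an OCR program such as ocrad).
    """
    best = []
    for frame in frames:
        # Whitespace splitting is a stand-in for the real tokenizer.
        tokens = ocr(frame).lower().split()
        if len(tokens) > len(best):
            best = tokens
    return best
```

With PIL, the frames of an animated GIF could probably be supplied by
ImageSequence.Iterator(image). The cost is one OCR run per frame, which is
the expense mentioned above.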

Skip

