Find similar images using python

nikie n.estner at gmx.de
Wed Mar 29 12:15:34 EST 2006


> How can I use python to find images that look quite similar? Thought
> I'd scale the images down to 32x32 and convert them to use a standard
> palette of 256 colors then compare the results pixel for pixel etc, but
> it seems as if this would take a very long time to do when processing
> lots of images.
>
> Any hint/clue on this subject would be appreciated.

A company I used to work for has been doing research in this area
(finding differences between images) for years, and the results are
still hardly generalizable, so don't expect to get perfect results after
a weekend ;-)

I'm not sure what you mean by "similar": I assume for the moment that
you want to detect if you really have the same photo, but scanned with
a different resolution, or with a different scanner or with a digital
camera that's slightly out of focus. This is still hard enough!

There are many approaches to this problem. Downsampling the images might
work (use supersampling, i.e. average several source pixels into each
output pixel!), but it doesn't cover rotations, different borders or
blur, so you'll have to put some additional effort into the comparison
algorithm. Also, converting the images to a paletted format is almost
definitely the wrong way, because palette indices that are numerically
close need not be similar colors - convert them to greyscale, or work
on 24 bit (RGB or HSV).
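
To make the downsampling idea concrete, here is a rough sketch in plain
Python. It assumes the images are already loaded as rows of (r, g, b)
tuples (loading them with an imaging library such as PIL is left out),
and the function names are made up for illustration. It averages pixel
blocks instead of picking single pixels, then compares two thumbnails
by their mean absolute grey-level difference:

```python
def to_grey(rgb_rows):
    """Convert rows of (r, g, b) tuples to rows of grey values (0..255)."""
    # ITU-R BT.601 luma weights, one common choice for greyscale conversion.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]


def downsample(grey_rows, size):
    """Shrink a greyscale image to size x size by averaging pixel blocks."""
    height, width = len(grey_rows), len(grey_rows[0])
    small = []
    for ty in range(size):
        # Source rows covered by this target row (always at least one).
        y0 = ty * height // size
        y1 = max((ty + 1) * height // size, y0 + 1)
        row = []
        for tx in range(size):
            x0 = tx * width // size
            x1 = max((tx + 1) * width // size, x0 + 1)
            block = [grey_rows[y][x]
                     for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def mean_abs_diff(a, b):
    """Average absolute difference between two equal-sized thumbnails."""
    total = sum(abs(pa - pb)
                for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))
```

A small result from mean_abs_diff(downsample(g1, 32), downsample(g2, 32))
suggests the two images are candidates for being the same photo. Which
threshold counts as "small" is something you would have to find out by
experiment on your own images.
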
Another approach that you might try is comparing the image histograms:
they are largely unaffected by geometric transformations such as
rotation or mirroring, and should still contain some information about
the original image. Even if they aren't sufficient, they might help you
to narrow down your search, so you have more processing time for
advanced algorithms.
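
A histogram comparison could look roughly like the following sketch. It
assumes greyscale input as rows of values in 0..255. The bin count of 16
and the use of histogram intersection as the similarity measure are my
own choices for illustration, not the only options, and the function
names are hypothetical:

```python
def grey_histogram(grey_rows, bins=16):
    """Return a normalised histogram (fractions summing to 1)."""
    counts = [0] * bins
    n = 0
    for row in grey_rows:
        for value in row:
            # Map a grey value in 0..255 to a bin index in 0..bins-1.
            index = min(int(value) * bins // 256, bins - 1)
            counts[index] += 1
            n += 1
    return [c / n for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in 0..1; 1 means the two histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Because the histogram only counts grey levels and ignores where they
occur, a rotated copy of an image gives exactly the same histogram. For
the same reason two quite different images can also score high, so this
is better suited to pre-filtering than to a final decision.
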

If you have performance problems, NumPy and Psyco might both be worth a
look.
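
As an illustration of what NumPy buys you, the sketch below compares one
query thumbnail against a whole stack of thumbnails without any Python
loop over pixels. It assumes all thumbnails are greyscale and have the
same size. The function name and the choice of mean absolute difference
as the distance are assumptions made for this example:

```python
import numpy as np


def rank_by_similarity(query, thumbnails):
    """Return (order, distances): indices sorted from most to least
    similar to the query, and the distance of each thumbnail."""
    stack = np.asarray(thumbnails, dtype=np.float64)  # shape (n, h, w)
    q = np.asarray(query, dtype=np.float64)           # shape (h, w)
    # Broadcasting subtracts the query from every thumbnail at once.
    distances = np.abs(stack - q).mean(axis=(1, 2))
    return np.argsort(distances), distances
```

The first few entries of the returned order are the candidates worth
handing to a slower, more careful comparison.
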




More information about the Python-list mailing list