image matching algorithms

Yu-Xi Lim yuxi at ece.gatech.edu
Wed Mar 12 20:44:50 EDT 2008


Daniel Fetchinson wrote:
> The photos are just coming straight from my digital camera. Same
> format (JPEG), varying size (6-10 megapixel) and I would like to be
> able to pick one and then query the database for similar ones. For
> example: I pick a photo which is more or less a portrait of someone,
> the query should return other photos with more or less portraits. If I
> pick a landscape with lot of green and a mountain the query should
> result in other nature (mostly green) photos. Something along these
> lines, of course the matches won't be perfect because I'm looking for
> a simple algorithm, but something along these lines.
> 

Ah. In that case, SIFT isn't for you. SIFT would work well if you have
multiple photos of the same object. Say, a building from different
angles, or a vase against different backdrops.

If I'm understanding you correctly, what you're attempting here is very
general and well into highly experimental territory. I've been wishing
for such a feature to appear in something like Google Image Search
(pick/submit a photo and return similar images found on the web). I'm
sure that if there were a practical solution, Google (or MS) would be on
it already.

The problem is that there isn't really one. Despite what you may see
claimed in university press releases and research papers, the current
crop of algorithms doesn't work very well, at least according to my
understanding and discussions with researchers in this field. The glowing
results tend to come from tests done under ideal conditions, and there's
no real practical or commercial solution.

If you restrict the domain somewhat, there are some solutions, but none 
trivial. You are probably aware of the face searches available on Google 
and Live.

The histogram approach suggested by Shane Geiger may work for some cases
and in fact would work very well for resized copies of the same image. I
doubt it will work for the general case. A mountain with a grassy plain
at noon has quite a different histogram from one at sunset, and yet both
have related content. Manual tagging of the images, a la Flickr, would
probably be your best bet.
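
To make the histogram point concrete, here is a minimal sketch of one way
such a comparison could look. It is not Shane's code: the function names,
the choice of 4 bins per channel, and the use of histogram intersection as
the similarity measure are all my own assumptions, and the "photos" are
synthetic lists of (r, g, b) tuples standing in for pixel data you would
load with an imaging library.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram for an iterable of (r, g, b) tuples (0-255).

    Hypothetical helper; bins_per_channel=4 is an arbitrary choice.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    # Dividing by the pixel count makes the result independent of image size.
    return [h / count for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Synthetic stand-ins for real pixel data (assumed values, not measurements).
noon = [(40, 160, 40)] * 100        # grassy plain at noon: mostly green
noon_small = [(40, 160, 40)] * 25   # the same picture, resized
sunset = [(200, 90, 30)] * 100      # similar scene at sunset: mostly orange

same = histogram_intersection(color_histogram(noon), color_histogram(noon_small))
different = histogram_intersection(color_histogram(noon), color_histogram(sunset))
print(same, different)
```

The resized copy scores as a perfect match because the histograms are
normalized, while the noon and sunset versions of the "same" scene land in
different bins and score as unrelated, which is the failure mode described
above.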

Good luck.


