Lucene and indexing

Michal Wallace sabren at manifestation.com
Fri Oct 18 13:48:10 EDT 2002


On Fri, 18 Oct 2002, Mr P wrote:

> I am looking for a fast, "industrial strength", incremental indexer for 
> Python. I have looked at ransacker and indexer.py, but they don't cut the 
> mustard. Are there any others I should look at?

Ransacker is a little baby. :) 

Did you look at the version on SourceForge? If so, you saw
the old version. I wrote a new version a couple of weeks
ago. It's based on Metakit. It's MUCH faster, smaller, and
cleaner. There's an article about it here:

   http://cornerhost.com/lists/workshop-lite/2002-September/000008.html

It's only an indexer, though. You can query for just one
word at a time, and it does some primitive ranking, but
there's no boolean search or phrase search. (That article
explains how to build a boolean search, if anyone's
interested.)
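
For anyone wondering what "one word at a time, plus a boolean
search layered on top" can look like, here is a minimal sketch.
It is NOT Ransacker's code and does not use Metakit: the class
name TinyIndex, its methods, and the in-memory dict storage are
all made up for illustration, and the ranking (raw occurrence
count) is only an assumption about what "primitive ranking"
might mean. The boolean part uses plain set intersection and
union, which may or may not match the approach in the article.

```python
from collections import defaultdict


class TinyIndex:
    """Toy in-memory inverted index: word -> {doc_id: occurrence count}.

    Hypothetical illustration only; not Ransacker's API or storage.
    """

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        """Index one document by splitting its text on whitespace."""
        for word in text.lower().split():
            counts = self.postings[word]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, word):
        """Single-word query, ranked by occurrence count (ties by doc id)."""
        counts = self.postings.get(word.lower(), {})
        return sorted(counts, key=lambda doc: (-counts[doc], doc))

    def search_all(self, *words):
        """Boolean AND: intersect the single-word result sets."""
        sets = [set(self.search(w)) for w in words]
        return set.intersection(*sets) if sets else set()

    def search_any(self, *words):
        """Boolean OR: union of the single-word result sets."""
        return set().union(*(self.search(w) for w in words))


idx = TinyIndex()
idx.add("a", "python search engine in python")
idx.add("b", "lucene search engine")

ranked = idx.search("search")                # both docs match once
both = idx.search_all("python", "search")    # only "a" has both words
either = idx.search_any("python", "lucene")  # "a" or "b"
```

Phrase search is the harder part: it needs word positions in
the index, which this sketch does not store.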


The code is here:

   http://cvs.sabren.com/sixthdev/cvsweb.cgi/ransacker/


Of course, Ransacker is not meant to be anywhere near as
sophisticated as Lucene. Ransacker goes for quick, easy,
and modular... Lucene was designed by a search engine pro
and is meant to be smart. :)


Cheers,

- Michal   http://www.sabren.net/   sabren at manifestation.com 
------------------------------------------------------------
Switch to Cornerhost!             http://www.cornerhost.com/
 Low Priced, Reliable Blog Hosting, With a Human Touch. :)
------------------------------------------------------------





More information about the Python-list mailing list