"Standard" Full Text Search Engine

Diez B. Roggisch deets at nospam.web.de
Fri Oct 26 08:53:59 EDT 2007


Martin Marcher wrote:

> Hello,
> 
> is there something like a standard full text search engine?
> 
> I'm thinking of the Python equivalent of what Lucene is for Java or
> Ferret is for Rails. Preferably something that isn't exactly a clone of
> one of those, but rather something Python-friendly in terms of the API it
> provides.
> 
> Things I'd like to have:
> 
>  * different languages are supported (it seems most FTS engines handle only English)
>  * I'd like to be able to provide an identifier (if I index files in
> the filesystem that would be the filename, or an ID if it lives in a
> database, or whatever applies)
>  * I'd like to pass it just some (user-defined) keywords along with
> the actual content (as a string, a list of strings, or whatever) and to
> retrieve the results by searching by keyword
>  * something like a priority should be assignable to different fields
> (like field: title(priority=10, content="My Draft"),
> keywords(priority=50, list_of_keywords))
> 
> Unnecessary:
> 
>  * built-in parsing of different files
> 
> The "standard" I'm referring to would be something with a large and
> active user base. Just as WSGI is _the_ thing to refer to when doing
> webapps, $FTS-Engine should be _the_ engine to refer to for full text
> search.
> 
> any hints?

There are several Python Lucene implementations available, and a project
called NUCULAR recently turned up on this list. There is also ZCatalog, the
full-text indexing technology used in Zope, which should be usable
outside of Zope as well.

But "the" search technology doesn't exist. Personally, I would most probably
go for the Lucene-based stuff, because with it you may also be able to use
auxiliary tools written in Java.
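
To make the wish list above concrete, here is a minimal sketch of the kind
of API being asked for: documents are added under a caller-supplied
identifier, each field carries a priority, and a search by keyword returns
identifiers ranked by the summed priorities. It is a toy in-memory inverted
index only. The names TinyIndex, add and search are hypothetical and are not
the API of Lucene, NUCULAR or ZCatalog, and the tokenizer is a plain
Unicode-aware word split with no stemming or other language-specific
handling.

```python
from collections import defaultdict
import re


class TinyIndex:
    """Toy in-memory inverted index; every name here is hypothetical."""

    def __init__(self):
        # term -> {doc_id: accumulated priority weight}
        self._postings = defaultdict(lambda: defaultdict(float))

    def add(self, doc_id, fields):
        """Index a document.

        fields maps a field name to a (priority, content) pair, where
        content is either a string or a list of strings.
        """
        for priority, content in fields.values():
            if isinstance(content, str):
                content = [content]
            for chunk in content:
                # \w is Unicode-aware in Python 3, so non-English letters
                # are kept, but no stemming is done.
                for term in re.findall(r"\w+", chunk.lower()):
                    self._postings[term][doc_id] += priority

    def search(self, query):
        """Return ids of documents matching any query term, best first."""
        scores = defaultdict(float)
        for term in re.findall(r"\w+", query.lower()):
            # .get avoids creating empty entries for unknown terms.
            for doc_id, weight in self._postings.get(term, {}).items():
                scores[doc_id] += weight
        # Highest score first; ties broken by identifier for stable output.
        return sorted(scores, key=lambda d: (-scores[d], str(d)))


index = TinyIndex()
index.add("draft.txt", {
    "title": (10, "My Draft"),
    "keywords": (50, ["search", "draft"]),
})
index.add("notes.txt", {
    "title": (10, "Search notes"),
})
hits = index.search("search")  # draft.txt ranks above notes.txt
```

A real engine would add persistence, phrase queries and proper relevance
ranking on top of this; the sketch only shows that the identifier, field and
priority requirements fit a small, Python-friendly interface.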

Diez


