Fwd: NUCULAR fielded text searchable indexing

Paul Rubin http
Thu Oct 11 12:32:15 EDT 2007


aaron.watters at gmail.com writes:
> > ...but it looks a little more akin to Solr than to Lucene. ...
> 
> I'm not sure but I think nucular has aspects of both since
> it implements both the search engine itself and also provides
> XML and HTTP interfaces

That sounds reasonable. 

> As a test I built an index with 10's of millions of entries
> using nucular and most queries through CGI processes clocked
> in in 100's of milliseconds or better -- which is quite acceptable,
> for many purposes.

How many items did each query return?  By large result sets, I mean
queries that often return 10k items or more (still a pretty small
number: typing "python" into Google gets almost 30 million hits),
where you need to actually examine each item rather than display ten
at a time (e.g. because you want to present faceted results).
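
To illustrate why faceting forces a full pass over the result set, here
is a minimal sketch. It does not use nucular's API; the results are
assumed to be plain dicts, and the names facet_counts, "lang" and
"year" are made up for the example.

```python
from collections import Counter

def facet_counts(results, fields):
    """Count the values of each facet field across a whole result set."""
    counts = {field: Counter() for field in fields}
    for item in results:  # every item is examined, not just the first page
        for field in fields:
            value = item.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts

# Hypothetical result set of 10,000 items (assumption: items are dicts).
results = [
    {"lang": "python" if i % 2 == 0 else "java", "year": 2000 + i % 8}
    for i in range(10000)
]
counts = facet_counts(results, ["lang", "year"])
```

The cost grows with the number of hits rather than with the page size,
which is why a 10k-item query behaves differently from one that only
shows the first ten.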

> > So we're back to the perennial topic of parallelism in Python...
> 
> ...Which is not such a big problem if you rely on disk caching
> to provide the RAM access and use multiple processes to access
> the indices.

Right.  Another helpful strategy might be to use a solid state disk:

  http://www.newegg.com/Product/Product.aspx?Item=N82E16820147021
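
The multi-process approach quoted above might look roughly like the
sketch below. It is not nucular code: the index is assumed to be split
into tab-separated text files ("doc_id<TAB>text" per line), and
search_shard and search_all are hypothetical names. Each worker process
reads its own file, so repeated reads can be served from the operating
system's disk cache instead of from memory held by one Python process.

```python
from concurrent.futures import ProcessPoolExecutor

def search_shard(args):
    """Scan one index file for a term; intended to run in a worker process."""
    path, term = args
    hits = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Assumed line format: "<doc_id>\t<text>"
            doc_id, _, text = line.rstrip("\n").partition("\t")
            if term in text.split():
                hits.append(doc_id)
    return hits

def search_all(paths, term, executor_cls=ProcessPoolExecutor, workers=4):
    """Search every shard in parallel and merge the hits in shard order."""
    with executor_cls(max_workers=workers) as pool:
        per_shard = pool.map(search_shard, [(p, term) for p in paths])
        return [doc_id for hits in per_shard for doc_id in hits]
```

When calling search_all from a script, put the call under an
`if __name__ == "__main__":` guard so that worker processes can be
started safely on every platform.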


