NUCULAR fielded text searchable indexing

aaron.watters at gmail.com
Wed Oct 10 09:45:13 EDT 2007


On Oct 10, 6:05 am, Paul Boddie <p... at boddie.org.uk> wrote:
> On 9 Oct, 22:32, aaron.watt... at gmail.com wrote:
> ...tell us how the [ http://nucular.sourceforge.net ]
> software compares to stuff like Lucene or Xapian...

I wish I could, honestly.  I've looked briefly into trying
to put together some sort of comparisons, but I find the
documentation for both the systems mentioned quite forbidding.
I certainly don't want to spend as much time developing
comparisons with other projects as I did developing
nucular :).

For the moment I will make the completely unbiased
suggestion that nucular indices may be a lot easier
to set up and use than either Lucene or Xapian, particularly
from a Python programming perspective.

I'm also not sure whether Xapian and Lucene support
completely unrestricted numbers and combinations
of fields.

I'll see if I can come up with something better than
that...

As a side note, if you do benchmarks, please don't use
the Lucene benchmark query taken from

http://lucene.apache.org/java/docs/benchmarks.html

namely,

Query: +Domain:sos +(+((Name:goo*^2.0 Name:plan*^2.0)
(Teaser:goo* Teaser:plan*) (Details:goo* Details:plan*)) -Cancel:y)
+DisplayStartDate:[mkwsw2jk0 -mq3dj1uq0] +EndDate:[mq3dj1uq0-ntlxuggw0]

because I expect nucular will perform very poorly on
this query (since it can't even implement it).  I put
a set of query features that I thought was
suitable for most purposes into nucular.  Others may appear
in later releases, but the ones that are there cover the
most common needs, I think.  I would prefer benchmarks
that compare simple common examples, not obscure
complicated ones.

  -- Aaron Watters

===
if you want a friend, get a dog.
   -- Truman




