ANN: NUCULAR B3 Full text indexing (now on Win32 too)

Aaron Watters aaron.watters at gmail.com
Mon Feb 25 10:28:10 EST 2008


On Feb 22, 5:31 pm, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> Aaron Watters <aaron.watt... at gmail.com> writes:
> > 3) ...I was thinking
> > of adding an optional feature to Nucular which would allow
> > a look-up like "given a word find all attributes that contain
> > that word anywhere and give a count of the number of times it
> > is found in that attribute as well as the entry id for an example
> > instance (arbitrarily chosen)."  I was thinking about calling
> > this "inverted faceting"....
>
> In Solr this is called the DisMax (disjunction maximum) handler,

I can't find much documentation on DisMax, but I don't think it is
what I had in mind.  In fact, I think Nucular already supports
"disjunction maximum".

I was thinking of a situation that would
support interactions like this (quickly and cheaply):

   User:  I'm thinking of "Denver"
   System:  I see the value "Denver" in the following contexts:
       City: Denver [100000 entries]
         (for example in "Colorado Trombone Players Association")
       Surname: Denver [100000 entries]
         (for example "Denver, John, songwriter")
       Title: Denver [1000 entries]
         (for example in "Stuck in Denver Again, by Albert Smiley")
       ... and also some other contexts
       Which do you mean?
   User: I'm actually looking for the surname...

In other words, you don't get the "documents" containing
the search term(s); you get statistics on how many documents
contain each search term in each context.
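
To make the idea concrete, here is a minimal sketch in plain Python of
the lookup I have in mind.  It is not Nucular code and does not use
Nucular's API: the class InvertedFacetIndex, its add and lookup methods,
and the entry ids are all hypothetical names made up for illustration.
It counts entries per attribute (as in the dialog above), not raw word
occurrences, and the "example" is simply the first entry seen.

```python
import re
from collections import defaultdict


class InvertedFacetIndex:
    """Toy word -> attribute statistics index (hypothetical, not Nucular's API)."""

    def __init__(self):
        # word -> attribute name -> [entry count, example entry id]
        self._stats = defaultdict(dict)

    def add(self, entry_id, attributes):
        """Index one entry given as a dict of attribute name -> text value."""
        for attribute, text in attributes.items():
            # set() so an entry counts once per attribute, however often
            # the word repeats inside that attribute's text.
            for word in set(re.findall(r"\w+", text.lower())):
                slot = self._stats[word].setdefault(attribute, [0, entry_id])
                slot[0] += 1

    def lookup(self, word):
        """Return {attribute: (entry count, example entry id)} for a word."""
        found = self._stats.get(word.lower(), {})
        return {attr: (count, example) for attr, (count, example) in found.items()}


# Made-up entries echoing the "Denver" dialog above.
index = InvertedFacetIndex()
index.add("e1", {"City": "Denver", "Name": "Colorado Trombone Players Association"})
index.add("e2", {"Surname": "Denver", "Name": "Denver, John, songwriter"})
index.add("e3", {"Title": "Stuck in Denver Again"})

for attribute, (count, example) in sorted(index.lookup("Denver").items()):
    print(f"{attribute}: Denver [{count} entries] (for example {example})")
```

The point of keeping the statistics per (word, attribute) pair is that
the lookup never has to touch the entries themselves, which is what
would make it quick and cheap.  A real implementation would keep this
table on disk alongside the other indices rather than in a dict.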

I'm pretty sure there must be a standard name for this kind
of thing, anybody?  Thanks!
   -- Aaron Watters

===
http://www.xfeedme.com/nucular/pydistro.py/go?FREETEXT=hackery
http://nucular.sourceforge.net/


