Storing pairs of (int, int) in a database: which db to choose?

Jp Calderone exarkun at intarweb.us
Tue Dec 23 17:54:13 EST 2003


On Tue, Dec 23, 2003 at 12:54:39PM -0800, Stormbringer wrote:
> Jp Calderone <exarkun at intarweb.us> wrote in message news:<mailman.75.1072189772.684.python-list at python.org>...
> > On Tue, Dec 23, 2003 at 04:35:50AM -0800, Stormbringer wrote:
> > > Hi,
> > > 
> > > I want to implement a fulltext search for messages in a forum. More
> > > exactly for each message I store pairs (wordId, msgId) for each
> > > identified word and when I search something I want to be able to
> > > retrieve very quickly all msgId for a given wordId.
> > > 
> > 
> >   A pure Python fulltext indexer - http://divmod.org/Lupy/index.html
> 
> Thanks! This is exactly what I needed, and the size of the indexes is
> around 30%, much much less than what I could have achieved with my
> code. Not to mention the fact that I get phrase search and some other
> goodies :)
> 
> The only thing that bothers me a little is the speed of building the
> index. I tried with around 5000 messages and I am not quite thrilled:
> it's not _extremely_ slow, but it has to be faster for what I need.
> Perhaps I'll use the C++ version with some Python bindings.
> 

  Yeah, I hear that.  Work is being done on speeding it up (pretty much the
only development on it now is optimization).  I don't know how it will end
up, but things look promising so far.  On the other hand, if you don't want
to wait for that to be finished...
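
  For comparison, the plain (wordId, msgId) storage from the original
question could be sketched with SQLite roughly as below.  This is only an
illustration under assumptions: the table name `postings`, the helper
function names, and the choice of the standard-library sqlite3 module are
all hypothetical, and none of it reflects how Lupy stores its index.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a hypothetical postings table of (word_id, msg_id)."""
    conn = sqlite3.connect(path)
    # The composite primary key gives SQLite an index whose leading
    # column is word_id, so lookups by word_id do not scan the table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        " word_id INTEGER NOT NULL,"
        " msg_id INTEGER NOT NULL,"
        " PRIMARY KEY (word_id, msg_id))"
    )
    return conn


def add_message(conn, msg_id, word_ids):
    """Record one (word_id, msg_id) pair per distinct word in a message."""
    conn.executemany(
        "INSERT OR IGNORE INTO postings (word_id, msg_id) VALUES (?, ?)",
        [(w, msg_id) for w in set(word_ids)],
    )
    conn.commit()


def messages_for_word(conn, word_id):
    """Return all msg_id values stored for the given word_id, sorted."""
    rows = conn.execute(
        "SELECT msg_id FROM postings WHERE word_id = ? ORDER BY msg_id",
        (word_id,),
    )
    return [row[0] for row in rows]


conn = open_index()
add_message(conn, 1, [10, 11, 10])   # word 10 appears twice, stored once
add_message(conn, 2, [11, 12])
matches = messages_for_word(conn, 11)
```

  Batching many messages into a single transaction (committing once at
the end rather than per message) would likely matter more for build speed
than anything else in a sketch like this.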

  Jp
