Fast full-text searching in Python (job for Whoosh?)

avi.e.gross at gmail.com avi.e.gross at gmail.com
Tue Mar 7 14:02:03 EST 2023


Some of the discussions here leave me confused, as the information we think
we got early on does not stay intact for long and often morphs into something
else, and we find much of the discussion was misdirected or wasted.

Wouldn't it have been nice if this discussion had not started with a mention
of a package/module few have heard of along with a vague request on how best
to search for lines that match something in a file?

I still do not know enough to feel comfortable even after all this time. It
now seems to be a web-based application in which a web page wants to use
autocompletion as the user types.

So is the web page a static file that the user loads, or is it dynamically
created by something like a python program? How is the fact that a user has
typed a letter in a textbox or drop-down of sorts reflected in a request
being sent to a python program to return possible choices? Is the same
process started anew each time, or does it, or perhaps a group of similar
processes or threads, stick around and get called repeatedly?

Lots of details are missing and in particular, much of what is being
described sounds like it is happening in the browser, presumably in
JavaScript. Also noted is that the first keystroke or two may return too
much data.

So does the OP still think this is a python question? So much of the
discussion sounds like it is in the browser deciding whether to wait for the
user to type more before making a request, or throwing away results of an
older request.
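
To make "throwing away results of an older request" concrete, here is a
minimal sketch of the bookkeeping involved. In a real page this logic would
presumably live in JavaScript; Python is used here only to show the idea, and
the class name LatestOnly and its methods are hypothetical, not part of any
library.

```python
import itertools


class LatestOnly:
    """Hand out increasing request ids and accept only the newest response."""

    def __init__(self):
        self._ids = itertools.count(1)  # source of request ids: 1, 2, 3, ...
        self._latest = 0                # id of the most recent request issued

    def start(self):
        """Call when a request is sent; returns the id to attach to it."""
        self._latest = next(self._ids)
        return self._latest

    def accept(self, request_id):
        """Call when a response arrives; True only for the newest request."""
        return request_id == self._latest


gate = LatestOnly()
first = gate.start()    # user typed "f", request sent
second = gate.start()   # user typed "fo" before the first answer came back
print(gate.accept(first), gate.accept(second))
```

A response whose id is not accepted is simply dropped, so a slow answer for
"f" can never overwrite the newer answer for "fo".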

So my guess is that a possible design for this amount of data may simply be
to read the file into the browser at startup, or when the first letter is
typed, and do all the searches internally, perhaps cascaded as long as
backspace or editing is not used.
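
A rough sketch of what I mean by "cascaded", assuming the data is a modest
list of text lines and that a match means a case-insensitive substring match
(the OP may want something else). The name CascadedSearch is made up for
illustration. As long as the new query just extends the previous one, only
the previous matches need to be scanned; any other edit, such as a backspace,
falls back to the full list.

```python
class CascadedSearch:
    """Narrow the candidates as the query grows; rescan everything otherwise."""

    def __init__(self, lines):
        self.lines = list(lines)    # full data set, loaded once
        self.query = ""             # last query searched, lowercased
        self.matches = self.lines   # matches for self.query

    def search(self, query):
        needle = query.lower()
        # If the new query extends the old one, every line containing the new
        # query also contains the old one, so the old matches are enough.
        if self.query and needle.startswith(self.query):
            pool = self.matches
        else:
            pool = self.lines       # backspace or edit: start over
        self.matches = [line for line in pool if needle in line.lower()]
        self.query = needle
        return self.matches


# Made-up sample data, for illustration only.
s = CascadedSearch(["Ford Focus", "Ford Fiesta", "Fiat 500", "Honda Civic"])
for typed in ("f", "fo", "foc", "fi"):
    print(typed, s.search(typed))
```

The same idea carries over to JavaScript in the browser more or less line for
line if that is where the searching ends up being done.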

If the data gets much larger, of course, then using a server makes sense,
although it need not use python unless lots more in the project is also ...

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of David Lowry-Duda
Sent: Tuesday, March 7, 2023 1:29 PM
To: python-list at python.org
Subject: Re: Fast full-text searching in Python (job for Whoosh?)

On 22:43 Sat 04 Mar 2023, Dino wrote:
>How can I implement this? A library called Whoosh seems very promising 
>(albeit it's so feature-rich that it's almost like shooting a fly with 
>a bazooka in my case), but I see two problems:
>
> 1) Whoosh is either abandoned or the project is a mess in terms of 
>community and support 
>(https://groups.google.com/g/whoosh/c/QM_P8cGi4v4 ) and
>
> 2) Whoosh seems to be a Python only thing, which is great for now, 
>but I wouldn't want this to become an obstacle should I need to port it to 
>a different language at some point.

As others have noted, it sounds like relatively straightforward 
implementations will be sufficient.

But I'll note that I use whoosh from time to time and I find it stable 
and pleasant to work with. It's true that development stopped, but it 
stopped in a very stable place. I don't recommend using whoosh here, but 
I would recommend experimenting with it more generally.

- DLD
-- 
https://mail.python.org/mailman/listinfo/python-list


