Fast full-text searching in Python (job for Whoosh?)

Dino dino at no.spam.ar
Mon Mar 6 21:55:37 EST 2023


On 3/4/2023 10:43 PM, Dino wrote:
> 
> I need fast text-search on a large (not huge, let's say 30k records 
> totally) list of items. Here's a sample of my raw data (a list of US 
> cars: model and make)

Gentlemen, thanks a ton to everyone who offered to help (and did help!). 
I loved the part where some tried to divine the true meaning of my words :)

What you guys wrote is correct: the grep-esque search is guaranteed to 
turn up a ton of false positives, but for the autofill use-case, that's 
actually OK. Users will quickly figure out what is not relevant and skip 
those entries, just to zero in on the suggestion that they find relevant.
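
For illustration only, here is a minimal sketch of what a grep-esque 
search for autofill might look like in Python. It is not the code from 
this thread: the function name `suggest`, the `limit` parameter, and the 
sample records are all assumptions made up for the example.

```python
def suggest(query, records, limit=10):
    """Return up to `limit` records that contain `query` as a substring.

    Matching is case-insensitive and deliberately naive, so it will
    return false positives, much like grep would.
    """
    needle = query.strip().casefold()
    if not needle:
        return []  # nothing typed yet, nothing to suggest
    matches = []
    for record in records:
        if needle in record.casefold():
            matches.append(record)
            if len(matches) == limit:
                break  # enough suggestions for a dropdown
    return matches


# Hypothetical sample data, not the data file from the original post.
cars = ["Acura Integra", "Ford Mustang", "Ford Focus", "Honda Accord"]

print(suggest("ford", cars))
print(suggest("ac", cars))
```

The second call matches both "Acura Integra" and "Honda Accord", which is 
the kind of loose match that is tolerable in a suggestion list. A linear 
scan like this over roughly 30k short strings is usually fast enough, but 
that is an expectation to measure, not a guarantee.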

One issue that was also correctly foreseen by some is that there's going 
to be a new request at every user keystroke. Known problem. JavaScript 
programmers use a trick called "debouncing" to be reasonably sure that 
the user is done typing before a request is issued:

https://schier.co/blog/wait-for-user-to-stop-typing-using-javascript
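
The linked article covers the browser-side JavaScript version. Purely to 
show the idea in Python, the sketch below restarts a timer on every call 
and only fires once the calls stop. The class name `Debouncer`, the 
50 ms wait, and the simulated keystrokes are assumptions for the example, 
not anything used in the actual application.

```python
import threading
import time


class Debouncer:
    """Call `callback` only after `wait` seconds pass with no new calls."""

    def __init__(self, callback, wait):
        self._callback = callback
        self._wait = wait
        self._timer = None
        self._lock = threading.Lock()

    def __call__(self, *args):
        with self._lock:
            if self._timer is not None:
                # A newer keystroke supersedes the pending request.
                self._timer.cancel()
            self._timer = threading.Timer(self._wait, self._callback, args)
            self._timer.daemon = True
            self._timer.start()


issued = []  # stands in for "requests actually sent to the server"
on_keystroke = Debouncer(issued.append, wait=0.05)

# Simulate a user typing quickly: four keystrokes in a row.
for partial in ("f", "fo", "for", "ford"):
    on_keystroke(partial)

time.sleep(0.3)  # give the last timer time to fire
print(issued)    # ['ford']
```

Only the final value gets through, because each earlier timer is 
cancelled before its wait expires. The trade-off is a small added delay 
before the first suggestion appears.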

I was able to apply that successfully and I am now very pleased with the 
final result.

Apologies for posting a 1400-line data file. Seeing that certain 
newsgroups carry gigabytes of copyright-infringing material must have 
conveyed the wrong impression to me.

Thank you.

Dino


