Fast full-text searching in Python (job for Whoosh?)

Dino dino at no.spam.ar
Sat Mar 4 23:12:28 EST 2023


On 3/4/2023 10:43 PM, Dino wrote:
> 
> I need fast text search on a large (not huge, let's say 30k records 
> in total) list of items. Here's a sample of my raw data (a list of US 
> cars: model and make)

I suspect I am really close to answering my own question...

 >>> import time
 >>> lis = [str(a**2+a*3+a) for a in range(0,30000)]
 >>> s = time.process_time_ns(); res = [el for el in lis if "13467" in el]; print(time.process_time_ns() - s)
753800
 >>> s = time.process_time_ns(); res = [el for el in lis if "52356" in el]; print(time.process_time_ns() - s)
1068300
 >>> s = time.process_time_ns(); res = [el for el in lis if "5256" in el]; print(time.process_time_ns() - s)
862000
 >>> s = time.process_time_ns(); res = [el for el in lis if "6" in el]; print(time.process_time_ns() - s)
1447300
 >>> s = time.process_time_ns(); res = [el for el in lis if "1" in el]; print(time.process_time_ns() - s)
1511100
 >>> s = time.process_time_ns(); res = [el for el in lis if "13467" in el]; print(time.process_time_ns() - s); print(len(res), res[:10])
926900
2 ['134676021', '313467021']
 >>>

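I can do a substring search over a list of 30k elements in less than 2 ms 
with plain Python (the timings above are in nanoseconds, so roughly 0.75 
to 1.5 ms of CPU time per scan). At that speed a dedicated search index 
looks unnecessary for this data size. Is my reasoning sound?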
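
Single process_time_ns() readings like the ones above can be noisy, so here 
is a minimal sketch of the same measurement using the standard timeit 
module. It repeats each search several times and reports the best average. 
The names records, search and best_time_ms are made up for illustration, and 
the repeat/number values are arbitrary; the data is the same synthetic list 
as in the session, not real car names, so treat the numbers as a rough 
guide only.

```python
import timeit

# Same synthetic data as in the session above: 30k numeric strings.
records = [str(a**2 + a*3 + a) for a in range(30000)]


def search(needle, haystack=records):
    """Linear scan: return every record that contains needle as a substring."""
    return [el for el in haystack if needle in el]


def best_time_ms(needle, repeat=5, number=20):
    """Best of `repeat` runs; each run averages `number` searches (in ms)."""
    timer = timeit.Timer(lambda: search(needle))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1000


for needle in ("13467", "52356", "5256", "6", "1"):
    hits = search(needle)
    print(f"{needle!r}: {len(hits)} hits, {best_time_ms(needle):.3f} ms per search")
```

Note that timeit measures wall-clock time by default, while the session 
above measured CPU time, so the two sets of figures are not directly 
comparable. Real records with mixed case would probably also need something 
like a lowercased copy of the list to search against, which this sketch 
does not cover.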

Dino



