Replacing words from strings except 'and' / 'or' / 'and not'

Peter Otten __peter__ at web.de
Fri Nov 26 04:15:37 EST 2004


Peter Maas wrote:

> Diez B. Roggisch wrote:
>> import sets
>> KEYWORDS = sets.Set(['and', 'or', 'not'])
>> 
>> query = "test and testing and not perl or testit or example"
>> 
>> def decorate(w):
>>     if w in KEYWORDS:
>>         return w
>>     return "*%s*" % w
>> 
>> query = " ".join([decorate(w.strip()) for w in query.split()])
> 
> Is there a reason to use sets here? I think lists will do as well.

Sets represent the concept better, and membership tests on a large list will
significantly slow down the code (linear vs. constant time per lookup).
Unfortunately, as 2.3's Set is implemented in Python, you'll have to wait
for the 2.4 set builtin to see the effect for small lists/sets. In the
meantime, from a performance point of view, a dictionary fares best:

$cat contains.py
from sets import Set

# we need more items than in KEYWORDS above for Set
# to even meet the performance of list :-(
alist = dir([]) 
aset = Set(alist)
adict = dict.fromkeys(alist)

$timeit.py -s"from contains import alist, aset, adict" "'not' in alist"
100000 loops, best of 3: 2.21 usec per loop
$timeit.py -s"from contains import alist, aset, adict" "'not' in aset"
100000 loops, best of 3: 2.2 usec per loop
$timeit.py -s"from contains import alist, aset, adict" "'not' in adict"
1000000 loops, best of 3: 0.337 usec per loop
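
For anyone repeating the comparison on a Python where set is a builtin (2.4
and later), here is a minimal sketch that uses the timeit module from inside
a script. It is not the benchmark above: the helper name time_membership and
its parameters are made up for illustration, and the lambda adds call
overhead to every measurement, so only the relative ordering is meaningful.
Note that 'not' is not an attribute name of a list, so the list test always
scans the whole list.

```python
import timeit

# Same test data as contains.py: the attribute names of a list.
alist = dir([])
aset = set(alist)              # builtin set, stands in for sets.Set
adict = dict.fromkeys(alist)   # keys only; every value is None


def time_membership(container, needle="not", number=100000):
    """Best-of-3 time in microseconds for one `needle in container` test.

    Hypothetical helper, not part of timeit itself.
    """
    timer = timeit.Timer(lambda: needle in container)
    best = min(timer.repeat(repeat=3, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    for name, container in [("list", alist), ("set", aset), ("dict", adict)]:
        print("%-4s %.3f usec per loop" % (name, time_membership(container)))
```

The absolute figures depend on the machine and the Python version, so no
expected output is given here.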

Peter



