[Tutor] terminology

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Tue Dec 6 08:16:39 CET 2005


> > suppose you have a list of words and you want to unambiguously identify
> > each word in the list with the shortest number of characters.
> > for instance a list like : kill, kiss, take
> > i would want to get take just by typing t.
> > but you would have to type kil or kis to get kill or kiss.
> > where should i start googling? is this a matter of sorting?
> > i was thinking of trees for some reason.
>
> Look for tries :-) http://en.wikipedia.org/wiki/Trie or
> http://www.nist.gov/dads/HTML/trie.html
>
> I think there's a couple of good python implementations around that you
> could google for..

Hi David,

Alternatively, the 'bisect' module might be "good enough".

    http://www.python.org/doc/lib/module-bisect.html

bisect_left() will get us quickly to the right leftmost position (or the
leftmost right position?  *grin*), if we assuming the list of words is
sorted.

For example:

#######
>>> import bisect
>>> words = [x.strip() for x in open('/usr/share/dict/words').readlines()]
>>> words.sort()
>>> bisect.bisect_left(words, 'kiss')
114279
>>> words[114280]
'kissability'
>>> words[114281]
'kissable'
>>> words[114282]
'kissableness'
#######

The idea is that we can just continue marching across our sorted list of
words till we stop finding candidates, once we find our starting position.



More information about the Tutor mailing list