searching algorithm
Neil Cerutti
horpner at yahoo.com
Thu May 10 18:12:33 EDT 2007
On 2007-05-10, Gordon Airporte <JHoover at fbi.gov> wrote:
>
>> For the above (abrideged) dictionary, you would generate (use a
>> fixed-width "programmers" font so the tree looks good):
>>
>> a
>> |
>> b
>> |
>> s
>> / \
>> i o
>> / / \
>> n l r
>> / / \ \
>> t u v b->(absorbirati, crpisti)
>> / | |
>> (pelin)<-h t e->(odrije?iti, osloboditi)
>> | |
>> (pelin)<-e e->(apsolutan, apsolutni kod)
>>
>> As the user enter letters, you just march down the tree, printing
>> all the words held in leaf nodes held in the current node.
>
> Call me dense, but how does one do this in Python - which
> doesn't have pointers? Dictionaries with dictionaries within
> dictionaries... (with each letter as the key and the its
> children as values) is going to be extremely space inefficient,
> right?
Unfortunately, I don't know the space tradeoffs in Python
offhand. Lists and tuples make excellent trees.
The above might be stored as follows:
Every node is a tuple of its letter, a list of its children, and
a list of its words. So the two 'pelin' nodes would be (with 'e'
referenced in the 'h' node):
('h', [('e', [], ['pelin'])], ['pelin'])
That would in turn be "stored" in the t, n, i and s nodes.
('s',
[('i',
[('n',
[('t',
[('h', [('e', [], ['pelin'])], ['pelin'])
[])]
[])]
[]), ('o' trie (thanks Terry) omitted for my sanity)])
It's a lot harder to write by hand than it would be to use.
My intuition says it shouldn't be terribly hard on resources for
for a 180K dictionary, but I could be wrong. I'm too lazy to
measure. ;)
If it does turn out to be unreasonably huge, then you'd fall back
on solving the problem completely with a binary search returning
a range (I'm not sure of the name), which would be more expensive
at run time, but might be fast enough, and would use a minimal
amount of 'resources'.
--
Neil Cerutti
More information about the Python-list
mailing list