searching algorithm
Michael Bentley
michael at jedimindworks.com
Fri May 11 04:50:14 EDT 2007
On May 10, 2007, at 12:26 PM, Gigs_ wrote:
> Hi all!
>
> I have text file (english-croatian dictionary) with words in it in
> alphabetical
> order.
> This file contains 179999 words in this format:
> english word: croatian word
>
> I want to make instant search for my gui
> Instant search, i mean that my program search words and show words
> to user as
> user type letters.
> yes, it needs to be fast
>
> Can someone give me some points (approaches) how to make this
> Should i make indexes and go with file.seek
>
> or should breake dictionary in peaces and put words that start with
> a in one and
> with b in another...
> ?????
>
> So if i have this words
> ...
>
> if user type: "abs" program should list all words above in english
> and in croatian
> if user type: "absorb" than program should list last 3 words in
> english and in
> croatian
>
>
>
>
>
> any help would be appreciate!
> my apologies for bad english
Here's an idea: use a rats' nest of dictionaries and do all the
lookup work up front when you build the rats' nest. Maybe something
like this:
#! /usr/bin/env python
import pprint
dictionary = """absinth:pelin
absinthe:pelin
absolute:apsolutan
absolute:apsolutni kod
absolute:apsolutno
absolute:čist
absolute:nesumnjiv
absolute:potpun
absolute:savrsen
absolute coordinates:apsolutne koordinate
absolute frequency:apsolutna učestalost
absolute gap:apsolutni jaz
absolute line spacing:apsolutni međurazmak linija
absolute majority:apsolutna većina
absolute pointing device:apsolutni pokazivački uređaj
absolute quantity:apsolutni udio
absolute value:apsolutna vrijednost
absolute zero:apsolutna nula
absolutely:apsolutno
absolutely:bezuvjetno
absolutely:nezavisno
absolutely:potpuno
absolutely:samostalno
absolutely:sasvim
absolution:odrjesenje
absolution:oprostaj
absolutism:apsolutizam
absolve:odrijesiti
absolve:osloboditi
absorb:absorbirati
absorb:apsorbirati
absorb:crpsti"""
lookup = {'words':{}, 'letters':{}}
for translation in dictionary.split('\n'):
english, croatian = translation.split(':')
if english in lookup['words']:
lookup['words'][english].append(croatian)
else:
lookup['words'][english] = [croatian]
for position, letter in enumerate(english):
if position == 0:
youAreHere = lookup['letters']
if letter not in youAreHere:
youAreHere[letter] = {'words':[]}
youAreHere[letter]['words'].append(lookup['words'][english])
youAreHere = youAreHere[letter]
def tryit(partial):
youAreHere = lookup['letters']
for letter in partial:
youAreHere = youAreHere[letter]
return youAreHere['words']
if __name__ == '__main__':
pprint.pprint(tryit('abso'))
Hope this helps,
Michael
---
The Rules of Optimization are simple.
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
-- Michael A. Jackson , "Principles of
Program Design", 1975.
More information about the Python-list
mailing list