Find word by given characters

Tue Nov 3 21:33:09 EST 2020

On 03/11/2020 22:14, Avi Gross wrote:
> I, too, have wondered what exactly the point was of the required
> functionality for SCRABBLE but note you can extend a current word so
> additional letters may be available in a game but only if they are an exact
> fit to put before, after, or in middle of your word. 
> 
> But this seems to be a fairly simple problem to solve unless I misunderstand
> it. Elegance aside, what would be wrong with this approach.
> 
> - Read a word at a time in a loop from the file of dictionary words
> (non-Python meaning of dictionary.) For each one do the following, perhaps
> using a function:
> 
> Break the current word into a list of individual letters.
> Loop over the letters you want and:
> 	If the letter is in the list, remove it and continue
> 	Else skip the current word as it is not a match.
> 
> At the end of each of the above loops, you only reached here if all the
> letters were found and removed. If the list is now empty, fine. If it has
> extra remaining letters, also fine by the requirements stated. Letters in
> the list multiple times are removed multiple times.
> 
> The above need not use  list of letters and can be done many other ways but
> seems conceptually simple. Each word is processed once. It can be converted
> to using a list comprehension or something similar by using "all" and so on.
> 
> Or am I missing something about other approaches being better or more
> efficient or ... And, yes, counting may have an advantage as the list does
> not need to be modified repeatedly but creating an object or three also has
> overhead.

[snip]

The Counter approach only requires iterating over the letters once to
construct the letters bag, then each word once to create the relevant
word bag. After that it's (at worst) a couple of lookups and a
comparison for each unique character in letters (for each word).

Your approach requires iteration over the words to create the lists of
characters. Then they are (potentially) iterated over multiple times
looking for the characters in letters. There's also the cost of removing
items from arbitrary positions in the lists. Also, it seems that the
character frequencies must be equal, and your approach only ensures that
the words contain at least the required numbers of characters.

In computational terms, if the first approach is something like O(n+m)
for n letters and words of length m, your algorithm is more like O(nm).
Not to say that it will be slower for all possible letters and
dictionaries, but probably for any realistic cases and a lot slower for
large enough dictionaries.

Duncan