Dictionary project

Michael Spencer mahs at telcopartners.com
Sat Mar 11 19:54:24 EST 2006


> brandon.mcginty at gmail.com wrote:
...
>> I'm working on a project for school (it's not homework; just for fun).
>> For it, I need to make a list of words, starting with 1 character in length,
>> up to 15 or so.
>> It would look like:
>>
>> A B C d E F G ... Z Aa Ab Ac Ad Ae Aaa Aab Aac
...
>> If there is adaptable code on the internet that I've missed, please let me
>> know, and I'll go on my merry way in search of it.


I wrote:
> Something to adapt:
> 
> alphabet = "abcd"
> def allwords(maxlength = 4):
>      def wordgen(outer):
>          for partial in outer:
>              yield partial
>              for letter in alphabet:
>                  yield partial+letter
>      gen = alphabet
>      for length in range(maxlength-1):
>          gen = wordgen(gen)
>      return gen

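I thought that it would be fairly easy to modify this to the spec, but I was
wrong.  Each wordgen layer re-yields every partial it receives as well as that
partial's extensions, so a word such as 'aa' comes out more than once, and I
can't see a straightforward way to eliminate the duplicates.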
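
The duplication is easy to confirm by counting.  A quick sketch that repeats the generator quoted above unchanged; the Counter check and the names words, counts and repeated are additions for illustration only:

```python
from collections import Counter

alphabet = "abcd"

def allwords(maxlength=4):
    # Same generator as in the earlier post.
    def wordgen(outer):
        for partial in outer:
            yield partial
            for letter in alphabet:
                yield partial + letter
    gen = alphabet
    for length in range(maxlength - 1):
        gen = wordgen(gen)
    return gen

words = list(allwords())
counts = Counter(words)
# Total items yielded versus distinct words.
print(len(words), len(counts))  # 500 340
# Words that were yielded more than once.
repeated = sorted(w for w, n in counts.items() if n > 1)
print(repeated[:5])  # ['aa', 'aaa', 'aab', 'aac', 'aad']
```

Only the two- and three-letter words repeat here, because each of them is yielded once as an extension and again as a partial in a later layer.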

So here's a different approach, which I think does meet the spec:

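from itertools import tee

def allwords2(alphabet="abcd", maxlen=4):
    def wordgen():
        # Seed the sequence with the one-letter words.
        for char in alphabet:
            yield char
        # Read back the words already produced and extend each by one letter.
        for partial in allwordstee[1]:
            if len(partial) == maxlen:
                # Every word up to maxlen has been yielded.  A plain return
                # ends the generator; raising StopIteration here becomes a
                # RuntimeError under PEP 479 (Python 3.7 and later).
                return
            for char in alphabet:
                yield partial + char
    # tee creates two copies of the iterator:
    # one is returned to the caller,
    # the other is fed back to the generator
    allwordstee = tee(wordgen())
    return allwordstee[0]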

 >>> list(allwords2())
['a', 'b', 'c', 'd', 'aa', 'ab', 'ac', 'ad', 'ba', 'bb', 'bc', 'bd', 'ca', 'cb', 
'cc', 'cd', 'da', 'db', 'dc', 'dd', 'aaa', 'aab', 'aac', 'aad', 'aba', 'abb', 
'abc', 'abd', 'aca', 'acb', 'acc', 'acd', 'ada', 'adb', 'adc', 'add', 'baa', 
'bab', 'bac', 'bad', 'bba', 'bbb', 'bbc', 'bbd', 'bca', 'bcb', 'bcc', 'bcd', 
'bda', 'bdb', 'bdc', 'bdd', 'caa', 'cab', 'cac', 'cad', 'cba', 'cbb', 'cbc', 
'cbd', 'cca', 'ccb', 'ccc', 'ccd', 'cda', 'cdb', 'cdc', 'cdd', 'daa', 'dab', 
'dac', 'dad', 'dba', 'dbb', 'dbc', 'dbd', 'dca', 'dcb', 'dcc', 'dcd', 'dda', 
'ddb', 'ddc', 'ddd', 'aaaa', 'aaab', 'aaac', 'aaad', 'aaba', 'aabb', 'aabc', 
'aabd', 'aaca', 'aacb', 'aacc', 'aacd', 'aada', 'aadb', 'aadc', 'aadd', 'abaa', ...
'dbcb', 'dbcc', 'dbcd', 'dbda', 'dbdb', 'dbdc', 'dbdd', 'dcaa', 'dcab', 'dcac', 
'dcad', 'dcba', 'dcbb', 'dcbc', 'dcbd', 'dcca', 'dccb', 'dccc', 'dccd', 'dcda', 
'dcdb', 'dcdc', 'dcdd', 'ddaa', 'ddab', 'ddac', 'ddad', 'ddba', 'ddbb', 'ddbc', 
'ddbd', 'ddca', 'ddcb', 'ddcc', 'ddcd', 'ddda', 'dddb', 'dddc', 'dddd']
 >>>
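
For comparison, later Python versions (2.6 onward) provide itertools.product, which gives the same ordering without feeding the generator back into itself.  This is a sketch of an alternative, not a description of how allwords2 works; the name allwords3 is made up for this example:

```python
from itertools import product

def allwords3(alphabet="abcd", maxlen=4):
    # Yield every word of length 1, then length 2, and so on up to maxlen.
    for length in range(1, maxlen + 1):
        # product yields tuples of letters with the last letter varying
        # fastest, which matches the order produced by allwords2.
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

words = list(allwords3())
print(words[:6])   # ['a', 'b', 'c', 'd', 'aa', 'ab']
print(len(words))  # 340
```

Like allwords2, this is lazy, which matters for the original request: 26 letters and lengths up to 15 give on the order of 10**21 words, far too many to hold in a list.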

HTH
Michael



