word shifts

Sun May 4 02:35:05 EDT 2008

On May 4, 2:04 am, "Gabriel Genellina" <gagsl-... at yahoo.com.ar> wrote:
> En Sun, 04 May 2008 02:17:07 -0300, dave <squareswallo... at invalid.com> escribió:
>
>
>
> > Hello,
>
> > I made a function that takes a word list (one word per line, text file)
> > and searches for all the words in the list that are 'shifts' of
> > eachother.  'abc' shifted 1 is 'bcd'
>
> > Please take a look and tell me if this is a viable solution.
>
> >  def shift(word, amt):
> >    ans = ''
> >    for letter in word:
> >            ans = ans + chr((ord(letter) - ord('a') + amt) % 26 + ord('a'))
> >    return ans
>
> > def fileshift(x):
> >    fin = open(x)
> >    d = {}
> >    for line in fin:
> >            d[line.strip()] = [1]
> >            for i in range(1, 26):
> >                    ite = shift(line.strip(), i)
> >                    if ite in d:
> >                            print ite
>
> > Any tips/suggestions/critiques greatly appreciated.. I'm trying to
> > teach myself Python (and still beginning) and would love any helpful
> > info.
>
> First, looking at the code, you're evaluating line.strip() a lot of times; I'd avoid it.
> Looks like you're using a dictionary just for the keys - the set type is more adequate here (seehttp://docs.python.org/lib/types-set.html). In any case, I'd use a value like None instead of [1]
> But I'd use a different algorithm. Instead of generating all posible shifts for a given word, I'd substract the newly read word from each previous words (here, "substract two words" means substract the character code for corresponding positions, modulo 26). Shifted words, when substracted, have the same number on all positions.

A faster algorithm is to create a 'key' for each word, defined as the
tuple of ord differences (modulo 26) of consecutive characters. E.g.
the key of 'acf' is (2,3); 'c' is 2 positions after 'a' and 'f' 3
positions after 'c'. Shifted words (and only these) have the same key.
Here's a straightforward implementation that generates all the lists
of equally-shifted words:

from collections import defaultdict

def iter_shifted(words):
    key2shifted = defaultdict(list)
    for word in words:
        ords = map(ord,word)
        key = tuple((ords[i]-ords[i-1]) % 26
                    for i in xrange(1,len(ords)))
        key2shifted[key].append(word)
    for shifted in key2shifted.itervalues():
        if len(shifted) > 1:
            yield shifted

if __name__ == '__main__':
    words = 'abc bef jas cba cde zab azy hkl'.split()
    for shifted in iter_shifted(words):
        print shifted

# *** output ***
#['bef', 'hkl']
#['abc', 'cde', 'zab']
#['cba', 'azy']