simple recursion help

Alex Martelli aleaxit at yahoo.com
Sat Oct 23 18:24:15 EDT 2004


Michael J. Fromberger <Michael.J.Fromberger at Clothing.Dartmouth.EDU>
wrote:
   ...
> > > I am looking for a list of all unique strings of length x whose
> > > elements are from the set a, b, and c.
   ...
> The task you're describing is generation of all strings over a given 
> alphabet.  Fortunately, there is a fairly simple technique for doing 
> this -- here is a Python generator to do it:
> 
> def lexgen(alph, maxlen = None):
>     """Generate all the possible strings over the given alphabet in 
>     lexicographic order.  Stop after generating the strings of maxlen,
>     if provided; otherwise, generate forever."""
>     
>     ixs = []
>     
>     while maxlen is None or len(ixs) <= maxlen:
>         while True:
>             yield str.join('', [ alph[ixs[x]] for x in
>                                  xrange(len(ixs) - 1, -1, -1) ])
> 
>             for pos in xrange(len(ixs)):
>                 ixs[pos] = (ixs[pos] + 1) % len(alph)
>                 if ixs[pos] <> 0:
>                     break
> 
>             if sum(ixs) == 0:
>                 break
> 
>         ixs += [0]

Nice, and different from what was offered in the other recent thread
(and from what the OP asked for) since you generate all strings of
length up to maxlen, while the request was for just those of length
exactly x.  Still, this can easily be restricted, and maybe a few
Pythonic tweaks with it...:

def lexgen_n(alph, x):
    ixs = [0] * x
    while True:
        yield ''.join([alph[i] for i in ixs[::-1]])
        for pos in xrange(x):
            ixs[pos] += 1
            if ixs[pos] < len(alph):
                break
            ixs[pos] = 0
        else:
            break

the 'else: break' at the end executes if the for terminates normally
(this makes it needless to test sum(ixs), or max(ixs), again).

In 2.4 one can do a bit better for the yield, with

        yield ''.join(alph[i] for i in reversed(ixs))

[generator expression vs list comprehension, and reversed built-in vs
reversal by slicing].  Of course, instead of reversing in the yield, one
can reverse in the for -- so in 2.4 one might have (variant...):

        yield ''.join(alph[i] for i in ixs)
        for pos in reversed(xrange(x)):
           ...

or, after a 'from itertools import imap':

        yield ''.join(imap(alph.__getitem__, ixs))

It's important to remember that Python does no constant-expression
hoisting; there may be important efficiencies in hoisting constants
oneself (that len(alph) in the inner loop, for example).  E.g, sticking
to 2.4 (what one should use if one cares for speed anyway), another
variant:

def lexgen_n(alph, x):
    # hoistings for speed
    from itertools import imap
    getalph = alph.__getitem__
    lenalph_m1 = len(alph) - 1

    # and now, to work
    ixs = [0] * x
    while True:
        yield ''.join(imap(getalph, reversed(ixs)))
        for pos, ix in enumerate(ixs):
            if ix == lenalph_m1:
                ixs[pos] = 0
            else:
                ixs[pos] = ix + 1
                break
        else:
            break


Alex



More information about the Python-list mailing list