Creating combination of sequences

Andrew Dalke adalke at mindspring.com
Sat Nov 13 17:17:48 EST 2004


Minho Chae wrote:
> I'm trying to create combinations of sequences.
> 
> For example, if the sequence is 'acgt' and the length is 8,
> then I would like to have 4^8 results such as
> 'aaaaaaaa', 'aaaaaaac', 'aaaaaaag', 'aaaaaaat', ... 'tttttttt'

Given the DNA base characters you use for your test case, do you
know about biopython.org?

Here is a solution to your problem.

def cycle(chars, length):
     """Generate 'length' copies of chars[0], then 'length' copies of
     chars[1], then of chars[2], etc.  When chars is exhausted, start
     again from chars[0]; this generator never stops on its own.
     """
     counter = xrange(length)
     while 1:
         for c in chars:
             for _ in counter:
                 yield c

def once(chars, length):
     """Generate 'length' copies of chars[0], then of chars[1], etc.,
     ending with chars[-1].  When chars is exhausted, stop.
     """
     counter = xrange(length)
     for c in chars:
         for _ in counter:
             yield c

def all_combinations(chars, size):
     """Generate all words of length 'size' using items from chars as
     the characters for each position.  If chars is in lexicographic
     order then so is the sequence of generated words.

     >>> for word in all_combinations("01", 3):
     ...     print word
     ...
     000
     001
     010
     011
     100
     101
     110
     111
     """
     if size == 0:
         return

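     N = len(chars)
     # One generator per position, leftmost first.  The leftmost character
     # changes slowest: each char repeats N**(size-1) times, and once()
     # stops after the last one, which ends the whole sequence.
     gens = [once(chars, N**(size-1))]
     # Position k (from the left) repeats each char N**(size-1-k) times
     # and cycles forever; the rightmost changes on every word.
     for i in range(size-2, -1, -1):
         gens.append(cycle(chars, N**i))

     def next(gen):
         return gen.next()

     # Build each word from one character per generator.  When once() is
     # exhausted, its StopIteration propagates out of map() and ends
     # this generator.
     while 1:
         yield "".join(map(next, gens))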

def test():
     text = "acgt"
     n = 8
     i = -1
     for i, term in enumerate(all_combinations(text, n)):
         print term
     print (i+1), "combinations found"
     if n:
         assert (i+1) == len(text) ** n, (i, text, n)
     else:
         assert i == -1, i

if __name__ == "__main__":
     test()
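
For readers on Python 3: the code above is Python 2 (xrange, the print
statement, gen.next()), and it ends its main loop by letting StopIteration
escape from map(), which Python 3.7 and later turn into a RuntimeError.
A minimal sketch of the same result using itertools.product (added in
Python 2.6, after this post) might look like the following.  The name
all_combinations_py3 is made up for this example, and the size == 0 guard
is there only to mirror the behaviour above of producing no words.

```python
from itertools import product


def all_combinations_py3(chars, size):
    """Yield every word of length 'size' built from chars.

    Hypothetical Python 3 counterpart of all_combinations() above;
    not part of the original post.
    """
    if size == 0:
        # product(chars, repeat=0) would yield one empty tuple;
        # return nothing instead, as the original does.
        return
    # product() varies the rightmost position fastest, so the words
    # come out in the order of chars.
    for letters in product(chars, repeat=size):
        yield "".join(letters)


if __name__ == "__main__":
    for word in all_combinations_py3("01", 3):
        print(word)
    # Count the 'acgt' words of length 8 without storing them.
    total = sum(1 for _ in all_combinations_py3("acgt", 8))
    print(total, "combinations found")
```

Like the generator version, this builds one word at a time, so the 4**8
case does not need to fit in a list unless you ask for one.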


				Andrew
				dalke at dalkescientific.com


